edmundmiller / nextflow-mode

Emacs major mode for Nextflow
GNU General Public License v3.0

Support Syntax highlighting in scripts #15

Open edmundmiller opened 1 year ago

edmundmiller commented 1 year ago

I think this issue says it all: https://github.com/nextflow-io/vscode-language-nextflow/issues/7

Could probably use https://polymode.github.io/

yqshao commented 10 months ago

Hi! I don't know if this is being worked on, but if not, I'm interested in implementing it. I have made some initial attempts, but since I'm fairly new to elisp, I hope it's OK to discuss them here (if a PR is preferred, I can work on that instead).

My attempt with polymode and regex matchers:

(use-package nextflow-mode
  :straight (:host github :repo "emiller88/nextflow-mode"))

;; mostly https://polymode.github.io/defining-polymodes/
(use-package polymode)
(define-hostmode poly-nextflow-hostmode
  :mode 'nextflow-mode)
(define-auto-innermode poly-nextflow-script-innermode
  :head-matcher 
  (cons "^ *\"\"\" *\n *\\(#!/usr/bin/\\(?:env *\\)?[[:alpha:]]+.*\n\\)" 1)
  :tail-matcher "^ *\"\"\" *$"
  :mode-matcher (cons "#!/usr/bin/\\(?:env *\\)?\\([[:alpha:]]+\\)" 1)
  :head-mode 'host
  :tail-mode 'host)
(define-polymode poly-nextflow-mode
  :hostmode 'poly-nextflow-hostmode
  :innermodes '(poly-nextflow-script-innermode))

;; usage as `:hook' will not work:
;; https://github.com/polymode/polymode/issues/324#issuecomment-1737957679
(add-to-list 'auto-mode-alist '("\\.nf\\'" . poly-nextflow-mode))

;; perhaps not needed, but suppresses spurious flycheck warnings, see:
;; https://github.com/polymode/poly-org/issues/3#issuecomment-800769270
(defun flycheck-buffer-not-indirect-p (&rest _)
  "Ensure that the current buffer is not indirect."
  (null (buffer-base-buffer)))
(advice-add 'flycheck-may-check-automatically
            :before-while #'flycheck-buffer-not-indirect-p)

My test script:

#!/usr/bin/env nextflow

process perlTask {
    input: val(name)

    """
    #!/usr/bin/env perl
    print 'Hi there, $name!' . '\n';
    """
}

process pythonTask {
    """
    #!/usr/bin/python
    x = 'Hello'
    y = 'world!'
    print("%s - %s" % (x,y))
    """
}

workflow {
    perlTask('you')
    pythonTask()
}

My thoughts:

  1. I did not manage to get it to work for "default" bash scripts without a shebang. It seems that polymode does not work if a chunk matches as both head and tail. Perhaps a better approach is to start matching from the process definition?
  2. If one uses a regex to extract the mode, the """ part cannot be in the head. I suspect this has to do with polymode not handling multiline matches. Since we want to extract the mode from the shebang anyway, maybe it's better to use a function like the one in files.el?
  3. The above seems to work (one gets the correct mode for blocks with a shebang), but I also noticed some unexpected switching of syntax highlighting and indentation behaviour when moving between nextflow and inner modes, and I have yet to work out what to look at...
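
For point 2, a function-based lookup could look roughly like the sketch below. It is only an illustration under stated assumptions: `my/nextflow-interpreter-mode` and `my/nextflow-shebang-mode-name` are hypothetical names, the shebang regexp is a simplified stand-in for the one in files.el, and falling back to sh-mode for unknown interpreters is my own choice. It reuses `interpreter-mode-alist`, the table that `set-auto-mode` consults for shebang lines.

```elisp
;; Hypothetical helpers: neither name exists in nextflow-mode or polymode.

(defun my/nextflow-interpreter-mode (line &optional alist)
  "Return the major mode symbol for the shebang found in LINE, or nil.
ALIST defaults to `interpreter-mode-alist'.  Its keys are regexps
that must match the whole interpreter name."
  (when (string-match
         ;; "#!", an optional ".../env " prefix, then the interpreter path.
         "#![ \t]*\\(?:[^ \t\n]*/env[ \t]+\\)?\\([^ \t\n]+\\)" line)
    (let ((interpreter (file-name-nondirectory (match-string 1 line))))
      (assoc-default interpreter
                     (or alist interpreter-mode-alist)
                     ;; TEST receives the alist key first, then INTERPRETER.
                     (lambda (regexp name)
                       (string-match-p
                        (concat "\\`\\(?:" regexp "\\)\\'") name))))))

(defun my/nextflow-shebang-mode-name ()
  "Return the mode name for the shebang on the current line, as a string.
Assumption: fall back to \"sh-mode\" when no known interpreter is found."
  (symbol-name
   (or (my/nextflow-interpreter-mode
        (buffer-substring-no-properties
         (line-beginning-position) (line-end-position)))
       'sh-mode)))
```

As far as I can tell from polymode's documentation, `:mode-matcher` also accepts a function that is called with no arguments at the start of the head span and returns a mode name string, so the second helper could in principle be passed as `:mode-matcher #'my/nextflow-shebang-mode-name`. That wiring is untested here.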

edmundmiller commented 10 months ago

Hey! First off, I'd love for you to take a stab at that! I get by pretty well since I try to stick to plain bash scripts in the process definition.

If it works with the shebang type of scripts, I'm happy to merge that and we can make it opt-in!

Just a thought to address 2 though, I usually use a script: directive in processes. Could you try matching off that, and if there's no shebang default to sh mode? (shell: could be included as well, but script: should suffice for a PoC!)
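
A rough proof of concept of that idea, written outside polymode so it can be tried in any buffer: search for a `script:` or `shell:` line directly followed by an opening `"""`, then check whether the body starts with a shebang. The names `my/nextflow-script-open-re` and `my/nextflow-next-script-body` are hypothetical, and the regexp assumes that nothing (for example Groovy code) sits between the directive and the opening quotes.

```elisp
;; Hypothetical sketch, not taken from nextflow-mode or polymode.

(defconst my/nextflow-script-open-re
  "^[ \t]*\\(?:script\\|shell\\):[ \t]*\n[ \t]*\"\"\"[ \t]*\n"
  "Regexp for a `script:' or `shell:' line followed by an opening \"\"\".")

(defun my/nextflow-next-script-body ()
  "Move point to the body of the next script block after point.
Return (BODY-START . HAS-SHEBANG), or nil when no block follows.
HAS-SHEBANG is non-nil when the first body line starts with \"#!\"."
  (when (re-search-forward my/nextflow-script-open-re nil t)
    (cons (point)
          (save-excursion
            (skip-chars-forward " \t")
            (looking-at-p "#!")))))
```

A block where HAS-SHEBANG is nil would default to sh-mode, while a block with a shebang would be handed to whatever shebang-based mode lookup is in use.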

ewels commented 8 months ago

Hi @edmundmiller, @yqshao,

I was wondering if you had any updates on this effort? I'd love to get it working. Let me know if there's anything that we can do to help.

Phil

yqshao commented 8 months ago

Hej! Sorry for the slow progress, and for spamming the issue log (I was force-pushing a lot). I'm in transition between positions and got distracted. I should have time to get back to this in a week or so.

So far I have made it work for scripts if 1) they have a shebang, or 2) they are in a shell|script|exec block. I feel that's the furthest I can go with the regex matcher (it might be improved if one can make better use of a syntax tree). To test it, below is a minimal example config.

(use-package polymode)
(use-package nextflow-mode
  :straight (:host github :repo "yqshao/nextflow-mode")
  :init (setq nextflow-enable-polymode t))
;; or just set the variable before requiring

I am not so experienced in Emacs packaging, and I won't mind if someone takes over. But when I get the time, I'll be happy to figure those things out (how to work with CI, dependencies and customization).

edmundmiller commented 8 months ago

@ewels I think we really need to just invest in a tree-sitter parser. Tree-sitter has support for multi-language documents. I had some conversation about it with @bentsherman in the prettier channel.

It could get hooked into https://topiary.tweag.io/.

Figured this all might be in your ballpark as these would be nice Open source projects for Nextflow.