Closed tmdefreitas closed 6 years ago
This is happening because the the command {...}
section is parsed differently than the rest of WDL. The parser thinks that the closing brace in #echo {an_input}
is actually trying to close the command
section. If you use the alternative delimiters (command <<< ... >>>
) this is another way to get around it.
We parse the command as "opaque strings intermixed with ${...}
blocks". That means that the #
-style comments inside a command section are not interpreted as WDL comments, but instead as part of the command. More thought would have to go into figuring out what the right thing to do here is.
Admittedly, I have never written a parser before, so I don't know how feasible this is, but can you write the grammar/parser such that everything on a line after a #
character is ignored?
The only edge cases I imagine are when the #
character is in a quoted string.
@tmdefreitas Yes, that is definitely possible.
However, we try to not make assumptions about the type of characters that your script can have in it. I'm perhaps being a little overly cautious, but I'd hate for there to be a case where somebody wants to use a #
in their command but it gets interpreted as a comment. That could lead to the same kind of confusion that we're seeing now.
I vacillate on this because I also see the pragmatism in implementing your suggestion for the common case. In most cases I can think of, a #
is a comment
Maybe some approach like Eddie's where I can have the parser give a better error message is the best solution.
Scott,
Not only can I imagine a command where # is used (and in a WDL no less), but I have actually written such a command.
In the merge step for mutect, I use "#" to select for header lines in merging call_stats files and VCF files!
command <<<
set -x
MUTECT1_CS="MuTect1.call_stats.txt" head --lines=2 ${mutect1_cs[0]} > $MUTECT1_CS cat ${sep =' ' mutect1_cs} | grep -Pv '#'|grep -Pv '^contig' >> $MUTECT1_CS
MUTECT2_CS="MuTect2.call_stats.txt" cat ${mutect2_cs[0]} |grep -P '^#' > $MUTECT2_CS ; cat ${sep=' ' mutect2_cs} |grep -Pv '^#' >> $MUTECT2_CS ; -eddie
On Wed, Mar 23, 2016 at 3:25 PM, Scott Frazer notifications@github.com wrote:
@tmdefreitas https://github.com/tmdefreitas Yes, that is definitely possible.
However, we try to not make assumptions about the type of characters that your script can have in it. I'm perhaps being a little overly cautious, but I'd hate for there to be a case where somebody wants to use a # in their command but it gets interpreted as a comment. That could lead to the same kind of confusion that we're seeing now.
I vacillate on this because I also see the pragmatism in implementing your suggestion for the common case. In most cases I can think of, a # is a comment
Maybe some approach like Eddie's where I can have the parser give a better error message is the best solution.
— You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub https://github.com/broadinstitute/wdltool/issues/7#issuecomment-200505343
Edward A. Salinas Senior Software Engineer - Getz Lab Broad Institute of MIT and Harvard 75 Ames Street, Rm 4013 Cambridge, MA 02142 Ph: 617-714-7905 Cell: 210-274-3172
@eddiebroad But those are all quoted strings, and don't look the same as a WDL comment. From an implementation perspective, doesn't cromwell pipe the command block to /bin/bash anyway? And following bash rules unquoted #
characters start a comment, so maybe WDL just has to follow the same comment parsing rules as bash?
The # in my command is NOT intended to be as a comment, but is used in the grep command because certain header lines start with # in VCF files.
WDL following bash-like comment rules might be a good idea.
-eddie
On Wed, Mar 23, 2016 at 3:35 PM, Timothy DeFreitas <notifications@github.com
wrote:
@eddiebroad https://github.com/eddiebroad But those are all quoted strings, and don't look the same as a WDL comment. From an implementation perspective, doesn't cromwell pipe the command bock to /bin/bash anyway? And following bash rules unquoted # characters start a comment, so maybe WDL just has to follow the same comment parsing rules as bash?
— You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub https://github.com/broadinstitute/wdltool/issues/7#issuecomment-200510863
Edward A. Salinas Senior Software Engineer - Getz Lab Broad Institute of MIT and Harvard 75 Ames Street, Rm 4013 Cambridge, MA 02142 Ph: 617-714-7905 Cell: 210-274-3172
I'm hesitant to try to follow the same rules as Bash for parsing a command because I worry that it'll be error prone and difficult to pin down the details. It feels like we'd essentially reimplement a Bash parser in WDL. It'd be difficult and hard to get the nuances of something like this:
command {
python <<CODE
print("#octothorps!!#")
CODE
}
Maybe something as simple as syntax highlighting could help too... if #
s are not highlighted differently in the command section then that could also be a hint. Granted, I know syntax highlighters won't always be used.
If it's clear to WDL users how # is interpreted (or not interpreted) in the command-block maybe that would be the best thing? Would a sentence/blurb in the docs address this? Possibly as mentioned earlier good error messages?
-eddie
On Wed, Mar 23, 2016 at 3:43 PM, Scott Frazer notifications@github.com wrote:
I'm hesitant to try to follow the same rules as Bash for parsing a command because I worry that it'll be error prone and difficult to pin down the details. It feels like we'd essentially reimplement a Bash parser in WDL. It'd be difficult and hard to get the nuances of something like this:
command { python <<CODE print("#octothorps!!#") CODE }
Maybe something as simple as syntax highlighting could help too... if #s are not highlighted differently in the command section then that could also be a hint. Granted, I know syntax highlighters won't always be used.
— You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub https://github.com/broadinstitute/wdltool/issues/7#issuecomment-200514792
Edward A. Salinas Senior Software Engineer - Getz Lab Broad Institute of MIT and Harvard 75 Ames Street, Rm 4013 Cambridge, MA 02142 Ph: 617-714-7905 Cell: 210-274-3172
Issue moved to broadinstitute/cromwell #2870 via ZenHub
I came across this while helping a colleague debug her WDL file. When this WDL file is validated, wdltool claims an "Error: finished parsing without consuming all tokens", even though the error is commented out:
EDIT: error message, for completeness:
I can get rid of the error by changing the comment line to
#echo ${an_input}
. I think errors in comment lines should probably be ignored by the validator.As an aside, is there a more helpful error message for this? The message sounds like an unused input variable or something, not that the bracket syntax was off, so it was hard to diagnose (The above is a toy example, the real task had a much more complicated command).