dnanexus / wdlTools

WDL tools for parsing, type-checking, and more
Apache License 2.0
25 stars 7 forks source link

Use of reserved keyword as name of input/output results in an uninformative error message #169

Open dolevrahat opened 3 years ago

dolevrahat commented 3 years ago

Hello

I have encountered the following problem:

I have a workflow that contains (among others) the two tasks GetBwaVersion and MergeBamAlignment. MergeBamAlignment takes as an input the output of GetBwaVersion.

task GetBwaVersion{
  input{
  String docker_image
  String bwa_path
  }
  command {
    # Not setting "set -o pipefail" here because /bwa has a rc=1 and we don't want to allow rc=1 to succeed
    # because the sed may also fail with that error and that is something we actually want to fail on.
    ~{bwa_path}bwa 2>&1 | \
    grep -e '^Version' | \
    sed 's/Version: //'
  }
  runtime {
    docker: docker_image
    dx_instance_type: "mem1_ssd1_v2_x2"
  }
  output {
    **String bwa_version = read_string(stdout())**
  }
}
  call MergeBamAlignment {
      input:
        unmapped_bam = FastqToUnmappedBam.unmapped_bam,
        bwa_commandline = bwa_commandline,
        index_prefix = SamToFastqAndBwaMem.bwa_index_prefix,
        **bwa_version = GetBwaVersion.version,**
        aligned_bam = SamToFastqAndBwaMem.output_bam,
        output_bam_basename = base_file_name + ".aligned.unsorted",
        ref_fasta = ref_fasta,
        ref_fasta_index = ref_fasta_index,
        ref_dict = ref_dict,
        docker_image = gatk_docker,
        gatk_path = gatk_path,
        compression_level = compression_level
    }

}

When I attempted to compile this workflow I got the following error message:

wdlTools.syntax.SyntaxException: error parsing document /home/hadassah/Documents/Dolev/pipeline_automation/sandbox/mva_dxcompiler.wdl
Caused by: wdlTools.syntax.SyntaxException: missing identifier at 122:4-122:25 in /home/hadassah/Documents/Dolev/pipeline_automation/sandbox/mva_dxcompiler.wdl

As it turned out (after considerable frustration), the cause of this error is that version is apparently a reserved keyword in WDL, and thus naming the output of GetBwaVersion version resulted in the parser failing to find an identifier for this output.

I see three issues that make this difficult to debug: 1. Uninformative error message: The way this error message is phrased, it is very difficult to infer that the problem is the use of a reserved keyword. It would have been very helpful had the parser indicated this explicitly. 2. Lacunae in the documentation: I was unable to find any mention of the fact that version is a reserved keyword. It does not appear in the list of reserved keywords in the WDL specfications. I infer that it is a reserved keyword based on the fact that replacing the output name from version to _bwaversion resulted in a successful compilation of the workflow, and also from the following clause in the wdlTools code:

it should "version is a reserved keyword" in {
    assertThrows[Exception] {
      val _ = getDocument(getWorkflowSource("call_bug2.wdl"))
    }

(source: https://github.com/dnanexus-rnd/wdlTools/blob/682efa4151c17dbeff627832d7f491df0436fc5f/src/test/scala/wdlTools/syntax/v1/ConcreteSyntaxV1Test.scala) 3. Inconsistent error messages between task and call: At same point I switched the name of the task output to _bwaversion but forgot to change it in the call to MergeBamAlignment. This resulted in the following cryptic error message:

wdlTools.syntax.SyntaxException: error parsing document /home/hadassah/Documents/Dolev/pipeline_automation/sandbox/mva_dxcompiler.wdl
Caused by: wdlTools.syntax.SyntaxException: token recognition error at: ',' at  at 92:43-92:43 in /home/hadassah/Documents/Dolev/pipeline_automation/sandbox/mva_dxcompiler.wdl
token recognition error at: '\n' at  at 92:44-92:44 in /home/hadassah/Documents/Dolev/pipeline_automation/sandbox/mva_dxcompiler.wdl
token recognition error at: '_' at  at 93:15-93:15 in /home/hadassah/Documents/Dolev/pipeline_automation/sandbox/mva_dxcompiler.wdl
no viable alternative at input ',\n        bwa_version = GetBwaVersion.version' at version at 92:36-92:36 in /home/hadassah/Documents/Dolev/pipeline_automation/sandbox/mva_dxcompiler.wdl
extraneous input '=' expecting Identifier at = at 92:20-92:20 in /home/hadassah/Documents/Dolev/pipeline_automation/sandbox/mva_dxcompiler.wdl
mismatched input '.' expecting '=' at . at 92:35-92:35 in /home/hadassah/Documents/Dolev/pipeline_automation/sandbox/mva_dxcompiler.wdl
extraneous input '=' expecting Identifier at = at 93:20-93:20 in /home/hadassah/Documents/Dolev/pipeline_automation/sandbox/mva_dxcompiler.wdl
mismatched input '.' expecting '=' at . at 93:41-93:41 in /home/hadassah/Documents/Dolev/pipeline_automation/sandbox/mva_dxcompiler.wdl
extraneous input ',' expecting Identifier at , at 93:52-93:52 in /home/hadassah/Documents/Dolev/pipeline_automation/sandbox/mva_dxcompiler.wdl
extraneous input ',' expecting {'scatter', 'call', 'if', 'input', 'output', 'parameter_meta', 'meta', 'Boolean', 'Int', 'Float', 'String', 'File', 'Array', 'Map', 'Pair', 'Object', RBRACE, Identifier} at , at 94:66-94:66 in /home/hadassah/Documents/Dolev/pipeline_automation/sandbox/mva_dxcompiler.wdl
extraneous input '=' expecting Identifier at = at 95:18-95:18 in /home/hadassah/Documents/Dolev/pipeline_automation/sandbox/mva_dxcompiler.wdl
mismatched input ',' expecting {'scatter', 'call', 'if', 'input', 'output', 'parameter_meta', 'meta', 'Boolean', 'Int', 'Float', 'String', 'File', 'Array', 'Map', 'Pair', 'Object', RBRACE, '[', '<', '>', '>=', '<=', '==', '!=', '&&', '||', '*', '+', '-', '.', '/', '%', Identifier} at , at 95:29-95:29 in /home/hadassah/Documents/Dolev/pipeline_automation/sandbox/mva_dxcompiler.wdl
extraneous input '=' expecting Identifier at = at 96:24-96:24 in /home/hadassah/Documents/Dolev/pipeline_automation/sandbox/mva_dxcompiler.wdl
mismatched input ',' expecting {'scatter', 'call', 'if', 'input', 'output', 'parameter_meta', 'meta', 'Boolean', 'Int', 'Float', 'String', 'File', 'Array', 'Map', 'Pair', 'Object', RBRACE, '[', '<', '>', '>=', '<=', '==', '!=', '&&', '||', '*', '+', '-', '.', '/', '%', Identifier} at , at 96:41-96:41 in /home/hadassah/Documents/Dolev/pipeline_automation/sandbox/mva_dxcompiler.wdl
extraneous input '=' expecting Identifier at = at 97:17-97:17 in /home/hadassah/Documents/Dolev/pipeline_automation/sandbox/mva_dxcompiler.wdl
mismatched input ',' expecting {'scatter', 'call', 'if', 'input', 'output', 'parameter_meta', 'meta', 'Boolean', 'Int', 'Float', 'String', 'File', 'Array', 'Map', 'Pair', 'Object', RBRACE, '[', '<', '>', '>=', '<=', '==', '!=', '&&', '||', '*', '+', '-', '.', '/', '%', Identifier} at , at 97:27-97:27 in /home/hadassah/Documents/Dolev/pipeline_automation/sandbox/mva_dxcompiler.wdl
extraneous input '=' expecting Identifier at = at 98:21-98:21 in /home/hadassah/Documents/Dolev/pipeline_automation/sandbox/mva_dxcompiler.wdl
mismatched input ',' expecting {'scatter', 'call', 'if', 'input', 'output', 'parameter_meta', 'meta', 'Boolean', 'Int', 'Float', 'String', 'File', 'Array', 'Map', 'Pair', 'Object', RBRACE, '[', '<', '>', '>=', '<=', '==', '!=', '&&', '||', '*', '+', '-', '.', '/', '%', Identifier} at , at 98:34-98:34 in /home/hadassah/Documents/Dolev/pipeline_automation/sandbox/mva_dxcompiler.wdl
extraneous input '=' expecting Identifier at = at 99:18-99:18 in /home/hadassah/Documents/Dolev/pipeline_automation/sandbox/mva_dxcompiler.wdl
mismatched input ',' expecting {'scatter', 'call', 'if', 'input', 'output', 'parameter_meta', 'meta', 'Boolean', 'Int', 'Float', 'String', 'File', 'Array', 'Map', 'Pair', 'Object', RBRACE, '[', '<', '>', '>=', '<=', '==', '!=', '&&', '||', '*', '+', '-', '.', '/', '%', Identifier} at , at 99:29-99:29 in /home/hadassah/Documents/Dolev/pipeline_automation/sandbox/mva_dxcompiler.wdl
extraneous input '=' expecting Identifier at = at 100:26-100:26 in /home/hadassah/Documents/Dolev/pipeline_automation/sandbox/mva_dxcompiler.wdl
mismatched input '}' expecting {'scatter', 'call', 'if', 'input', 'output', 'parameter_meta', 'meta', 'Boolean', 'Int', 'Float', 'String', 'File', 'Array', 'Map', 'Pair', 'Object', RBRACE, '[', '<', '>', '>=', '<=', '==', '!=', '&&', '||', '*', '+', '-', '.', '/', '%', Identifier} at } at 101:4-101:4 in

The line numbers in the error message correspond to the call to MergeBamAlignment, but suggest a localized syntax problem such as a misplaced comma/newline and not an inconsistency between the name of the output of the task definition and its name in the call. I assume that this is also caused by the use of a reserved keyword. To test this, I changed the output name in the task to _bwaversion and set the input of the call to bwa_version = GetBwaVersion.bwa_version2 Which resulted in the much more informative error message:

wdlTools.types.TypeException: Call GetBwaVersion does not have output bwa_version2 at 92:22-92:35 in /home/hadassah/Documents/Dolev/pipeline_automation/sandbox/mva_dxcompiler.wdl
argument to parameter 'bwa_version' has type T_Call(GetBwaVersion,TreeSeqMap(bwa_version -> T_String)),  it is not coercible to T_String at 87:4-101:5 in /home/hadassah/Documents/Dolev/pipeline_automation/sandbox/mva_dxcompiler.wdl
jdidion commented 3 years ago

Thank you for pointing this out. We will fix #2 in the next version of the WDL spec. For the other two points, this is a major downside of the ANTLR4 parser. I'll keep looking for a way to improve the error message.