ENCODE-DCC / croo

Cromwell output organizer
MIT License

TypeError: unsupported operand type(s) for +: 'NoneType' and 'int' #46

Open jaavedm opened 1 month ago

jaavedm commented 1 month ago

Hello,

I'm getting an error when running croo after a successful run of the ENCODE rna-seq-pipeline. My croo version is 0.6.0.

The command I ran is:

croo /storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/metadata.json --out-dir /storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/ --ucsc-genome-db hg38 --method copy

And the complete output and error I am getting is:

2024-05-06 18:29:43,756|autouri.autouri|INFO| cp: (6dd3a2ac) started. src=https://storage.googleapis.com/encode-pipeline-output-definition/bulkrna.output_definition.json, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/.croo_tmp/5a45d485e265dcac3f19d13aead3c399/bulkrna.output_definition.json
2024-05-06 18:29:43,820|autouri.autouri|INFO| cp: (6dd3a2ac) done.
2024-05-06 18:29:43,821|autouri.autouri|INFO| cp: (40bb08c6) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-align/shard-0/execution/rep1REP0_ESC_WT_anno_flagstat.txt, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/align/rep1/rep1REP0_ESC_WT_anno_flagstat.txt
2024-05-06 18:29:43,833|autouri.autouri|INFO| cp: (40bb08c6) done.
2024-05-06 18:29:43,834|autouri.autouri|INFO| cp: (8b7bc59d) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-align/shard-1/execution/rep2REP0_ESC_WT_anno_flagstat.txt, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/align/rep2/rep2REP0_ESC_WT_anno_flagstat.txt
2024-05-06 18:29:43,842|autouri.autouri|INFO| cp: (8b7bc59d) done.
2024-05-06 18:29:43,842|autouri.autouri|INFO| cp: (0dd09dca) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-align/shard-0/execution/rep1REP0_ESC_WT_anno_flagstat.json, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/align/rep1/rep1REP0_ESC_WT_anno_flagstat.json
2024-05-06 18:29:43,859|autouri.autouri|INFO| cp: (0dd09dca) done.
2024-05-06 18:29:43,860|autouri.autouri|INFO| cp: (e7048b10) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-align/shard-1/execution/rep2REP0_ESC_WT_anno_flagstat.json, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/align/rep2/rep2REP0_ESC_WT_anno_flagstat.json
2024-05-06 18:29:43,869|autouri.autouri|INFO| cp: (e7048b10) done.
2024-05-06 18:29:43,869|autouri.autouri|INFO| cp: (b6bad4e8) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-align/shard-0/execution/rep1REP0_ESC_WT_anno.bam, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/align/rep1/rep1REP0_ESC_WT_anno.bam
2024-05-06 18:30:02,603|autouri.autouri|INFO| cp: (b6bad4e8) done.
2024-05-06 18:30:02,604|autouri.autouri|INFO| cp: (33ec02d3) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-align/shard-1/execution/rep2REP0_ESC_WT_anno.bam, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/align/rep2/rep2REP0_ESC_WT_anno.bam
2024-05-06 18:30:27,362|autouri.autouri|INFO| cp: (33ec02d3) done.
2024-05-06 18:30:27,363|autouri.autouri|INFO| cp: (2aa6425e) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-align/shard-0/execution/rep1REP0_ESC_WT_genome_flagstat.txt, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/align/rep1/rep1REP0_ESC_WT_genome_flagstat.txt
2024-05-06 18:30:27,368|autouri.autouri|INFO| cp: (2aa6425e) done.
2024-05-06 18:30:27,368|autouri.autouri|INFO| cp: (93e71f32) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-align/shard-1/execution/rep2REP0_ESC_WT_genome_flagstat.txt, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/align/rep2/rep2REP0_ESC_WT_genome_flagstat.txt
2024-05-06 18:30:27,379|autouri.autouri|INFO| cp: (93e71f32) done.
2024-05-06 18:30:27,379|autouri.autouri|INFO| cp: (14cd7a88) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-align/shard-0/execution/rep1REP0_ESC_WT_genome_flagstat.json, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/align/rep1/rep1REP0_ESC_WT_genome_flagstat.json
2024-05-06 18:30:27,390|autouri.autouri|INFO| cp: (14cd7a88) done.
2024-05-06 18:30:27,390|autouri.autouri|INFO| cp: (ec807f46) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-align/shard-1/execution/rep2REP0_ESC_WT_genome_flagstat.json, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/align/rep2/rep2REP0_ESC_WT_genome_flagstat.json
2024-05-06 18:30:27,397|autouri.autouri|INFO| cp: (ec807f46) done.
2024-05-06 18:30:27,397|autouri.autouri|INFO| cp: (1147fdf7) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-align/shard-0/execution/rep1REP0_ESC_WT_genome.bam, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/align/rep1/rep1REP0_ESC_WT_genome.bam
2024-05-06 18:30:36,849|autouri.autouri|INFO| cp: (1147fdf7) done.
2024-05-06 18:30:36,850|autouri.autouri|INFO| cp: (1c158e91) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-align/shard-1/execution/rep2REP0_ESC_WT_genome.bam, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/align/rep2/rep2REP0_ESC_WT_genome.bam
2024-05-06 18:30:49,031|autouri.autouri|INFO| cp: (1c158e91) done.
2024-05-06 18:30:49,032|autouri.autouri|INFO| cp: (a13b1927) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-align/shard-0/execution/rep1REP0_ESC_WT_Log.final.out, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/align/rep1/rep1REP0_ESC_WT_Log.final.out
2024-05-06 18:30:49,216|autouri.autouri|INFO| cp: (a13b1927) done.
2024-05-06 18:30:49,217|autouri.autouri|INFO| cp: (2e323032) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-align/shard-1/execution/rep2REP0_ESC_WT_Log.final.out, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/align/rep2/rep2REP0_ESC_WT_Log.final.out
2024-05-06 18:30:49,223|autouri.autouri|INFO| cp: (2e323032) done.
2024-05-06 18:30:49,223|autouri.autouri|INFO| cp: (390185c6) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-align/shard-0/execution/rep1REP0_ESC_WT_Log.final.json, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/align/rep1/rep1REP0_ESC_WT_Log.final.json
2024-05-06 18:30:49,235|autouri.autouri|INFO| cp: (390185c6) done.
2024-05-06 18:30:49,235|autouri.autouri|INFO| cp: (673c47fe) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-align/shard-1/execution/rep2REP0_ESC_WT_Log.final.json, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/align/rep2/rep2REP0_ESC_WT_Log.final.json
2024-05-06 18:30:49,242|autouri.autouri|INFO| cp: (673c47fe) done.
2024-05-06 18:30:49,242|autouri.autouri|INFO| cp: (94a7f113) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-align/shard-0/execution/align.log, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/align/rep1/align.log
2024-05-06 18:30:49,250|autouri.autouri|INFO| cp: (94a7f113) done.
2024-05-06 18:30:49,250|autouri.autouri|INFO| cp: (56c24d0c) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-align/shard-1/execution/align.log, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/align/rep2/align.log
2024-05-06 18:30:49,260|autouri.autouri|INFO| cp: (56c24d0c) done.
2024-05-06 18:30:49,261|autouri.autouri|INFO| cp: (518f8d5c) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-bam_to_signals/shard-0/execution/glob-c2cd76befcbd32d151a804902524f64f/rep1REP0_ESC_WT_genome_all.bw, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/bam_to_signals/rep1/rep1REP0_ESC_WT_genome_all.bw
2024-05-06 18:30:49,884|autouri.autouri|INFO| cp: (518f8d5c) done.
2024-05-06 18:30:49,885|autouri.autouri|INFO| cp: (629d4e80) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-bam_to_signals/shard-1/execution/glob-c2cd76befcbd32d151a804902524f64f/rep2REP0_ESC_WT_genome_all.bw, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/bam_to_signals/rep2/rep2REP0_ESC_WT_genome_all.bw
2024-05-06 18:30:50,285|autouri.autouri|INFO| cp: (629d4e80) done.
2024-05-06 18:30:50,286|autouri.autouri|INFO| cp: (711592c5) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-bam_to_signals/shard-0/execution/bam_to_signals.log, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/bam_to_signals/rep1/bam_to_signals.log
2024-05-06 18:30:50,291|autouri.autouri|INFO| cp: (711592c5) done.
2024-05-06 18:30:50,291|autouri.autouri|INFO| cp: (3fff2005) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-bam_to_signals/shard-1/execution/bam_to_signals.log, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/bam_to_signals/rep2/bam_to_signals.log
2024-05-06 18:30:50,308|autouri.autouri|INFO| cp: (3fff2005) done.
2024-05-06 18:30:50,309|autouri.autouri|INFO| cp: (b1bb78e6) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-bam_to_signals/shard-0/execution/glob-719231f59fe7b087c441b8513a2d90d6/rep1REP0_ESC_WT_genome_uniq.bw, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/bam_to_signals/rep1/rep1REP0_ESC_WT_genome_uniq.bw
2024-05-06 18:30:50,619|autouri.autouri|INFO| cp: (b1bb78e6) done.
2024-05-06 18:30:50,620|autouri.autouri|INFO| cp: (1f189dfb) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-bam_to_signals/shard-1/execution/glob-719231f59fe7b087c441b8513a2d90d6/rep2REP0_ESC_WT_genome_uniq.bw, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/bam_to_signals/rep2/rep2REP0_ESC_WT_genome_uniq.bw
2024-05-06 18:30:51,172|autouri.autouri|INFO| cp: (1f189dfb) done.
2024-05-06 18:30:51,173|autouri.autouri|INFO| cp: (235c5dff) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-kallisto/shard-0/execution/kallisto_quant.log, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/kallisto/rep1/kallisto_quant.log
2024-05-06 18:30:51,295|autouri.autouri|INFO| cp: (235c5dff) done.
2024-05-06 18:30:51,296|autouri.autouri|INFO| cp: (101427cc) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-kallisto/shard-1/execution/kallisto_quant.log, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/kallisto/rep2/kallisto_quant.log
2024-05-06 18:30:51,378|autouri.autouri|INFO| cp: (101427cc) done.
2024-05-06 18:30:51,379|autouri.autouri|INFO| cp: (1d43d72a) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-kallisto/shard-0/execution/kallisto_out/rep1REP0_ESC_WT_abundance.tsv, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/kallisto/rep1/rep1REP0_ESC_WT_abundance.tsv
2024-05-06 18:30:51,712|autouri.autouri|INFO| cp: (1d43d72a) done.
2024-05-06 18:30:51,713|autouri.autouri|INFO| cp: (5f86b16e) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-kallisto/shard-1/execution/kallisto_out/rep2REP0_ESC_WT_abundance.tsv, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/kallisto/rep2/rep2REP0_ESC_WT_abundance.tsv
2024-05-06 18:30:51,860|autouri.autouri|INFO| cp: (5f86b16e) done.
2024-05-06 18:30:51,861|autouri.autouri|INFO| cp: (c3af7f21) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-mad_qc/execution/glob-196368870fbaaecf98b381cba2f56e97/rep1REP0_ESC_WT_anno_rsem-rep2REP0_ESC_WT_anno_rsem_mad_qc_metrics.json, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/mad_qc/rep1REP0_ESC_WT_anno_rsem-rep2REP0_ESC_WT_anno_rsem_mad_qc_metrics.json
2024-05-06 18:30:51,861|autouri.autouri|INFO| cp: (c3af7f21) done.
2024-05-06 18:30:51,861|autouri.autouri|INFO| cp: (b7d49c41) started. src=/storage/aadams/scripts/encode/rna/rna/c8b7eb6a-2230-4bd1-a160-2e70c8ab5a8a/call-mad_qc/execution/glob-d58d85dd0e55e36d4d0c376d5f0c524e/rep1REP0_ESC_WT_anno_rsem-rep2REP0_ESC_WT_anno_rsem_mad_plot.png, dest=/storage/aadams/rna_analysis/pipelines/encode/rep0_ESC_WT/mad_qc/rep1REP0_ESC_WT_anno_rsem-rep2REP0_ESC_WT_anno_rsem_mad_plot.png
2024-05-06 18:30:51,862|autouri.autouri|INFO| cp: (b7d49c41) done.
Traceback (most recent call last):
  File "/home/aadams/.local/bin/croo", line 13, in <module>
    main()
  File "/home/aadams/.local/lib/python3.10/site-packages/croo/cli.py", line 247, in main
    co.organize_output()
  File "/home/aadams/.local/lib/python3.10/site-packages/croo/croo.py", line 263, in organize_output
    interpreted_subgraph = Croo.__interpret_inline_exp(
  File "/home/aadams/.local/lib/python3.10/site-packages/croo/croo.py", line 343, in __interpret_inline_exp
    result = result.replace(m.group(0), str(eval(m.group(1))), 1)
  File "<string>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'

Croo successfully organized the outputs of another rna-seq-pipeline run, so I'm not sure why it fails on this specific one. I am attaching the metadata.json file that produced the error.

metadata.json

jaavedm commented 1 month ago

I figured it out. This is a scenario that is documented in croo.py but not handled in the code. It happens when the shard index is -1. In the method __interpret_inline_exp of croo.py, there is a comment about this scenario:

shard_idx: tuple of scatter indices. -1 means no scatter
                       e.g. (-1, 0, 1,):
                            no scatter in main workflow
                            scatter id 0 in subworkflow
                            scatter id 1 in subsubworkflow

The offending code does not handle this case, and the rna-seq-pipeline does return a metadata.json file where the shard_idx could be -1.
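The failure can be reproduced outside croo with a minimal sketch. It assumes that inline expressions have the form `${...}` and that the index `i` reaches `eval` as `None` when a task has no shard index, which is what the traceback suggests. The regex, the `interpret` helper, and the `rep${i+1}` template are hypothetical stand-ins, not croo's actual code or output definition.

```python
import re

# Assumed stand-in for Croo.RE_PATTERN_INLINE_EXP; the real pattern may differ.
INLINE_EXP = re.compile(r"\$\{(.*?)\}")


def interpret(template, i):
    """Expand each ${...} by evaluating it, mirroring the line in the traceback."""
    result = template
    while True:
        m = INLINE_EXP.search(result)
        if m is None:
            break
        # `i` is the only name exposed to the expression in this sketch.
        result = result.replace(m.group(0), str(eval(m.group(1), {}, {"i": i})), 1)
    return result


print(interpret("align/rep${i+1}", 0))  # align/rep1

try:
    interpret("rep${i+1}", None)  # no shard index available
except TypeError as exc:
    print(exc)  # unsupported operand type(s) for +: 'NoneType' and 'int'
```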

To work around this scenario, I manually added the following code to the croo.py script:

        if i is None:
            i = -1

This is inserted directly above the following loop:

        while True:
            m = re.search(Croo.RE_PATTERN_INLINE_EXP, result)
            if m is None:
                break
            result = result.replace(m.group(0), str(eval(m.group(1))), 1)

Of course, a developer more knowledgeable about this codebase should investigate a proper way to handle the case where shard_idx is -1.
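For reference, here is a self-contained sketch of what the guard does to an expression such as `${i+1}`. The regex, the function name `interpret_with_guard`, and the template are assumptions for illustration only; they are not taken from croo's source.

```python
import re

# Assumed stand-in for Croo.RE_PATTERN_INLINE_EXP; the real pattern may differ.
INLINE_EXP = re.compile(r"\$\{(.*?)\}")


def interpret_with_guard(template, i=None):
    """Expand ${...} expressions, treating a missing index as -1 (no scatter)."""
    if i is None:
        i = -1  # same guard as the workaround above
    result = template
    while True:
        m = INLINE_EXP.search(result)
        if m is None:
            break
        value = eval(m.group(1), {}, {"i": i})
        result = result.replace(m.group(0), str(value), 1)
    return result


print(interpret_with_guard("rep${i+1}"))     # rep0
print(interpret_with_guard("rep${i+1}", 1))  # rep2
```

With the guard, an unscattered task evaluates `i+1` to 0 instead of raising, so a path like `rep${i+1}` becomes `rep0`. That avoids the crash, but whether `rep0` is the intended label for such a task is a separate question for the maintainers.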