Closed jonn-smith closed 4 years ago
Uh-oh, that's no good. I'll fix this first thing tomorrow.
Yeah - I made some workarounds in my workflows to get past it. I didn't look at the code for the tool, so I'm not sure what exactly it's grabbing that it's not expecting.
I'm not moving the test files, so they'll be there for you to test with.
Fixed and merged, but see https://github.com/broadinstitute/long-read-pipelines/issues/174 for future considerations on this task.
I ran into an issue where `PBUtils.GetRunInfo` is producing some garbage values in the fields it creates based on the input here: gs://broad-dsde-methods-long-reads/covid-19-aziz/small_test_set/

I created that file as a small test set before running the whole dataset again. My processing of the data may have something to do with why the info is getting mangled.
`run_info.txt`:

Note the values of the `DT` and `SM` fields.

DT

The `DT` field should not have the `-t 4 /cromwell_root/broad-dsde-methods-long-reads/covid-19-aziz/Homo_sapiens_assembly38_SARS_CoV-2.fasta -` text in it. Its presence was causing the MiniMap2 alignment tasks to fail in an unexpected way - it was running out of memory because it thought the reference was a FASTA file to align (it was effectively being given that text multiple times as arguments).

SM

The `SM` field should not end in a trailing backslash. This caused `MergeBams` tasks to fail when they looked for bam indices with the literal `\` character in them.

I suspect that all of the parsing this is doing may suffer from similar issues.
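
In case it helps while the parsing gets fixed, here is a minimal sketch of a sanity check that would flag the two symptoms described above (unexpected whitespace in a value and a trailing backslash). This is not the tool's code: the function name `find_suspect_fields`, the idea of holding the parsed fields in a dict, and the example values are all assumptions made up for illustration, and the whitespace rule is only a heuristic.

```python
def find_suspect_fields(fields):
    """Return {key: reason} for field values that look mangled.

    `fields` is assumed to be a dict of field name -> string value,
    e.g. whatever was parsed out of run_info.txt (layout assumed).
    """
    suspect = {}
    for key, value in fields.items():
        if value.endswith("\\"):
            # Matches the SM symptom: a literal trailing backslash.
            suspect[key] = "ends with a backslash"
        elif any(ch.isspace() for ch in value):
            # Matches the DT symptom: command-line text leaked into the value.
            suspect[key] = "contains whitespace"
    return suspect


# Hypothetical values, shaped like the symptoms above (not real data).
example = {
    "DT": "2020-01-01-t 4 /cromwell_root/ref.fasta -",
    "SM": "sample1\\",
    "PU": "unit1",
}

for key, reason in find_suspect_fields(example).items():
    print(f"{key}: {reason}")
```

Failing fast on a check like this before the alignment and merge steps would surface the problem as a clear error, not as an out-of-memory failure or a missing index further downstream.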