Open gbdias opened 8 months ago
I would prefer that this be added to the run's nextflow.config rather than to the whole workflow, but the underlying question is why MitoHiFi is not performing here. Can you elaborate on the problems?
Then let's create a strategy to deal with it.
So what exactly is the cause of the failure? Is MitoHiFi exiting with an error (non-zero exit code), or is Nextflow exiting with an error because certain files are not present? In the latter case we could make a PR to change the module in nf-core modules, or make a local module with that behaviour, so that an empty output could trigger a different process.
It seems we should add a section to do assembly from reads. Should we make this the default, rather than assembly from contigs? Or should it be a complementary assembly strategy like the hifiasm assembly (e.g. so one can assemble just the organelles, or choose mitochondria or chloroplast specifically)?
Exit status 1, failed
22 97/a7593a 45307198 EVALUATE_RAW_ASSEMBLY:MERQURYFK_MERQURYFK (hifiasm-raw-default) COMPLETED 0 2024-03-01 04:18:16.174 4m 44s 52.7s 243.2% 1.1 GB 11.5 GB 11.4 GB 3.3 GB
23 9c/8ef0a5 45307224 MITOHIFI_MITOHIFI (hifiasm-raw-default) FAILED 1 2024-03-01 04:22:20.746 4m 15s 25s - - - - -
19 00/6cbb7e 45307192 EVALUATE_RAW_ASSEMBLY:BUSCO (hifiasm-raw-default-basidiomycota_odb10) ABORTED - 2024-03-01 04:18:15.646 - - - - - - -
Command error:
INFO: Environment variable SINGULARITYENV_TMPDIR is set, but APPTAINERENV_TMPDIR is preferred
INFO: Environment variable SINGULARITYENV_NXF_TASK_WORKDIR is set, but APPTAINERENV_NXF_TASK_WORKDIR is preferred
INFO: Environment variable SINGULARITYENV_NXF_DEBUG is set, but APPTAINERENV_NXF_DEBUG is preferred
INFO: Environment variable SINGULARITYENV_SNIC_TMP is set, but APPTAINERENV_SNIC_TMP is preferred
Matplotlib created a temporary config/cache directory at /scratch/45307224/matplotlib-2e7gi_8t because the default path (/home/guibo205/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
Attention!
'parsed_blast.txt' and 'parsed_blast_all.txt' files are empty.
The pipeline has stopped !! You need to run further scripts to check if you have mito reads pulled to a large NUMT!
There's more or less one alternative strategy in the notes you listed. Did you do the same (i.e. use MBG)? Did that give you a mitogenome? What about oatk?
Yes. I pulled mtDNA reads using minimap2 and a reference, then assembled them de novo using MBG. I'm not sure it is worth implementing this in a module right now, because multiple k-mer and window sizes need to be tested until you get a single circular contig. I'm not sure a set of default parameters would transfer well between datasets, but it could be the target of some development from us.
Haven't tried oatk yet.
Will leave issue open as a reminder to implement alternative mitogenome strategy.
Related paper: MBG https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8521641/
Note: Update the directive value to ignoreThenFail once it gets merged into core Nextflow.
Is your feature request related to a problem? Please describe.
In my experience MitoHiFi often fails, and I then need to employ another strategy to assemble the mtDNA (plus additional organelles).
Describe the solution you'd like
I suggest that a failure in the MitoHiFi process be ignored so that the remaining workflow proceeds normally. For example:
errorStrategy 'ignore'
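To limit this to the MitoHiFi step rather than every process, the directive could go under a process selector in a run-level config. This is only a minimal sketch: it assumes the process is named MITOHIFI_MITOHIFI, as it appears in the trace in this thread, and it is not the pipeline's actual configuration.

```groovy
// Run-level config sketch (e.g. a custom .config file); not the pipeline's own config.
process {
    // Assumption: process name copied from the trace in this issue.
    // Adjust the selector if the pipeline names the process differently.
    withName: 'MITOHIFI_MITOHIFI' {
        // A failed MitoHiFi task is ignored, so the other processes keep running.
        errorStrategy = 'ignore'
    }
}
```

Such a file can be passed to a run with the -c option of nextflow run. Note that an ignored task emits no outputs, so any process that depends on the MitoHiFi result will simply not run for that sample.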