CDCgov / phoenix

🔥🐦🔥PHoeNIx: A short-read pipeline for healthcare-associated and antimicrobial resistant pathogens
Apache License 2.0
[BUG] - AMRFinder failing with core dump error on Terra #143

Closed: cimendes closed this issue 7 months ago

cimendes commented 7 months ago

Describe the bug

When running on Terra with versions v2.1.0 and v2.0.2, the workflow fails with the following error message:

*** ERROR ***
'/opt/conda/envs/amrfinderplus/bin/blastn' -query '2024CW-00002.filtered.scaffolds.fa' -db /cromwell_root/tmp.5ad5e589/amrfinder.GROWyp/db/AMR_DNA-Klebsiella_oxytoca -evalue 1e-20 -dust no -max_target_seqs 10000 -num_threads 2 -mt_mode 1 -outfmt '6 qseqid sseqid qstart qend qlen sstart send slen qseq sseq' -out /cromwell_root/tmp.5ad5e589/amrfinder.GROWyp/blastn > /cromwell_root/tmp.5ad5e589/amrfinder.GROWyp/log 2> /cromwell_root/tmp.5ad5e589/amrfinder.GROWyp/blastn-err
status = 34304
terminate called after throwing an instance of 'ncbi::CCoreException'
terminate called recursively
Aborted (core dumped)

The workflow was run with default parameters, which correspond to 8 CPUs and 64 GB of RAM. I tried running it again with more RAM (128 GB), but the error persisted. I didn't change the CPU count, as the thread count seems to be set to 2 (-num_threads 2 in the command above) regardless of the number of CPUs provided.

Do you have any insight on what could be the cause of this error? Thank you!
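
A note on the "status = 34304" line in the error above: if that number is a raw POSIX wait status, as returned by a system() call (an assumption, the thread does not confirm how AMRFinder reports it), it can be decoded with a little arithmetic. The sketch below uses a hypothetical helper, decode_wait_status, and assumes the Linux layout and signal numbering. It only identifies how blastn ended, not why it aborted.

```python
import signal


def decode_wait_status(status: int) -> dict:
    """Split a 16-bit POSIX wait status into its parts (Linux layout assumed)."""
    exit_code = (status >> 8) & 0xFF  # high byte: exit code of the child process
    term_signal = status & 0x7F       # low 7 bits: signal that killed the child directly
    info = {"exit_code": exit_code, "term_signal": term_signal, "shell_signal": None}
    # A shell reports "command was killed by signal N" as exit code 128 + N.
    if exit_code > 128:
        try:
            info["shell_signal"] = signal.Signals(exit_code - 128).name
        except ValueError:
            pass  # not a known signal number on this platform
    return info


if __name__ == "__main__":
    # 34304 >> 8 == 134, and 134 == 128 + 6 (signal 6 is SIGABRT on Linux)
    print(decode_wait_status(34304))
```

Under those assumptions, the status corresponds to exit code 134, that is, a process terminated by SIGABRT, which matches the "Aborted (core dumped)" line that follows the uncaught ncbi::CCoreException.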

jvhagey commented 7 months ago

Hey @cimendes, I haven't seen this before. Is it sample specific? Are there any public files (raw fastqs) that I can use to recreate this?

michellescribner commented 7 months ago

Hi @jvhagey! Yes, this was a sample-specific issue, and unfortunately the sample has not yet been submitted to a public repository. It will be submitted after other QC metrics can be assessed. We'll try to provide a few more details shortly!

cimendes commented 7 months ago

Hello @jvhagey! After increasing the resources provided to phoenix the workflow finished successfully. I'm closing this issue as phoenix was innocent in this failure 😄

Thank you for the assistance!

jvhagey commented 7 months ago

@cimendes what did you have to raise it to? Just so folks know how to fix it. I assume that since it's sample-specific, it doesn't mean we need to reconsider the default resource settings?

michellescribner commented 7 months ago

@jvhagey, Ines increased the 'memory' parameter in the Phoenix workflow (v2.1.0) on Terra to 128, which resulted in successful workflow completion. To our knowledge this is not a common issue, so it may not require a change to the default resource settings. The reads have now been submitted to NCBI with accession SRR28055183 if you'd like to try for yourself! Thanks again!
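
For anyone who wants to reproduce this, one possible way to pull the reads for SRR28055183 is the SRA Toolkit's fasterq-dump. The sketch below is only an illustration: the helper names (build_fasterq_dump_cmd, fetch_reads) and the "reads" output directory are hypothetical, and it assumes fasterq-dump is installed and on PATH. Nothing in the thread says which download method was used.

```python
import shutil
import subprocess
from typing import List


def build_fasterq_dump_cmd(accession: str, outdir: str) -> List[str]:
    """Build a fasterq-dump command that writes paired reads as separate FASTQ files."""
    return ["fasterq-dump", accession, "--split-files", "--outdir", outdir]


def fetch_reads(accession: str, outdir: str = "reads") -> None:
    """Run fasterq-dump for the accession; raises if the tool is not installed."""
    if shutil.which("fasterq-dump") is None:
        raise RuntimeError("fasterq-dump (SRA Toolkit) was not found on PATH")
    subprocess.run(build_fasterq_dump_cmd(accession, outdir), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(build_fasterq_dump_cmd("SRR28055183", "reads")))
```

Calling fetch_reads("SRR28055183") would perform the actual download. The resulting FASTQ files could then be used as input for a Phoenix run with the default and the increased memory settings to compare behavior.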