Closed fnindo closed 1 year ago
Hi @fnindo, can you please provide the following information? Can you look in the work directory /mnt/Bacteria/mdhhs-bact_results/work/bb/1ccf6fc06bc3b7bab3508104bedb5f and see if any of the files there are empty and what the file names are? Can you also check the DETERMINE_TOP_TAXA and MASH_DIST steps? Usually issues with FastANI like this are actually a problem a couple of steps upstream, so we will just have to track that down.
Hi Jill,
The directory has these files (I just removed the patient identifier and changed it to Sample_ID):
.../work/bb/1ccf6fc06bc3b7bab3508104bedb5f$ ls
sample_ID_best_MASH_hits.txt
Klebsiella_pneumoniae_GCF_003597755.1_ASM359775v1_genomic.fna.gz
Klebsiella_pneumoniae_GCF_023657795.1_ASM2365779v1_genomic.fna.gz
Sample_ID.filtered.scaffolds.fa.gz
Klebsiella_pneumoniae_GCF_013282295.1_ASM1328229v1_genomic.fna.gz
Klebsiella_pneumoniae_GCF_023657815.1_ASM2365781v1_genomic.fna.gz
Klebsiella_pneumoniae_GCF_000465975.2_ASM46597v2_genomic.fna.gz
Klebsiella_pneumoniae_GCF_016903295.1_ASM1690329v1_genomic.fna.gz
Klebsiella_pneumoniae_GCF_023657835.1_ASM2365783v1_genomic.fna.gz
Klebsiella_pneumoniae_GCF_001699105.2_ASM169910v2_genomic.fna.gz
Klebsiella_pneumoniae_GCF_019797985.1_ASM1979798v1_genomic.fna.gz
Klebsiella_pneumoniae_GCF_023657855.1_ASM2365785v1_genomic.fna.gz
Klebsiella_pneumoniae_GCF_003036645.2_ASM303664v2_genomic.fna.gz
Klebsiella_pneumoniae_GCF_021228815.1_ASM2122881v1_genomic.fna.gz
Klebsiella_pneumoniae_GCF_024917555.1_ASM2491755v1_genomic.fna.gz
Klebsiella_pneumoniae_GCF_003597695.1_ASM359769v1_genomic.fna.gz
Klebsiella_pneumoniae_GCF_021460155.1_ASM2146015v1_genomic.fna.gz
Klebsiella_pneumoniae_GCF_024917935.1_ASM2491793v1_genomic.fna.gz
Klebsiella_pneumoniae_GCF_003597715.1_ASM359771v1_genomic.fna.gz
Klebsiella_pneumoniae_GCF_023518315.1_ASM2351831v1_genomic.fna.gz
Klebsiella_pneumoniae_GCF_024918055.1_ASM2491805v1_genomic.fna.gz
Klebsiella_pneumoniae_GCF_003597735.1_ASM359773v1_genomic.fna.gz
Klebsiella_pneumoniae_GCF_023657735.1_ASM2365773v1_genomic.fna.gz
Klebsiella_pneumoniae_GCF_024918095.1_ASM2491809v1_genomic.fna.gz
The command at that process is:
Command executed:

  fastANI \
      -q Sample_ID.filtered.scaffolds.fa.gz \
      --rl Sample_ID_best_MASH_hits.txt \
      -o Sample_ID.ani.txt

  cat <<-END_VERSIONS > versions.yml
  "PHOENIX:PHOENIX_EXTERNAL:FASTANI":
      fastani: $(fastANI --version 2>&1 | sed 's/version//;')
  END_VERSIONS

Command exit status: 139
Command output: (empty)
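
For what it's worth, an exit status above 128 normally means the process was killed by a signal, with the signal number being the status minus 128. For 139 that gives signal 11, which on Linux is a segmentation fault, so fastANI most likely crashed rather than printing an error of its own. A minimal sketch of that arithmetic (the helper name signal_from_status is made up for illustration):

```shell
# Hypothetical helper: map a shell exit status to the signal number it implies.
# Statuses of 128 or below are ordinary exit codes, so 0 is printed for them.
signal_from_status() {
    code=$1
    if [ "$code" -gt 128 ]; then
        echo $((code - 128))
    else
        echo 0
    fi
}

# signal_from_status 139   prints 11
# kill -l 11               prints the name of signal 11
```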
Here are the next steps for tracking down the problem:
We have seen issues where the sample name is not parsed correctly and fastANI then fails. Can you confirm that the file names passed to fastANI in the command match the names of the files that are in the folder?
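
One way to run that name check, sketched under a few assumptions: you are inside the work directory, the file given to --rl lists one reference path per line (which is how fastANI documents that option), and the helper names missing_inputs and missing_from_list are made up for this example.

```shell
# Hypothetical helper: print every argument that is not an existing file.
# [ -f ] follows symlinks, so a broken staged link is reported as missing.
missing_inputs() {
    for f in "$@"; do
        [ -f "$f" ] || echo "missing: $f"
    done
}

# Hypothetical helper: print every path listed in a reference list file
# (one path per line) that does not exist. Blank lines are skipped.
missing_from_list() {
    while IFS= read -r path || [ -n "$path" ]; do
        [ -z "$path" ] && continue
        [ -e "$path" ] || echo "missing: $path"
    done < "$1"
}

# Example, run from inside the work directory:
#   missing_inputs Sample_ID.filtered.scaffolds.fa.gz Sample_ID_best_MASH_hits.txt
#   missing_from_list Sample_ID_best_MASH_hits.txt
```

No output from either call means every name resolved to a file.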
Rather than just running ls, can you use ls -lh and confirm that the files aren't empty? In each line of that output, reading right to left, you should see the file name, the time, the date and then the size of the file. Just confirm the size isn't zero for any of them.
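
If there are many files, the same size check can be scripted. This is only a sketch: empty_files is a made-up helper name, and it assumes a find that supports -maxdepth (GNU and BSD find both do). The -L option is there because Nextflow usually stages inputs as symlinks, in which case a plain listing shows the size of the link rather than of the file it points to.

```shell
# Hypothetical helper: list zero-byte regular files directly inside a directory.
# -L follows symlinks, so a staged link that points at an empty file is listed too.
# Hidden files such as .command.out are left out to keep the focus on the inputs.
empty_files() {
    find -L "$1" -maxdepth 1 -type f -size 0 ! -name '.*'
}

# Example, run from inside the work directory:
#   empty_files .
```

An empty result means no zero-byte input was found. Keep in mind that a gzip file wrapping an empty payload is still a few tens of bytes on disk, so a non-zero size does not prove the genome inside has any sequence.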
Have a look in .command.out and .command.err (using cat or more), which will be in the work dir /work/bb/1ccf6fc06bc3b7bab3508104bedb5f. These are "hidden files", so you need to run ls -la to see them. These files will likely just repeat the error from the Nextflow exit message, but it is worth checking to confirm there isn't another error in there.
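
A small sketch for dumping both files in one go, so nothing is missed if one of them is empty. The helper name show_task_logs is made up for this example; the two file names are the ones mentioned above.

```shell
# Hypothetical helper: print .command.out and .command.err from a work directory,
# with a header per file and a note when a file is empty or absent.
show_task_logs() {
    for name in .command.out .command.err; do
        echo "== $name =="
        if [ -s "$1/$name" ]; then
            cat "$1/$name"
        else
            echo "(empty or missing)"
        fi
    done
}

# Example, run from inside the work directory:
#   show_task_logs .
```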
If the above things look correct, we will need to check upstream. To do this, go into the .nextflow.log file, which will be in the directory where you ran PHoeNIx from. Again, this is a "hidden file", so you need to run ls -la to see it. Go into that file and search for the DETERMINE_TOP_TAXA and MASH_DIST steps. You can either open it in an editor or use
cat .nextflow.log | grep "] Submitted process > PHOENIX:PHOENIX_EXTERNAL:DETERMINE_TOP_TAXA"
You are looking for a line like this:
Jul-14 12:06:41.069 [Task submitter] INFO nextflow.Session - [80/bbeb64] Submitted process > PHOENIX:PHOENIX_EXTERNAL:DETERMINE_TOP_TAXA (SampleID)
The [80/bbeb64] part of the line tells us where to find the work directory for that step. So go into the folder .../work/80/bbeb64... (you will need to press tab after bbeb64 to autocomplete the rest of the folder name) and have a look in .command.out and .command.err (using cat or more).
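
The lookup from log line to work directory can also be scripted. The sketch below assumes the log lines have the shape shown in the example line above; the helper names task_hashes and task_workdir are made up for illustration.

```shell
# Hypothetical helper: print the hash prefix (for example 80/bbeb64) of every
# "Submitted process" line in a Nextflow log that mentions the given process.
task_hashes() {
    grep -F "Submitted process > $2" "$1" |
        sed -n 's|.*\[\([0-9a-f]\{2\}/[0-9a-f]\{6\}\)\] Submitted process.*|\1|p'
}

# Hypothetical helper: expand a hash prefix to the full work directory,
# which is what pressing tab does interactively.
task_workdir() {
    for d in "$1/$2"*/; do
        if [ -d "$d" ]; then
            echo "${d%/}"
        fi
    done
}

# Example, run from the directory the pipeline was launched from:
#   task_hashes .nextflow.log PHOENIX:PHOENIX_EXTERNAL:DETERMINE_TOP_TAXA
#   task_workdir work 80/bbeb64
```

If a process was retried, task_hashes prints one prefix per attempt, so check each of the resulting directories.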
It was confirmed that the reference genomes from the MASH_DIST step that were being passed down to FastANI were empty. The system admin for @fnindo identified proxy issues that led to failed retrieval of the reference genomes. Once the proxy issue was resolved, the analysis ran successfully.
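
For anyone hitting the same symptom, a quick way to spot reference genomes that did not download properly is to check that each .fna.gz file decompresses and starts with a FASTA header. This is a sketch only: gz_fasta_ok is a made-up helper name, and treating a first line that starts with ">" as good enough is an assumption, not something PHoeNIx itself does.

```shell
# Hypothetical helper: succeed only if the file is valid gzip and its first
# decompressed line starts with ">" (a FASTA header). An empty download, a
# truncated file or an error page saved under the genome's name all fail.
gz_fasta_ok() {
    gzip -cd -- "$1" 2>/dev/null | head -n 1 | grep -q '^>'
}

# Example, run from inside the work directory:
#   for f in *.fna.gz; do gz_fasta_ok "$f" || echo "bad reference: $f"; done
```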
[33/7af312] NOTE: Process PHOENIX:PHOENIX_EXTERNAL:FASTANI (SampleID) terminated with an error exit status (139) -- Execution is retried (1)
Error executing process > 'PHOENIX:PHOENIX_EXTERNAL:FASTANI (SampleID)'

Caused by:
  Process PHOENIX:PHOENIX_EXTERNAL:FASTANI (SampleID) terminated with an error exit status (139)

Command executed:

  fastANI \
      -q SampleID.filtered.scaffolds.fa.gz \
      --rl SampleID_best_MASH_hits.txt \
      -o SampleID.ani.txt

  cat <<-END_VERSIONS > versions.yml
  "PHOENIX:PHOENIX_EXTERNAL:FASTANI":
      fastani: $(fastANI --version 2>&1 | sed 's/version//;')
  END_VERSIONS

Command exit status: 139
Command output: (empty)
Command error:
Work dir: /mnt/Bacteria/mdhhs-bact_results/work/bb/1ccf6fc06bc3b7bab3508104bedb5f
Tip: when you have fixed the problem you can continue the execution adding the option -resume to the run command line