Open iek opened 3 years ago
Hi. Thanks for the bug report. I will look at this as soon as I have time.
Thank you very much!
Hi iek, sorry for the late reply on this, we've been very busy. Could you share your mummer/cmd_delta.txt file and see if you can run one of the command lines found inside?
Hi everyone, we've worked on this issue with another user, and it might have several origins.
If you see this error for the first time
Please look first at the first lines of your intermediate file mummer/cmd_delta.txt, which should contain multiple nucmer command lines. Pick one and try to run it without the redirection at the end (> /dev/null 2> /dev/null). Note that the behavior of this command line might differ depending on whether you run it inside an interactive session or from a cluster job. Don't forget to load the environment used for your OPERA-MS run too.
See below if your error is listed; otherwise let us know by posting your error in this thread in a new comment.
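The check described above can be sketched in shell. This is only a minimal sketch, not part of OPERA-MS: the function name strip_redirection is hypothetical, and it assumes each line of cmd_delta.txt ends with exactly the "> /dev/null 2> /dev/null" suffix quoted above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of OPERA-MS): read command lines on stdin and
# drop a trailing "> /dev/null 2> /dev/null" so nucmer's messages stay visible.
# Lines without that suffix are printed unchanged.
strip_redirection() {
    sed -E 's#[[:space:]]*>[[:space:]]*/dev/null[[:space:]]+2>[[:space:]]*/dev/null[[:space:]]*$##'
}

# Example (the relative path is an assumption; adjust it to your output directory):
#   head -n 1 mummer/cmd_delta.txt | strip_redirection
# Inspect the printed command, then run it by hand in the same environment
# (modules, conda env, cluster node) that was used for the OPERA-MS run.
```

Printing the command instead of executing it directly is deliberate: it lets you read the paths first and decide whether to run it interactively or inside a cluster job.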
Can't locate Foundation.pm in @INC
It looks like Perl is not able to find the Foundation module linked to MUMmer. Try a clean reinstall of MUMmer in tools_opera_ms/MUMmer3.23. Look at tools_opera_ms/install_mummer3.23.sh to see how it was installed automatically with OPERA-MS. Make sure the MUMmer directory has not been moved since installation. It is possible that on some clusters the configuration leads to errors; check with your system admin. Ultimately, you need to check that the executable tools_opera_ms/MUMmer3.23/nucmer works. This means that you can also provide another nucmer executable through a symlink; note, however, that we have not tested OPERA-MS with MUMmer versions other than 3.23.
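A rough way to run these checks from the shell is sketched below. The function check_mummer_dir is hypothetical and not shipped with OPERA-MS or MUMmer. It also assumes that nucmer is a Perl script whose library path appears on a "use lib" line; if your copy looks different, the last step simply prints nothing.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of OPERA-MS): sanity-check a MUMmer directory.
# Usage: check_mummer_dir /path/to/OPERA-MS/tools_opera_ms/MUMmer3.23
check_mummer_dir() {
    local dir="$1" status=0
    if [ ! -x "$dir/nucmer" ]; then
        echo "nucmer is missing or not executable in $dir" >&2
        status=1
    fi
    # Search the whole tree instead of assuming where Foundation.pm lives.
    if [ -z "$(find "$dir" -name 'Foundation.pm' -print 2>/dev/null | head -n 1)" ]; then
        echo "Foundation.pm not found under $dir" >&2
        status=1
    fi
    # Assumption: nucmer is a Perl script with "use lib" lines. If a printed
    # path no longer exists, the directory was probably moved after install.
    grep -n '^use lib' "$dir/nucmer" 2>/dev/null || true
    return "$status"
}
```

A non-zero return value only says that one of the two basic checks failed; running nucmer itself on a small input remains the real test.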
In the long run
This issue is part of our motivation to provide a conda packaging with external tools not being directly linked to OPERA-MS. We're still working on this, sorry for the delay.
I have a similar error on my own dataset (it works fine on the test dataset): the OPERA-MS log says there's an error in gap filling and invites me to check gap_filling.err. I do, and it says there was an error during tiling generation and invites me to check tilling_1.out and tilling_1.err. The .out exists but is empty, and in the .err, in the " *** run the mummer mapping" step, there's an "Error in during nucmer."
In the "intermediate_files/opera_long_read/GAPFILLING/mummer/cmd_delta.txt" file, there's only a single nucmer command line:
/scratch/nimauric/metagenomic_benchmark/workflow/dependencies/OPERA-MS//tools_opera_ms//MUMmer3.23//nucmer --nosimplify --maxmatch -p /scratch/nimauric/metagenomic_benchmark/workflow/../data/assemblies/opearams_SRR8073716_toy-SRR8073713///intermediate_files/opera_long_read/GAPFILLING/mummer/split_1.fa_split_1.fa /scratch/nimauric/metagenomic_benchmark/workflow/../data/assemblies/opearams_SRR8073716_toy-SRR8073713///intermediate_files/opera_long_read/GAPFILLING/TILLING/REF/split_1.fa /scratch/nimauric/metagenomic_benchmark/workflow/../data/assemblies/opearams_SRR8073716_toy-SRR8073713///intermediate_files/opera_long_read/GAPFILLING/TILLING/QUERY/split_1.fa > /dev/null 2> /dev/null
When I run it without the "> /dev/null 2> /dev/null", I get this output :
1: PREPARING DATA
2,3: RUNNING mummer AND CREATING CLUSTERS
# reading input file "/scratch/nimauric/metagenomic_benchmark/workflow/../data/assemblies/opearams_SRR8073716_toy-SRR8073713///intermediate_files/opera_long_read/GAPFILLING/mummer/split_1.fa_split_1.fa.ntref" /scratch/nimauric/metagenomic_benchmark/workflow/dependencies/OPERA-MS/tools_opera_ms/MUMmer3.23/mummer: empty sequence in multiple fasta file
ERROR: mummer and/or mgaps returned non-zero
I checked the content of "/scratch/nimauric/metagenomic_benchmark/workflow/../data/assemblies/opearams_SRR8073716_toy-SRR8073713///intermediate_files/opera_long_read/GAPFILLING/mummer/split_1.fa_split_1.fa.ntref", and it only contains this line:
>allcontigs /scratch/nimauric/metagenomic_benchmark/workflow/../data/assemblies/opearams_SRR8073716_toy-SRR8073713/intermediate_files/opera_long_read/GAPFILLING/TILLING/REF/split_1.fa
"/scratch/nimauric/metagenomic_benchmark/workflow/../data/assemblies/opearams_SRR8073716_toy-SRR8073713/intermediate_files/opera_long_read/GAPFILLING/TILLING/REF/split_1.fa" itself exists, but is an empty file.
I also have a "nucmer.error" file, which contains : 20230217|150203| 12678| ERROR: mummer and/or mgaps returned non-zero
I tried reinstalling mummer, and it didn't fix the problem.
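For anyone hitting the same "empty sequence in multiple fasta file" message, the manual inspection above can be sketched as a small shell check. The function fasta_has_sequence is a hypothetical helper, not part of OPERA-MS or MUMmer; it treats a file as usable only if it has at least one non-blank line that is not a ">" header.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of OPERA-MS or MUMmer): succeed only if the
# FASTA file contains at least one non-blank line that is not a ">" header.
# Zero-byte files and header-only files both fail the check.
fasta_has_sequence() {
    [ -s "$1" ] && grep -v '^>' "$1" | grep -q '[^[:space:]]'
}

# Example: check the reference and query inputs named in a nucmer command.
# The relative paths are assumptions based on the layout shown in this thread.
#   for f in TILLING/REF/split_1.fa TILLING/QUERY/split_1.fa; do
#       fasta_has_sequence "$f" || echo "no sequence in $f"
#   done
```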
Hi Bordeterre,
sorry to hear that! It looks like OPERA-MS generated an empty fasta file in your case. I'll look into this as soon as possible but it might be a rare unlucky error. Maybe resampling reads could solve this issue if you want to try a very quick fix.
Note for me: Check if this does not result from a special case within split_fasta_file function.
JS
Hi Bordeterre,
could you check for me the content of your file /scratch/nimauric/metagenomic_benchmark/workflow/../data/assemblies/opearams_SRR8073716_toy-SRR8073713/intermediate_files/opera_long_read/GAPFILLING/TILLING/consensus.fa?
If it's not empty, could you send it to me for testing purposes?
If it is empty, could you check your log file at intermediate_files/opera_long_read/GAPFILLING/consensus_cmd.sh and try to run one of the commands here (without the log redirection)?
Thank you very much and sorry for this issue! JS
Hi JS,
/scratch/nimauric/metagenomic_benchmark/workflow/../data/assemblies/opearams_SRR8073716_toy-SRR8073713/intermediate_files/opera_long_read/GAPFILLING/TILLING/consensus.fa is indeed an empty file.
intermediate_files/opera_long_read/GAPFILLING/consensus_cmd.sh is also an empty file.
Concerning the resampling of reads, and perhaps I should have mentioned it sooner: I have a biased dataset. While the short and long reads were sequenced from the same community, I filtered the long reads so as to keep only those that map to one of two specific bacteria present in the community (as a way to produce a lighter dataset for faster testing). This filtered dataset did produce satisfactory contigs with a pure long-read assembler. Do you think the problem might stem from this discrepancy between the short-read and long-read datasets?
Thanks for this assembler and your help on this issue, NM
OK, this is definitely very weird. I'm not sure of the exact source of the error, though.
Could you share the content of your gapfilling directory?
tree /scratch/nimauric/metagenomic_benchmark/workflow/../data/assemblies/opearams_SRR8073716_toy-SRR8073713/intermediate_files/opera_long_read/GAPFILLING/
If you have the file extract_read.err inside this folder, could you copy its content here?
Thanks, JS
With ls -lh, I get:
-rw-r--r-- 1 nimauric genscale 0 Feb 17 14:32 consensus_cmd.sh
-rw-r--r-- 1 nimauric genscale 0 Feb 17 14:32 consensus_cmd.sh.log
-rw-r--r-- 1 nimauric genscale 0 Feb 17 14:32 contig_extention.log
-rw-r--r-- 1 nimauric genscale 0 Feb 17 14:32 edge_err
-rw-r--r-- 1 nimauric genscale 783 Feb 17 14:32 extract_read.err
-rw-r--r-- 1 nimauric genscale 4.7K Feb 17 14:32 gap_filling.err
-rw-r--r-- 1 nimauric genscale 0 Feb 17 14:32 gap_size.dat
drwxr-xr-x 2 nimauric genscale 0 Feb 17 14:32 LOG
drwxr-xr-x 2 nimauric genscale 5 Feb 17 14:53 mummer
drwxr-xr-x 4 nimauric genscale 5 Feb 17 14:32 TILLING
In extract_read.err, I have:
*** Starting the sequence extraction
rm -rf /scratch/nimauric/metagenomic_benchmark/workflow/../data/assemblies/opearams_SRR8073716_toy-SRR8073713///intermediate_files/opera_long_read/GAPFILLING/LOG
mkdir -p /scratch/nimauric/metagenomic_benchmark/workflow/../data/assemblies/opearams_SRR8073716_toy-SRR8073713///intermediate_files/opera_long_read/GAPFILLING/LOG
*** Number of edges selected for gapfilling 0
/scratch/nimauric/metagenomic_benchmark/workflow/../data/assemblies/opearams_SRR8073716_toy-SRR8073713///intermediate_files/opera_long_read/GAPFILLING/contig_extention.log
*** Extract gap sequences from read file /scratch/nimauric/metagenomic_benchmark/workflow/../data/input_reads/toy-SRR8073713.fastq
*** Reading contig file
*** Read the scaffold file and fill gaps
Sorry for the late reply. No edges were selected for gapfilling, which is why OPERA-MS is crashing. This is something we should be able to fix, but I try not to touch the OPERA-MS code more than I have to. I imagine this would not happen with real data and might originate from your test dataset. Would you mind increasing the number of long reads, if you haven't done so already?
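For others who land here, the diagnosis above can be checked quickly from the shell. The sketch below is not part of OPERA-MS: edges_selected is a hypothetical helper, and it assumes the log line has exactly the wording shown in the extract_read.err posted in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of OPERA-MS): print the number reported on the
# "*** Number of edges selected for gapfilling N" line of an extract_read.err log.
# Prints nothing if the log has no such line.
edges_selected() {
    sed -n 's/^\*\*\* Number of edges selected for gapfilling \([0-9][0-9]*\).*$/\1/p' "$1"
}

# Example (the relative path is an assumption; adjust it to your output directory):
#   n=$(edges_selected intermediate_files/opera_long_read/GAPFILLING/extract_read.err)
#   [ "$n" = "0" ] && echo "no edges selected: the gapfilling inputs will be empty"
```

A value of 0 matches the situation in this thread, where the empty FASTA files downstream were a consequence of the read subsampling and not of the MUMmer installation.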
It did come from the biased subsampling, and OPERA-MS produces an assembly on the full dataset.
Thank you !
Hello, I am trying to run OPERA-MS on the test dataset and have run into an error about gapfilling and MUMmer. To troubleshoot, I removed the "> /dev/null 2> /dev/null" redirection in "run_mummer_large_ref.pl" because I was unable to see exactly what the error was. I then received this message:
It seems that Mummer is missing the Query file?