The AMOS log was a nice inclusion!
Can you try running
LD_LIBRARY_PATH=/cluster/projects/nn9745k/scripts/conda_envs/squeezemeta/SqueezeMeta/bin/AMOS/../../lib/mummer /cluster/projects/nn9745k/scripts/conda_envs/squeezemeta/SqueezeMeta/bin/AMOS/../mummer/nucmer --maxmatch --threads 12 -c 100 /cluster/work/users/alexaei/02_results/13_svalbard_metaGs/seqmerge/temp/mergedassemblies.seqmerge.ref.seq /cluster/work/users/alexaei/02_results/13_svalbard_metaGs/seqmerge/temp/mergedassemblies.seqmerge.qry.seq -p /cluster/work/users/alexaei/02_results/13_svalbard_metaGs/seqmerge/temp/mergedassemblies.seqmerge
Let's see if we get a more detailed error message there
Hello, I think that this issue is related to memory usage by minimus2, like the one in #109. Indeed, the syslog file contains many instances of that very same error:
k-mer db: Building database
sh: line 1: 71296 Killed LD_LIBRARY_PATH=/cluster/projects/nn9745k/scripts/conda_envs/squeezemeta/SqueezeMeta/lib /cluster/projects/nn9745k/scripts/conda_envs/squeezemeta/SqueezeMeta/bin/kmer-db build -t 12 /cluster/work/users/alexaei/02_results/13_svalbard_metaGs/seqmerge/temp/samples.seqmerge.txt /cluster/work/users/alexaei/02_results/13_svalbard_metaGs/seqmerge/temp/kmerdb.seqmerge.txt -k 12 > /dev/null 2>&1
Error running command: LD_LIBRARY_PATH=/cluster/projects/nn9745k/scripts/conda_envs/squeezemeta/SqueezeMeta/lib /cluster/projects/nn9745k/scripts/conda_envs/squeezemeta/SqueezeMeta/bin/kmer-db build -t 12 /cluster/work/users/alexaei/02_results/13_svalbard_metaGs/seqmerge/temp/samples.seqmerge.txt /cluster/work/users/alexaei/02_results/13_svalbard_metaGs/seqmerge/temp/kmerdb.seqmerge.txt -k 12 > /dev/null 2>&1 at /cluster/projects/nn9745k/scripts/conda_envs/squeezemeta/SqueezeMeta/lib/SqueezeMeta/kmerdist.pl line 65.
Could you try the same solution given in that issue?
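To check whether the process was indeed killed for exceeding the job's memory, the Slurm accounting records can be queried (just a sketch using the standard sacct tool; replace <jobid> with your job ID):
sacct -j <jobid> --format=JobID,JobName,ReqMem,MaxRSS,State,ExitCode
# A State of OUT_OF_MEMORY, or a MaxRSS close to ReqMem, would confirm that kmer-db was killed by the out-of-memory killer.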
Hello, I have redone 01.merge_sequential.pl but it is still crashing with big datasets. However, we are trying to decrease the k-mer size in kmer-db, and so far it is working with just 16 GB RAM. Decreasing the k-mer size implies fewer k-mers, and therefore reduces memory usage. If you want to try, edit the script kmerdist.pl in the lib/SqueezeMeta directory. There, change line 61:
$command="$kmerdb_soft build -t $numthreads $samples $kmerdb > /dev/null 2>&1";
to
$command="$kmerdb_soft build -t $numthreads $samples $kmerdb -k 12 > /dev/null 2>&1";
That will make kmer-db use a k-mer size of 12 instead of the original 18. Since we just want to calculate an approximate similarity measure between metagenomes, it will suffice for our purposes.
Best, J
Thanks. This is the output that I get. Not sure if this helps.
The following modules were not unloaded: (Use "module --force purge" to unload all):
  1) StdEnv
ERROR: failed to merge alignments at position 256
Please file a bug report
Task and CPU usage stats:
JobID         JobName  AllocCPUS  NTasks  MinCPU      MinCPUTask  AveCPU      Elapsed     ExitCode
4074012       sqmeta   12                                                     1-02:45:37  1:0
4074012.bat+  batch    12         1       13-08:26:+  0           13-08:26:+  1-02:45:37  1:0
4074012.ext+  extern   12         1       00:00:00    0           00:00:00    1-02:45:37  0:0
Memory usage stats:
JobID         MaxRSS     MaxRSSTask  AveRSS     MaxPages  MaxPagesTask  AvePages
4074012
4074012.bat+  19248480K  0           19248480K  0         0             0
4074012.ext+  0          0           0          0         0             0
Disk usage stats:
JobID         MaxDiskRead  MaxDiskReadTask  AveDiskRead  MaxDiskWrite  MaxDiskWriteTask  AveDiskWrite
4074012
4074012.bat+  33.41M       0                33.41M       0.06M         0                 0.06M
4074012.ext+  0.00M        0                0.00M        0             0                 0
Job 4074012 completed at Mon Oct 25 13:55:01 CEST 2021.
I also attached my kmerdist.pl.
Best, Ale
kmerdist.pl.txt
What is the syslog for this last run? Best, Fernando
When we ran the command that you provided, it did not update the syslog.
Ah, so that output was for the command I sent? Ok, so nucmer is failing for that merge. Maybe we could try updating it to the latest version and see if that somehow helps. Still, it might be better to first try the solution proposed by @jtamames, since it's true that kmer-db had been killed (possibly out of memory) before nucmer was run. How much memory are you requesting in your cluster? Best, Fernando
Attached is the slurm script run_squezzemeta_seqmerge_rerun.slurm.txt
If you check my kmerdist.pl you should see that I adapted the suggestion by @jtamames
Yes, I saw that. However, the command I told you to run was just to check whether nucmer was failing (it is), so the modifications from @jtamames didn't come into play yet. So nucmer is failing for sure. Looking at the batch script, I think you are requesting enough memory (unless your assemblies are really big). However, since kmer-db was also failing before nucmer (which I didn't realize at the time I wrote the first message), I would try to look into that first. I think there is a way to repeat the merging step (so that the changes to the script apply) without having to restart everything from scratch. @jtamames can you help with this?
Yeah, please check #351 for restarting seqmerges
Checking the syslog. In the last runs the kmer-db and nucmer seemed to run fine. It crashes in the scaffolding step if I am not mistaken.
By looking at the lines before the last crash, you might be right.
Total CPU time 160878.05
Transforming to afg format: /cluster/projects/nn9745k/scripts/conda_envs/squeezemeta/SqueezeMeta/bin/AMOS/toAmos -s /cluster/work/users/alexaei/02_results/13_svalbard_metaGs/seqmerge/temp/mergedassemblies.seqmerge.99.fasta -o /cluster/work/users/alexaei/02_results/13_svalbard_metaGs/seqmerge/temp/mergedassemblies.seqmerge.afg
Merging with minimus2: /cluster/projects/nn9745k/scripts/conda_envs/squeezemeta/SqueezeMeta/bin/AMOS/minimus2_mod /cluster/work/users/alexaei/02_results/13_svalbard_metaGs/seqmerge/temp/mergedassemblies.seqmerge -D OVERLAP=100 -D MINID=95 -D THREADS=12 > /dev/null 2>&1
Stopping in STEP1.5 -> 01.merge_sequential.pl. Program finished abnormally
Then I wonder why kmer-db and nucmer failed in previous runs and worked in that one...
Anyways, then maybe you can try running the last command that failed alone, removing the redirection to /dev/null, so that it's hopefully verbose about what's happening:
/cluster/projects/nn9745k/scripts/conda_envs/squeezemeta/SqueezeMeta/bin/AMOS/minimus2_mod /cluster/work/users/alexaei/02_results/13_svalbard_metaGs/seqmerge/temp/mergedassemblies.seqmerge -D OVERLAP=100 -D MINID=95 -D THREADS=12
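To also keep a copy of that output, the same command can be run with its output saved to a file (just a sketch using the standard tee utility; the log file name is arbitrary):
/cluster/projects/nn9745k/scripts/conda_envs/squeezemeta/SqueezeMeta/bin/AMOS/minimus2_mod /cluster/work/users/alexaei/02_results/13_svalbard_metaGs/seqmerge/temp/mergedassemblies.seqmerge -D OVERLAP=100 -D MINID=95 -D THREADS=12 2>&1 | tee minimus2_debug.log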
I increased the memory
Here is the output from the last run. It seems like nucmer has issues.
Hi!
What's the memory capacity of your nodes?
The MaxRSS of 19338352K may be due to a cap of 20 GB for this task.
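If the job is indeed hitting a memory cap, the request can be raised in the batch script; a minimal sketch of the relevant #SBATCH directives (assuming memory is set via --mem; the exact value depends on what your nodes can provide):
#SBATCH --cpus-per-task=12
#SBATCH --mem=64G    # total memory for the job; raise this above the ~20 GB the task is currently peaking at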
Hi. Thanks for your help so far. I fixed the memory issues, but I am now running into time issues. Any suggestions to optimize the scaffolding? I already removed contigs shorter than 500 bp and ran it for 14 days on 12 threads.
Hi,
Sorry for the late response. Time issues might be difficult to solve. Minimus2 is the main blocker in this regard, as it works well for a moderate number of samples but scales poorly when the input data become large. We have not been able to find a direct replacement for it, so at some point you will have to use either a coassembly (if you have enough RAM) or analyze each sample separately with the sequential mode.
An intermediate road would be to divide your samples into groups. For example, you could use Mash to calculate pairwise distances between your metagenomes, cluster them based on similarity, and then analyze each cluster of samples independently.
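A rough sketch of how that could look with the standard Mash commands (file names, sketch size and read options are just examples to adapt to your data):
mash sketch -r -m 2 -s 10000 -o metaG_sketches sample1_R1.fastq.gz sample2_R1.fastq.gz sample3_R1.fastq.gz   # sketch each metagenome from its reads
mash dist metaG_sketches.msh metaG_sketches.msh > mash_distances.tsv                                         # all-vs-all distances
# mash_distances.tsv can then be fed to any hierarchical clustering routine to define the groups of similar samples.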
Hi @fpusan , Thanks for a great tool and being so responsive. I've used SqueezeMeta for smaller batches of ~15 samples. Now I have a very large project (485 samples, although this includes positive/negative controls and some low-read samples I could remove; >800 GB, paired reads, sample files highly variable in read count but median 10 M reads) and I'm trying to strategize my approach. Initially I was planning on using seqmerge mode, but reading this issue I'm concerned that merging time due to minimus2 will be a problem for me too, given the number of samples and the n-1 merges needed. I can remove some low-read-count samples, but while this would reduce the number of merges needed, it would not reduce the size of the pooled samples very much.
My goal is to compare general taxonomic and functional attributes across conditions but I'm also interested in bins. The following resources are available to me on a HPC, max walltime of 14 days:
Since I don't think Megahit is MPI-enabled, I'm stuck with these per-node limits. So far, reading over the different issues on this, I think there are maybe a few different strategies:
Do you have any suggestions as to which would be better? or is there something else I should consider doing?
Hi!
I'm a bit short of time but I'll try to give a quick answer. I would go for the first option you propose:
1) Run SqueezeMeta individually for each sample, using only metabat2 for binning.
2) Combine the results as described in #153.
3) Get all the bins together and merge them into ANI > 95% clusters.
4) You can use dRep to dereplicate, but by doing so you will lose some info about the accessory genome. I have a new preprint on how to tackle this, maybe it will be of interest? (https://www.biorxiv.org/content/10.1101/2022.03.25.485477v1)
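For step 4, a minimal sketch of how the dereplication call could look with dRep (directory names are examples; -sa 0.95 sets the secondary ANI threshold to match the ANI > 95% clusters above):
dRep dereplicate drep_out -g all_bins/*.fa -sa 0.95 -p 12
# drep_out/dereplicated_genomes/ should then contain one representative bin per ANI > 95% cluster.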
Of course the option with MASH is also cool and sound, and it will assemble more low abundance taxa. But the first one is easier and scales better. If you end up comparing both, I'd love to know the results!
Also I noticed that you posted something about not being able to install SqueezeMeta with mamba, but now can't find the issue. Is the problem still happening?
Thank you! I will do as you suggest at first. I am a little concerned because some low-read samples will probably not assemble into very good contigs on their own, but those samples probably shouldn't be analyzed anyway. I'll read your preprint, thanks very much! Depending on how it goes, I may be back with more questions :grimacing:
Yes, sorry, I deleted the mamba comment. In order to install using mamba, I had to create a new base environment instead of using the base conda environment provided on my HPC. Then I didn't realize that in my new base environment there was no mkl. Installing numpy into the new base environment got me mkl, so it was just me being dense.
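For anyone hitting the same problem, one rough sketch of that workaround (environment names are arbitrary, and the SqueezeMeta channel/package spec should be double-checked against the current README):
conda create -n mamba-env -c conda-forge mamba numpy   # fresh environment with mamba, plus numpy so that mkl gets pulled in
conda activate mamba-env
mamba create -n SqueezeMeta -c conda-forge -c bioconda -c fpusan squeezemeta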
Ah good to know! Next time maybe close the issue rather than deleting it, your experience may help others, and also I will be happier knowing that for once something was not my fault!
Attached are some relevant log files and the run script.
mergelog.txt syslog.txt slurm-4025637.out.txt mergedassemblies.seqmerge.runAmos.log.txt