jtamames / SqueezeMeta

A complete pipeline for metagenomic analysis
GNU General Public License v3.0

Is it possible running SQM in Merged mode with MinION long reads? #598

Closed nutrimol closed 1 year ago

nutrimol commented 1 year ago

Dear developers,

My data set consists of 3 metagenomes and 3 metatranscriptomes obtained with a MinION sequencer. I have tried SqueezeMeta in merged mode to follow the strategy suggested in "Combining-metagenomes-and-metatranscriptomes-with-SqueezeMeta", but so far without success.

I tried two different options (among others):

  1. Providing the paths to the six fasta files as extassemblies (polished assemblies of the respective metagenomes and metatranscriptomes) in addition to the six fastq.gz files. (Attached: samples.samples, syslog)

  2. Dropping the extassemblies option from the samples.samples file and providing only the fastq.gz files containing the reads. (Attached: samples.samples, syslog, Megahit log)

In the second case, the program is killed when running Megahit; it reports an error when reading the inputs. I tried different fastq files and got the same result. In the first case I am not sure where the problem is, because the ./data/megahit/ folder is not even created when it stops.

I was wondering whether this strategy is simply not compatible with MinION long reads, or whether I am missing something important here. I have already performed the same analysis in co-assembly mode with very good results. This time I was exploring merged mode, since the idea of recovering the transcripts that did not map to the metagenomes was very attractive to me!

If it is not possible to use merged mode with Nanopore reads, what would you suggest for recovering the information (transcripts) that was not found in the assembled metagenomes? I would really appreciate your opinion on this!

Thank you so much in advance,

Virginia.

jtamames commented 1 year ago

Hello,

In the first case something is going wrong: according to your samples file, it should not proceed with the assembly, but it does. I will dig into that. The crash itself, however, is a Megahit issue; both times it crashes when trying to assemble the reads with Megahit. Why are you specifying Megahit as the assembler for MinION reads? You would be better off using Flye or Canu, which are tailored to MinION reads. Could you try one of the two? As I said, I will check the bug in the first case.

Best, J

nutrimol commented 1 year ago

Dear Javier,

Thank you for digging into that. Regarding the use of Flye as the assembler, that was my first thought, but I also got an error when using -a flye in the SQM command. In that case we were providing the extassemblies for the three metagenomes, and only the raw reads for the three metatranscriptomes:

Usage: SqueezeMeta.pl -m <mode> -p <project name> -s <samples file> -f <sequence dir> [options]
Invalid combination of mode and assembler
 (We are sorry for this, the low number of contigs provided by Flye prevents minimus2 needed in merged mode to work correctly
 Please use coassembly, or a different assembler)

I've submitted a job without extassemblies and with Flye in merged mode. I will let you know the output.

Best regards and thank you so much for your time,
Virginia.

nutrimol commented 1 year ago

Dear Javier,

When I try to run the analysis with the six samples from scratch (with Flye as assembler) I get the same message as before:


Invalid combination of mode and assembler
 (We are sorry for this, the low number of contigs provided by Flye prevents minimus2 needed in merged mode to work correctly
 Please use coassembly, or a different assembler)

samples file:

c40_Id0         c40_Id0.fastq.gz        pair1
c40_Ld3         c40_Ld3.fastq.gz        pair1
c40_Ld11        c40_Ld11.fastq.gz       pair1
l100_MTXbc1     l100_MTXbc1.fastq.gz    pair1   nobinning
l100_MTXbc2     l100_MTXbc2.fastq.gz    pair1   nobinning
l100_MTXbc3     l100_MTXbc3.fastq.gz    pair1   nobinning

That is why I thought merged mode was not compatible with long reads. I did not try Canu because it takes a while to run. Should I?
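For readers who want to sanity-check a samples file like the one pasted above before launching a long run, here is a minimal Python sketch. It is not part of SqueezeMeta. It assumes the columns are sample name, read file, pair label (`pair1` or `pair2`) and optional extra flags such as `nobinning`, and it splits on any whitespace, so file names containing spaces would break it. The names `SampleEntry` and `parse_samples` are made up for this illustration.

```python
from typing import List, NamedTuple, Tuple


class SampleEntry(NamedTuple):
    """One row of a samples file (hypothetical helper type, not a SqueezeMeta API)."""
    sample: str
    filename: str
    pair: str
    options: Tuple[str, ...]


def parse_samples(text: str) -> List[SampleEntry]:
    """Parse whitespace-separated rows: sample, file, pair label, optional flags."""
    entries = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) < 3:
            raise ValueError(
                f"line {lineno}: expected at least 3 columns, got {len(fields)}"
            )
        sample, filename, pair, *options = fields
        # Assumption: only these two pair labels are valid.
        if pair not in ("pair1", "pair2"):
            raise ValueError(
                f"line {lineno}: third column must be pair1 or pair2, got {pair!r}"
            )
        entries.append(SampleEntry(sample, filename, pair, tuple(options)))
    return entries


# The samples file quoted in this thread, with tabs as column separators.
SAMPLES_TEXT = """\
c40_Id0\tc40_Id0.fastq.gz\tpair1
c40_Ld3\tc40_Ld3.fastq.gz\tpair1
c40_Ld11\tc40_Ld11.fastq.gz\tpair1
l100_MTXbc1\tl100_MTXbc1.fastq.gz\tpair1\tnobinning
l100_MTXbc2\tl100_MTXbc2.fastq.gz\tpair1\tnobinning
l100_MTXbc3\tl100_MTXbc3.fastq.gz\tpair1\tnobinning
"""

entries = parse_samples(SAMPLES_TEXT)
no_binning = [e.sample for e in entries if "nobinning" in e.options]
print(len(entries), "entries; flagged nobinning:", ", ".join(no_binning))
```

With this file, the check confirms that all six rows are well formed and that only the three metatranscriptome samples carry the `nobinning` flag.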

jtamames commented 1 year ago

Oh yes, we deactivated the merged+flye combination because Flye returns very few contigs and minimus2 then crashes. You can try Canu, yes. Is there any other assembler you use? I ask because version 1.6 allows plugging in additional assemblers. Best, J
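The restriction described here can be checked before submitting a job. The following Python sketch is a hypothetical wrapper, not SqueezeMeta code. It only encodes the single rejected combination mentioned in this thread (merged mode with Flye) and assembles a command line from the options shown in the usage message (`-m`, `-p`, `-s`, `-f`) plus `-a`. Passing `canu` as the value of `-a` is an assumption based on the suggestion above, and the project name and read directory are placeholders.

```python
import shlex
from typing import List

# Mode/assembler pairs reported as rejected in this thread.
# This is not an exhaustive list of what SqueezeMeta refuses.
REJECTED_COMBINATIONS = {("merged", "flye")}


def build_command(mode: str, project: str, samples_file: str,
                  seq_dir: str, assembler: str) -> List[str]:
    """Return the SqueezeMeta.pl argument list, or raise on a known-bad pair."""
    if (mode, assembler) in REJECTED_COMBINATIONS:
        raise ValueError(
            f"mode {mode!r} with assembler {assembler!r} is rejected; "
            "use coassembly or a different assembler"
        )
    return [
        "SqueezeMeta.pl",
        "-m", mode,
        "-p", project,
        "-s", samples_file,
        "-f", seq_dir,
        "-a", assembler,
    ]


# Placeholder project name and read directory, for illustration only.
command = build_command("merged", "my_project", "samples.samples",
                        "raw_reads", "canu")
print(shlex.join(command))

try:
    build_command("merged", "my_project", "samples.samples",
                  "raw_reads", "flye")
except ValueError as err:
    print("rejected:", err)
```

Building the argument list separately from running it makes it easy to inspect the command first and, if desired, hand it to `subprocess.run` afterwards.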

nutrimol commented 1 year ago

I will try Canu and report back. Thanks! Virginia.

fpusan commented 1 year ago

Closing due to lack of activity, feel free to reopen