jtamames / SqueezeMeta

A complete pipeline for metagenomic analysis
GNU General Public License v3.0

metaFlye #855

Closed athavars closed 1 week ago

athavars commented 1 week ago

Dear Fernando and Javier, we are using SqueezeMeta for metagenomic analysis of nanopore sequences, and we co-assemble the samples with Flye. We have seen that various researchers are using metaFlye, considering that it works better than Flye.

Is metaFlye an option for SqueezeMeta? Thank you in advance for your answer. Athanasía

fpusan commented 1 week ago

Dear Anastasia, the -a flye option in SqueezeMeta will already use metaFlye (by adding the --meta flag when calling flye). Hope this helps, Fernando

athavars commented 1 week ago

Dear Fernando, thank you very much for the quick answer. So the command for running SqueezeMeta would be:

SqueezeMeta.pl -m (mode) -p (project) -s (file) -f (input dir) -a flye --meta

Is that right? Two more questions have just occurred to me. I can see the --euk option. We did a run with our samples (fecal samples from dairy cows) without that option, and we were surprised that only 0.2% of the classification corresponds to eukaryotes. Could it be because we didn't use the --euk option? We did have a lot of unclassified sequences. Could this be a possible reason? I think we should have as much eukaryotic DNA classified as possible so that we can remove it from the analysis, as it could affect the statistical analysis.

And the last question. Bowtie is used by default. Would it be better if we use minimap2-ont, as we have nanopore sequences?

In that case the command would be: SqueezeMeta.pl -m (mode) -p (project) -s (file) -f (input dir) -a flye --meta -map minimap2-ont

Am I right?

Sorry for the cascade of questions. Best regards, Athanasía

fpusan commented 1 week ago

No need to add --meta; it is included automatically when you call SQM with -a flye.

athavars commented 1 week ago

Ok, thank you!

What about the --euk and the minimap2-ont options? I am copying the message above, as I edited it a couple of times and you might have missed it:

"Two more questions have just occurred to me. I can see the --euk option. We did a run with our samples (fecal samples from dairy cows) without that option, and we were surprised that only 0.2% of the classification corresponds to eukaryotes. Could it be because we didn't use the --euk option? We did have a lot of unclassified sequences. Could this be a possible reason? I think we should have as much eukaryotic DNA classified as possible so that we can remove it from the analysis, as it could affect the statistical analysis.

And the last question. Bowtie is used by default. Would it be better if we use minimap2-ont, as we have nanopore sequences?

In that case the command would be: SqueezeMeta.pl -m (mode) -p (project) -s (file) -f (input dir) -a flye -map minimap2-ont

Am I right?

Sorry for the cascade of questions. Best regards, Athanasía"

Thanks again!

fpusan commented 1 week ago

Yes, you should definitely use the minimap2-ont option.
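For reference, here is a minimal sketch of how the full call discussed in this thread could be assembled. The flags (-m, -p, -s, -f, -a flye, -map minimap2-ont) are the ones mentioned above; the mode, project name, samples file, and input directory are hypothetical placeholders. The script only prints the command, so it can be checked before running it.

```shell
#!/bin/sh
# Hypothetical placeholder values: replace them with your own.
MODE="coassembly"        # assumed mode for a co-assembly run
PROJECT="my_project"     # hypothetical project name
SAMPLES="samples.txt"    # hypothetical samples file
INPUT_DIR="raw_reads"    # hypothetical input directory

# Build the argument list. No --meta is passed: per this thread,
# -a flye already makes SqueezeMeta call flye with --meta.
set -- SqueezeMeta.pl -m "$MODE" -p "$PROJECT" -s "$SAMPLES" -f "$INPUT_DIR" \
       -a flye -map minimap2-ont

# Print the command for inspection instead of executing it.
CMD_LINE="$*"
echo "$CMD_LINE"

# To actually launch the run, uncomment the next line:
# "$@"
```

Printing first makes it easy to confirm that --meta is absent and that the mapper has been switched to minimap2-ont.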

Regarding the --euk option, it will not make a difference if you analyze the data using SQMtools (the default behaviour there emulates what you get with the --euk flag), but it will change the taxonomic annotation of the tables produced directly by SqueezeMeta (that is, the files under the intermediate and results directories whose names start with a number). So if you are using those files directly, you may see a difference.

Another option would be to use the -D flag, which adds an extra ORF prediction step (in addition to prodigal, which is trained for prokaryotic sequences). In our experience it increases the percentage of annotation, but not by much. In general, we've found that samples with a high proportion of eukaryotes are challenging; in some cases, using sqm_longreads.pl has given us a better picture of the community (since it is free of assembly biases).
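As a rough sketch, the two optional flags mentioned here (--euk and -D) could be toggled like this before appending them to the SqueezeMeta.pl call. The variable names are hypothetical and only serve the illustration; the snippet prints the options and runs nothing.

```shell
#!/bin/sh
# Hypothetical toggles: set each to "yes" or "no".
USE_EUK="yes"             # add --euk (affects the tables written directly by SqueezeMeta)
USE_EXTRA_ORF_STEP="yes"  # add -D (extra ORF prediction step in addition to prodigal)

OPTS=""
if [ "$USE_EUK" = "yes" ]; then
    OPTS="$OPTS --euk"
fi
if [ "$USE_EXTRA_ORF_STEP" = "yes" ]; then
    OPTS="$OPTS -D"
fi
OPTS="${OPTS# }"   # strip the leading space, if any

# Show what would be appended to the SqueezeMeta.pl command line.
echo "Extra options: $OPTS"
```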

athavars commented 1 week ago

Thank you very much!

That is all for now! I am closing the issue. Thank you