metagentools / MetaCoAG

🚦🧬 Binning Metagenomic Contigs via Composition, Coverage and Assembly Graphs
https://metacoag.readthedocs.io/en/stable/
GNU General Public License v3.0

Question about using MEGAHIT assembly #15

Closed: jarrodscott closed this issue 2 years ago

jarrodscott commented 2 years ago

Hi there!

I have a question about using MEGAHIT assemblies with MetaCoAG. The MetaCoAG README gives the following example:

megahit -1 Reads_1.fastq -2 Reads_2.fastq --k-min 21 --k-max 77 -o /path/output_folder -t 8

I am using --presets meta-sensitive, which sets the parameters to --min-count 1 --k-list 21,29,39,49,...,129,141. The instructions then say to run the following on a fastg file:

fastg2gfa final.fastg > final.gfa

So I need a fastg file from the MEGAHIT assembly. As I understand it, I get one by running megahit_core contig2fastg. However, the input for this command is not the final assembly but one of the intermediate k-mer contig files in the intermediate_contigs/ directory. The help text for this command says: "contig2fastg convert MEGAHIT's k*.contigs.fa to fastg format".

For example, you run it like so: megahit_core contig2fastg 141 k141.contigs.fa > k141.fastg

My question is: to run MetaCoAG, does the gfa file (generated from a fastg file) need to come from the final assembly file? If so, how can I generate a fastg file from the final assembly file?

Thanks!!

jarrodscott commented 2 years ago

OK, I think I answered my own question :) The final k*.contigs.fa file is the assembly, so I can generate a fastg file from it, convert that to .gfa, and then run MetaCoAG with the final assembly file. Sorry for the confusion, but I think I understand now.

Vini2 commented 2 years ago

Hello @jarrodscott,

I'm very sorry for the late reply. The MEGAHIT assembly should produce the final contig file, final.contigs.fa. You can use this file to get the assembly graph with the following command:

megahit_core contig2fastg 141 final.contigs.fa > final.fastg

MetaCoAG can run on the assembly from any k value, as long as the assembly graph is generated from the contig file for that k value. However, the quality of the assembly may affect the binning results.
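
As a rough sketch of the point above, the two conversion commands from this thread can be built for either an intermediate k*.contigs.fa file or final.contigs.fa. The helper names kmer_from_name and graph_commands are hypothetical and not part of MEGAHIT or MetaCoAG. The script only prints the commands rather than running them, and it assumes intermediate files follow the k<K>.contigs.fa naming shown earlier in the thread.

```shell
#!/bin/sh
# Hypothetical helpers: they only print the commands quoted in this thread.

# Print the k value embedded in a name such as k141.contigs.fa (prints 141).
# Fails for names like final.contigs.fa that carry no k value.
kmer_from_name() {
    name=$(basename "$1" .contigs.fa)    # k141.contigs.fa -> k141
    k=${name#k}                          # k141 -> 141
    [ "$k" != "$name" ] || return 1      # name did not start with "k"
    case "$k" in
        ''|*[!0-9]*) return 1 ;;         # remainder is not a plain number
    esac
    printf '%s\n' "$k"
}

# Print the contig2fastg and fastg2gfa commands for a contig file.
# Usage: graph_commands CONTIGS [K]
# K is required when it cannot be read from the file name.
graph_commands() {
    contigs=$1
    if [ -n "${2:-}" ]; then
        k=$2
    else
        k=$(kmer_from_name "$contigs") || {
            echo "pass k explicitly for $contigs" >&2
            return 1
        }
    fi
    prefix=${contigs%.contigs.fa}        # final.contigs.fa -> final
    printf 'megahit_core contig2fastg %s %s > %s.fastg\n' "$k" "$contigs" "$prefix"
    printf 'fastg2gfa %s.fastg > %s.gfa\n' "$prefix" "$prefix"
}

# Intermediate assembly: k is taken from the file name.
graph_commands k141.contigs.fa
# Final assembly: k is passed explicitly (141, as in the reply above).
graph_commands final.contigs.fa 141
```

Printing the commands first makes it easy to check that the k value and the file names agree before running anything. The resulting .gfa file and the contig file it was built from are then the pair to give to MetaCoAG.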

Hope this helps. Let me know if you have any further questions. Your input and suggestions are highly appreciated.

Best regards, Vijini