xfengnefx / hifiasm-meta

hifiasm_meta - de novo metagenome assembler, based on hifiasm, a haplotype-resolved de novo assembler for PacBio HiFi reads.
MIT License

Is it necessary to conduct binning after assembly with HiFi reads to get MAGs? #19

Closed ye00ye closed 2 years ago

ye00ye commented 2 years ago

Hello, xfengnefx!

With NGS shotgun reads, we usually assemble paired-end reads into contigs and then recover MAGs through binning.

My question is: for HiFi reads, is binning still necessary after assembling contigs with hifiasm-meta in order to get higher-quality MAGs?

Thanks for your help.

xfengnefx commented 2 years ago

@ye00ye (not sure if you can get notifications on closed issue so I'm pinging)

Binning is still useful, but it looks less efficient than what you get with NGS. For HiFi assemblies, a good bin usually contains only 1-3 long contigs (I tried MetaBAT2 and VAMB). Bins with more contigs are likely to be contaminated; I'm not sure why. I haven't tried the newer binners.
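To make the 1-3 contig rule of thumb concrete, here is a minimal sketch that counts contigs per bin and flags bins above a threshold for closer inspection. It assumes one FASTA file per bin in a single directory, which is the usual binner output layout; the function names, the `.fa` suffix, and the cutoff of 3 are illustrative assumptions, not part of hifiasm-meta or any binner.

```python
from pathlib import Path


def count_contigs(fasta_path):
    """Count FASTA records (lines starting with '>') in one bin file."""
    with open(fasta_path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def flag_bins(bin_dir, max_contigs=3, suffix=".fa"):
    """Map each bin file name to (contig count, needs_review).

    needs_review is True when the bin holds more than max_contigs contigs.
    The cutoff is only the rule of thumb from this thread (assumption),
    not a validated contamination test.
    """
    report = {}
    for path in sorted(Path(bin_dir).glob(f"*{suffix}")):
        n_contigs = count_contigs(path)
        report[path.name] = (n_contigs, n_contigs > max_contigs)
    return report
```

Calling `flag_bins("bins/")` on a hypothetical `bins/` directory returns a dictionary you can print or filter. A flagged bin is only a candidate for contamination and still needs a proper quality check.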

The current HEAD has a rescue heuristic based on contig graph topology. Check out the prefix.rescue.fa file if you ran it. Each FASTA entry in this file is guessed to be a circular(ized) MAG. This is error-prone, and you definitely need to run CheckM on these rescues.
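Since each entry in the rescue file is treated as its own candidate MAG, one way to prepare it for CheckM is to write every entry to a separate FASTA file, so that each file acts as one bin. The sketch below does that; the function name, output layout, and file naming are assumptions for illustration, not something hifiasm-meta provides.

```python
from pathlib import Path


def split_fasta(fasta_path, out_dir):
    """Write each FASTA record to its own file in out_dir.

    Files are named after the first word of the header (assumption:
    those names are unique). Returns the written paths in input order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    out = None
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                if out is not None:
                    out.close()
                fields = line[1:].split()
                # Fall back to a numbered name if the header is empty.
                name = fields[0] if fields else f"entry{len(written) + 1}"
                target = out_dir / f"{name.replace('/', '_')}.fa"
                out = open(target, "w")
                written.append(target)
            if out is not None:
                out.write(line)  # header and sequence lines, unchanged
    if out is not None:
        out.close()
    return written
```

After something like `split_fasta("prefix.rescue.fa", "rescue_bins")`, the resulting directory can be passed to CheckM, for example with `checkm lineage_wf -x fa rescue_bins checkm_out` (directory names here are placeholders).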

We are preparing a manuscript on this and subsequent evaluation methods. The preprint probably will come out a couple weeks from now. I'll push some changes by then, too.

ye00ye commented 2 years ago

Thanks for your help. So binning is still necessary for long HiFi contigs at the current stage.

ye--ye
