marbl / verkko

Telomere-to-telomere assembly of accurate long reads (PacBio HiFi, Oxford Nanopore Duplex, HERRO corrected Oxford Nanopore Simplex) and Oxford Nanopore ultra-long reads.

Using phasing blocks generated by hifiasm+Hi-C #136

Closed ritatam closed 1 year ago

ritatam commented 1 year ago

Hello developers,

I've generated verkko assemblies for a diploid fungus using nanopore duplex as --hifi and ultralong simplex as --nano. I want to integrate Hi-C data, so I generated haplotype-phased assemblies using hifiasm + Hi-C as suggested by this thread.

In the manual you wrote "make sure the phase blocks are chromosome-scale and consistent within each chromosome". Does that mean I need to scaffold these phased hifiasm haplotigs (with other Hi-C scaffolders?), extract haplotype-specific k-mers, and then feed them to verkko? Could you please give me some suggestions? I'm new to genome assembly, so I'm not sure how to proceed from here.

Thank you!

Rita

skoren commented 1 year ago

That documentation is a bit outdated; we've been working on natively mapping Hi-C data to the verkko graph. We haven't integrated it into the pipeline yet, but you can try an early version here: https://github.com/skoren/verkkohic. Run the gfase_wrapper.sh script, which also requires GFAse: https://github.com/rlorigro/GFAse. To run it, you need a folder structure like this:

  unphased_verkkoasm
  hifi/<all fasta/fastq inputs for hifi>
  ont/<all fasta/fastq inputs for ont>
  hic/<all fasta/fastq inputs for hic>

and you run it as:

export VERKKO=<path to verkko-v1.3.1/>
export GFASE=<path to gfase>
bash gfase_wrapper.sh unphased_verkkoasm gfase_verkkoasm "$(pwd)"
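
Before launching the wrapper, it may help to confirm that the working directory matches the layout described above. The following is only a minimal sketch: the function name check_layout and its messages are hypothetical and are not part of verkkohic or GFAse. It assumes the assembly folder is named unphased_verkkoasm, as in the listing, and that hifi/, ont/ and hic/ should each hold at least one input file.

```shell
# Hypothetical helper (not part of verkkohic): verify the folder layout
# from the thread before running gfase_wrapper.sh.
# Usage: check_layout [base_dir] [assembly_dir_name]
check_layout() {
    layout_base="${1:-.}"
    layout_asm="${2:-unphased_verkkoasm}"   # folder name taken from the listing above
    layout_bad=0

    # All four folders must exist under the base directory.
    for d in "$layout_asm" hifi ont hic; do
        if [ ! -d "$layout_base/$d" ]; then
            echo "missing directory: $layout_base/$d" >&2
            layout_bad=1
        fi
    done

    # The read folders should not be empty (assumption: one or more inputs each).
    for d in hifi ont hic; do
        if [ -d "$layout_base/$d" ] && [ -z "$(ls -A "$layout_base/$d")" ]; then
            echo "empty directory: $layout_base/$d" >&2
            layout_bad=1
        fi
    done

    return "$layout_bad"
}
```

Chaining the check in front of the wrapper command with && runs the wrapper only when the check passes. It does not validate the read files themselves.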

One caveat: we've done limited testing outside of human/primate genomes, so you may encounter issues, which you can report on the verkkohic GitHub page. We hope to have an integrated version of Hi-C in the verkko pipeline within a month.

ritatam commented 1 year ago

@skoren Hi Sergey, many thanks for the pointers! I've given it a try. From a brief look at the results, each haplotype seems to be much smaller than expected. I'll look into this further and report any issues on the verkkohic page, as you suggested.

Thank you! Rita

skoren commented 1 year ago

OK, I'll close this issue here then.