mikolmogorov / Flye

De novo assembler for single molecule sequencing reads using repeat graphs

Issue in assembly CRISPR insert #531

Closed yzhang-github-pub closed 2 years ago

yzhang-github-pub commented 2 years ago

Dear Author,

We use ONT amplicon data to detect targeted insertion by CRISPR. We expect at least 2 alleles, WT (no insert) and Insert:

WT: upstream flank of the Cas9 cut site (1 kb) + downstream flank (1 kb), total 2 kb amplicon

Insert: upstream flank + insert (10 kb) + downstream flank, total 12 kb amplicon

The nanopore reads we obtained have a WT:Insert ratio of about 5:1 (due to PCR bias towards the shorter WT amplicon). However, only one contig (~12 kb), supporting the Insert allele, is produced. There is no contig for WT, even though more reads support WT. I used the default settings for Flye and also tried different values for --genome-size (including 2000 and 12000).

I tried hapdup without any success.

Can you advise what parameters of flye I can try? Thanks!

mikolmogorov commented 2 years ago

Hello,

Hapdup likely fails to phase the reads, since it expects variants at ~50% frequency. And Flye can only produce a haploid assembly.

You can try running with --meta --keep-haplotypes, which should retain the insertion sequence in the assembly graph. But I don't see an easy way to automatically assemble both alleles from this data.
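
For readers who want to try this suggestion, here is a minimal sketch of how such a run might be launched from Python. Only --meta, --keep-haplotypes and --genome-size come from this thread. The input file name, output directory, thread count and the choice of --nano-raw as the read-type option are assumptions for illustration, and the helper build_flye_command is hypothetical. The script only builds and prints the command, it does not run Flye.

```python
import shlex


def build_flye_command(reads, out_dir, threads=4, read_type="--nano-raw"):
    """Build a Flye argument list with --meta and --keep-haplotypes enabled.

    `reads`, `out_dir`, `threads` and `read_type` are illustrative
    assumptions; adjust them to the actual data and Flye version.
    """
    return [
        "flye",
        read_type, reads,          # ONT reads (assumed raw, hence --nano-raw)
        "--out-dir", out_dir,      # directory where Flye writes its results
        "--threads", str(threads),
        "--meta",                  # tolerate uneven coverage (e.g. 5:1 allele ratio)
        "--keep-haplotypes",       # do not collapse alternative alleles in the graph
    ]


# Hypothetical file and directory names, for illustration only.
cmd = build_flye_command("amplicon_reads.fastq", "flye_meta_out")
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a machine where Flye is installed. The assembly graph that Flye writes to the output directory (assembly_graph.gfa) can then be inspected for the insertion, keeping in mind that this does not by itself yield separate WT and Insert contigs.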

mikolmogorov commented 2 years ago

Marking as resolved due to inactivity.