dfguan / purge_dups

haplotypic duplication identification tool
MIT License
202 stars, 19 forks

Purge_dups cutoffs on a scaffolded genome #120

Open shailij246 opened 1 year ago

shailij246 commented 1 year ago

Hi, I am using purge_dups on a scaffolded genome, because we realized after scaffolding that the genome had many duplications. First, is this a huge issue, and should the assembly be purged before scaffolding? Second, I got this graph using hist_plot.py, and calcults.log said "mean not different from peak, treat as haploid assembly". What cutoffs would you recommend based on this graph? The default cutoffs used were: 5 3 5 6 10 18. Do they seem OK?

[Image: GRS_hap2_26July'22_PB cov]

xiekunwhy commented 1 year ago

Hi,

You can try kmerDedup (https://github.com/xiekunwhy/kmerDedup) first, then run purge_dups.

Best, Kun