xingjianleng / DBGA

The repository for the genome sequence alignment research project
BSD 3-Clause "New" or "Revised" License

Analyse the coronavirus sequence dataset to determine the appropriate k size for the de Bruijn graph #6

Closed xingjianleng closed 1 year ago

xingjianleng commented 2 years ago

Check example sequences in the coronavirus dataset and determine the k value for the de Bruijn graph at which: 1) no duplicate k-mers exist; 2) 1% of total nodes are duplicates; 3) 5% of total k-mers are duplicates.
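
A minimal sketch of how these three thresholds could be checked for a single sequence. It is not code from the repository. The function names `duplicate_fraction` and `smallest_k` are hypothetical. The definition of "duplicate" used here is also an assumption: the share of k-mer occurrences that repeat a k-mer already seen earlier in the same sequence.

```python
def duplicate_fraction(seq: str, k: int) -> float:
    """Fraction of k-mer occurrences in seq that repeat an earlier k-mer.

    Assumed definition: (total k-mers - distinct k-mers) / total k-mers.
    """
    total = len(seq) - k + 1
    if k < 1 or total <= 0:
        raise ValueError("k must be between 1 and len(seq)")
    distinct = len({seq[i:i + k] for i in range(total)})
    return (total - distinct) / total


def smallest_k(seq: str, max_fraction: float) -> int:
    """Smallest k whose duplicate fraction is at most max_fraction.

    Scans k upward from 1. At k == len(seq) there is a single k-mer,
    so the fraction is 0 and the loop always returns.
    """
    for k in range(1, len(seq) + 1):
        if duplicate_fraction(seq, k) <= max_fraction:
            return k
    raise ValueError("seq must be non-empty")


if __name__ == "__main__":
    example = "ACGTACGT"  # toy sequence, not taken from the dataset
    for threshold in (0.0, 0.01, 0.05):
        print(threshold, smallest_k(example, threshold))
```

Under this assumed definition, calling `smallest_k` with `0.0`, `0.01` and `0.05` corresponds to criteria 1, 2 and 3 respectively. Running it over several example sequences and taking the largest result would give a k that satisfies the chosen criterion for all of them.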