ekg / seqwish

alignment to variation graph inducer
MIT License

checked duplicate ids in FASTA or FASTQ efficiently #59

AndreaGuarracino closed 4 years ago

AndreaGuarracino commented 4 years ago

The sequence names are read one at a time from the temporary file that stores them (before it is deleted). For each name, the check counts how many times the pattern (the name of the sequence) appears in the CSAWT data structure; a count greater than one means the ID is duplicated.
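
To illustrate the idea, here is a minimal Python sketch, not the code from this PR. It assumes the names file holds one name per line and that names are joined with a newline delimiter, so that a query for `seq1` cannot match inside `seq10`. The naive `count_occurrences` function stands in for the count query on the compressed suffix array; `find_duplicate_names`, `count_occurrences`, and `DELIM` are hypothetical names chosen for this example.

```python
DELIM = "\n"  # assumed separator between names; not taken from seqwish


def count_occurrences(text, pattern):
    """Naive stand-in for a suffix-array count query.

    Counts overlapping matches, because consecutive names share a delimiter.
    """
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def find_duplicate_names(names_path):
    """Return the names that occur more than once, in first-seen order."""
    # Build the searchable text once: DELIM name1 DELIM name2 DELIM ...
    with open(names_path) as handle:
        text = DELIM + DELIM.join(line.rstrip("\n") for line in handle) + DELIM

    duplicates = []
    # Read the names back one at a time and query each as a delimited pattern.
    with open(names_path) as handle:
        for line in handle:
            name = line.rstrip("\n")
            pattern = DELIM + name + DELIM
            if count_occurrences(text, pattern) > 1 and name not in duplicates:
                duplicates.append(name)
    return duplicates
```

Wrapping each name in delimiters is the important design choice here: without them, a short ID that is a prefix or substring of a longer one would be reported as a duplicate.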