Open thackl opened 1 month ago
When parsing gbk files for sequence length, the "LOCUS" line is split not just on whitespace but on any non-alphanumeric character. For example,
"LOCUS scaffold_20 50000 bp ..."
gives seq_id="scaffold", length=20 instead of seq_id="scaffold_20", length=50000.
https://github.com/thackl/gggenomes/blob/976bb831975b505964086a19dd5371163abec991/R/read_seqs.R#L94
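
A minimal base-R sketch of the behavior described above. It is not the code at the linked line; the `locus` string is a made-up example modeled on the one in this issue, and both regular expressions are assumptions chosen only to reproduce the symptom and to illustrate one possible whitespace-only split.

```r
# Hypothetical LOCUS line, modeled on the example in this issue
locus <- "LOCUS       scaffold_20            50000 bp    DNA     linear   UNK"

# Splitting on runs of non-alphanumeric characters also breaks at "_",
# so the id is cut in two and its suffix lands in the length position
bad <- strsplit(locus, "[^[:alnum:]]+")[[1]]
bad[2]  # "scaffold"
bad[3]  # "20"

# Splitting on runs of whitespace only keeps the id in one piece
ok <- strsplit(locus, "[[:space:]]+")[[1]]
seq_id <- ok[2]              # "scaffold_20"
length <- as.integer(ok[3])  # 50000
```

Whitespace splitting assumes that the sequence id itself contains no whitespace and that it is separated from the length by at least one space.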