Open lskatz opened 4 years ago
This format is used by Sequin for submitting sequences to GenBank, and it has more recently turned up in NCBI's VADR package as well.
I think there is a Bio::FeatureIO::table
but I'm not sure whether that was developed for this particular NCBI format.
Sorry, was mistaken. We do have a Bio::SeqIO::table
but that doesn't mention anything about NCBI's table format. That said, it's possible you could look at the structure of that one to build from.
I would also look at funannotate (https://github.com/nextgenusfs/funannotate), developed by Jon Palmer (@nextgenusfs). It is Python-based, but it has some parsing of these tables to truncate and clean them up when we need to remove contigs or filter out regions that overlap contamination.
Hi, I was wondering if there was any way to parse the NCBI Sequin tbl format? It is defined here: https://www.ncbi.nlm.nih.gov/projects/Sequin/table.html
I don't think I see any parser for it, but I wanted to be sure before writing my own. Thank you!
And the example starts like this.
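
For anyone who does end up writing their own, here is a rough Python sketch of reading this layout. It is not BioPerl code and not how funannotate does it. It assumes the structure described on the NCBI page linked above: a `>Feature SeqId` header line, tab-separated `start stop feature_key` lines, extra `start stop` lines for additional intervals, and qualifier lines indented by three tabs. The names `parse_feature_table` and `_parse_pos` and the sample table are made up for illustration (the sample is not the example from the NCBI page), and things like `[offset=...]` lines are not handled.

```python
def _parse_pos(token):
    """Split a coordinate such as '<1' or '>1050' into (position, is_partial)."""
    partial = token[:1] in ("<", ">")
    return int(token.lstrip("<>")), partial


def parse_feature_table(text):
    """Parse feature-table text into a list of records, one per >Feature block.

    Hypothetical helper: the dict layout returned here is an assumption,
    not something defined by NCBI or BioPerl.
    """
    records = []
    feature = None  # the feature currently being filled in
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith(">Feature"):
            fields = line.split()
            seqid = fields[1] if len(fields) > 1 else None
            records.append({"seqid": seqid, "features": []})
            feature = None
            continue
        if not records:
            raise ValueError("data line found before any >Feature header")
        cols = line.rstrip().split("\t")
        if cols[0] == "":
            # Qualifier line: three empty columns, then key and optional value.
            if feature is None or len(cols) < 4:
                raise ValueError("unexpected qualifier line: %r" % line)
            value = cols[4] if len(cols) > 4 else ""
            feature["qualifiers"].append((cols[3], value))
            continue
        if len(cols) < 2:
            raise ValueError("expected start and stop columns: %r" % line)
        start, start_partial = _parse_pos(cols[0])
        stop, stop_partial = _parse_pos(cols[1])
        interval = {
            "start": start,
            "stop": stop,
            "start_partial": start_partial,
            "stop_partial": stop_partial,
        }
        if len(cols) >= 3 and cols[2]:
            # A feature key in column 3 starts a new feature.
            feature = {"key": cols[2], "intervals": [interval], "qualifiers": []}
            records[-1]["features"].append(feature)
        else:
            # No key: an additional interval of the current feature.
            if feature is None:
                raise ValueError("interval line with no open feature: %r" % line)
            feature["intervals"].append(interval)
    return records


# Made-up sample table, for illustration only.
SAMPLE = (
    ">Feature example_seq\n"
    "<1\t1050\tgene\n"
    "\t\t\tgene\tgeneA\n"
    "<1\t600\tCDS\n"
    "700\t1050\n"
    "\t\t\tproduct\thypothetical protein\n"
)

records = parse_feature_table(SAMPLE)
for feat in records[0]["features"]:
    print(feat["key"], feat["intervals"], feat["qualifiers"])
```

Keeping qualifiers as a list of pairs rather than a dict is deliberate in this sketch, since the same qualifier key can appear more than once on a feature.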