crosenth / medirect

multiprocessed ncbi edirect
GNU General Public License v3.0

ftract search for features beyond line 2 #2

Open dhoogest opened 6 years ago

dhoogest commented 6 years ago

This might not be an issue for feature tables from GenBank, but for other sources (for example, Prokka), a single feature entry may span more than two lines. For example, in the following table it would be useful for the rrna:product:16s search to return 3340-4882.

64422   64003   CDS
                        EC_number       3.1.-.-
                        gene    yrrK
                        inference       ab initio prediction:Prodigal:2.6
                        inference       similar to AA sequence:UniProtKB:O34634
                        locus_tag       PKACPDCM_00479
                        product Putative pre-16S rRNA nuclease
64692   64426   CDS
                        inference       ab initio prediction:Prodigal:2.6
                        locus_tag       PKACPDCM_00480
                        product hypothetical protein
--
3288    3214    tRNA
                        inference       COORDINATES:profile:Aragorn:1.2
                        locus_tag       PKACPDCM_01974
                        product tRNA-Ala(tgc)
4882    3340    rRNA
                        locus_tag       PKACPDCM_01975
                        product 16S ribosomal RNA
                        score   0

crosenth commented 6 years ago

What is the accession.version for this ft?

dhoogest commented 6 years ago

This was from an annotation file generated by Prokka from an in-house assembly (not an NCBI record). I solved it in a local pipeline script, so it is not necessary to build this into ftract unless you want to generalize it for use beyond NCBI.
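
For illustration, a workaround along these lines could look like the minimal Python sketch below. It is not the actual pipeline script and not ftract's implementation. The function names `parse_features` and `search` are hypothetical, and the meaning of the `feature:qualifier:value` pattern (case-insensitive, substring match on the qualifier value) is an assumption based on the example above. The sketch also assumes that qualifier lines start with whitespace and that feature lines have three whitespace-separated fields.

```python
def parse_features(lines):
    """Yield (start, stop, feature_key, qualifiers) for each feature entry.

    Qualifier lines (those starting with whitespace) are attached to the most
    recent feature line, however many of them follow it.
    """
    current = None
    for line in lines:
        # Skip blank lines, '>Feature' headers and grep-style '--' separators.
        if not line.strip() or line.startswith((">", "--")):
            continue
        if line[0].isspace():
            if current is not None:
                parts = line.split(None, 1)
                value = parts[1].strip() if len(parts) > 1 else ""
                current[3].append((parts[0], value))
            continue
        fields = line.split()
        if len(fields) >= 3:
            if current is not None:
                yield current
            # '<' and '>' mark partial coordinates in feature tables.
            start = int(fields[0].lstrip("<>"))
            stop = int(fields[1].lstrip("<>"))
            current = (start, stop, fields[2], [])
        # Two-field lines (extra intervals) are ignored in this sketch.
    if current is not None:
        yield current


def search(lines, pattern):
    """Return sorted (low, high) coordinates of features matching the pattern."""
    feature, qualifier, value = (p.lower() for p in pattern.split(":", 2))
    hits = []
    for start, stop, key, quals in parse_features(lines):
        if key.lower() != feature:
            continue
        if any(k.lower() == qualifier and value in v.lower() for k, v in quals):
            hits.append((min(start, stop), max(start, stop)))
    return hits


# Excerpt of the table from the first comment, with tabs written explicitly.
SAMPLE = [
    "64422\t64003\tCDS",
    "\t\t\tlocus_tag\tPKACPDCM_00479",
    "\t\t\tproduct\tPutative pre-16S rRNA nuclease",
    "--",
    "3288\t3214\ttRNA",
    "\t\t\tproduct\ttRNA-Ala(tgc)",
    "4882\t3340\trRNA",
    "\t\t\tlocus_tag\tPKACPDCM_01975",
    "\t\t\tproduct\t16S ribosomal RNA",
    "\t\t\tscore\t0",
]

print(search(SAMPLE, "rrna:product:16s"))  # [(3340, 4882)]
```

Because the feature key must match as well, the CDS whose product mentions "pre-16S rRNA" is not returned; only the rRNA entry is.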

crosenth commented 6 years ago

Are you describing the .tbl Feature Table as described in Table 2 here: https://academic.oup.com/bioinformatics/article-lookup/doi/10.1093/bioinformatics/btu153

If so, I would say ftract should be able to parse it. Can you send me the full Prokka Feature Table file so I can take a closer look?

dhoogest commented 6 years ago

Sure - see for example /molmicro/working/dhoogest/src/genome_id/171114_typestrains/385_24/prokka/385_24.tbl

crosenth commented 6 years ago

Okay, please install 0.5.0 and let me know.

crosenth commented 6 years ago

Okay, now try 0.6.0 and let me know.