ropensci / phylotaR

An automated pipeline for retrieving orthologous DNA sequences from GenBank in R
https://docs.ropensci.org/phylotaR

Description - not enough information #56

Open Bunholi opened 3 years ago

Bunholi commented 3 years ago

Hi,

I have been having trouble identifying the gene/marker that my clusters represent, because the description and feature variables do not carry much information. We know that GenBank is a mess: people put a lot of information in the description, and unfortunately the 2 most common words in my phylotaR description are usually not the information that I need.

Here is an example:

[Screenshot: Screen Shot 2021-05-20 at 11 49 00 AM]

The description just shows gene (0.1), mitochondrial (0.1), which is not the information that I need. I took one of the "seed" sequences and looked it up on GenBank to see which gene the cluster refers to, and found that it is the NADH gene. Without looking it up, I would not have been able to recognize it: "Acipenser persicus persicus isolate AP1 NADH dehydrogenase subunit 5 gene, partial cds; mitochondrial gene for mitochondrial product".

So I would like to know whether it is possible to include more words in this "Description" variable.
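To illustrate the kind of output I have in mind, here is a minimal sketch in Python. It is not phylotaR's implementation: the function `word_frequencies` and its parameters (`min_freq`, `min_nchar`, `top_n`) are hypothetical, and the second and third definition lines are made up for the example (only the first is the definition quoted above). It counts the fraction of definition lines containing each word and returns more than the top 2.

```python
import re
from collections import Counter


def word_frequencies(definitions, min_freq=0.1, min_nchar=4, top_n=5):
    """Return up to top_n (word, frequency) pairs, where frequency is the
    fraction of definition lines that contain the word."""
    counts = Counter()
    for line in definitions:
        # Lowercase, split on non-alphanumerics, count each word once per line
        words = set(re.split(r"[^a-z0-9]+", line.lower()))
        counts.update(w for w in words if len(w) >= min_nchar)
    n = len(definitions)
    freqs = [(w, c / n) for w, c in counts.items() if c / n >= min_freq]
    # Sort by descending frequency, then alphabetically for a stable order
    freqs.sort(key=lambda wf: (-wf[1], wf[0]))
    return freqs[:top_n]


# Hypothetical input: only the first line is the real definition quoted above
definitions = [
    "Acipenser persicus persicus isolate AP1 NADH dehydrogenase subunit 5 gene, "
    "partial cds; mitochondrial gene for mitochondrial product",
    "Acipenser sp. NADH dehydrogenase subunit 5 gene, partial cds; mitochondrial",
    "Huso huso NADH dehydrogenase subunit 5 gene, partial cds; mitochondrial",
]

top_words = word_frequencies(definitions, top_n=6)
print(top_words)
```

With `top_n=2` this toy input would not show "nadh" at all, but with `top_n=6` both "nadh" and "dehydrogenase" appear, which is the information I need to recognize the marker.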

Thank you,

Ingrid Bunholi