Institution/organization info via rentrez

ropensci / rentrez

talk with NCBI entrez using R

https://docs.ropensci.org/rentrez

Other


Institution/organization info via rentrez #189

Open trilisser opened 1 year ago

trilisser commented 1 year ago

Good day! A GenBank info page contains data on the organization that did the sequencing. Can this be extracted via rentrez without downloading an entire gb file with db = "nuccore", rettype = "gb"?

Best

allenbaron commented 1 year ago

Not sure I understand your request. Are you just asking if you can get the journal reference for a particular record? I don't think that's possible but you can get the whole record with entrez_fetch().

To get the record you provided, as an example, you would execute x <- rentrez::entrez_fetch(db = "nuccore", id = "MZ413793", rettype = "text"), which returns only ~ 3 KB of data. You can choose whatever return type is most convenient to you and then extract the information of interest. Try ?entrez_fetch for more info.
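As a rough sketch of the "fetch, then extract" step: in a GenBank flat file, the submitting institution normally sits on the JOURNAL line of the "Direct Submission" reference, so it can be pulled out of the fetched text with a small parser. The helper names `extract_submitter()` and `fetch_submitter()`, the choice of rettype = "gb" with retmode = "text", and the sample record below are assumptions made for illustration, not part of rentrez or of the record discussed above.

```r
# Hypothetical helper: return the "Submitted (...)" block of a GenBank
# flat-file record, or NA if the record has no such line.
extract_submitter <- function(gb_text) {
  lines <- strsplit(gb_text, "\r?\n")[[1]]
  start <- grep("^\\s+JOURNAL\\s+Submitted \\(", lines)
  if (length(start) == 0) return(NA_character_)

  i <- start[1]
  block <- sub("^\\s+JOURNAL\\s+", "", lines[i])

  # Continuation lines of the same field are deeply indented.
  j <- i + 1
  while (j <= length(lines) && grepl("^ {10,}[^ ]", lines[j])) {
    block <- paste(block, trimws(lines[j]))
    j <- j + 1
  }
  block
}

# Hypothetical wrapper around the network call (defined, not run here).
# Assumes rettype = "gb" and retmode = "text" return the flat file.
fetch_submitter <- function(id) {
  gb <- rentrez::entrez_fetch(db = "nuccore", id = id,
                              rettype = "gb", retmode = "text")
  extract_submitter(gb)
}

# Made-up record used only to exercise the parser offline.
sample_gb <- paste(
  "LOCUS       XX000000    100 bp    DNA     linear   VRL 01-JAN-2020",
  "REFERENCE   1  (bases 1 to 100)",
  "  AUTHORS   Doe,J.",
  "  TITLE     Direct Submission",
  "  JOURNAL   Submitted (01-JAN-2020) Example Institute, 1 Example Street,",
  "            Example City, Exampleland",
  "COMMENT     hypothetical record",
  "//",
  sep = "\n"
)

extract_submitter(sample_gb)
```

Something like fetch_submitter("MZ413793") would then return a single string for that accession; this still downloads the whole record, it only automates the extraction.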

trilisser commented 1 year ago

Sorry for the late response. I want to get the information that I highlighted with a red square without downloading an entire record (for example via entrez_summary(), which allows me to get the collection date, strain name, etc. without downloading a whole record), because fetching takes time in the case of hundreds of records. Also, as far as I understand, the highlighted area is not a journal reference; it is information on the institute that provided the genome sequence.

allenbaron commented 1 year ago

If the data you want is not in entrez_summary(), the only option I'm aware of is entrez_fetch(), and I do not think a specific field can be specified in that call, similar to what can be done with entrez_link(). Maybe a different approach can be found by searching through some examples in the book: https://www.ncbi.nlm.nih.gov/books/NBK179288/.
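If entrez_fetch() is the only route, one way to reduce the cost for hundreds of records may be to request many IDs per call instead of one call per record, then split the returned text into individual records. The sketch below assumes that entrez_fetch() is given a vector of IDs and that flat-file records are terminated by a line containing only "//". The helper names and the chunk size of 100 are hypothetical, and whether this is fast enough in practice has not been measured here.

```r
# Hypothetical helper: split a vector of IDs into chunks of at most
# `chunk_size` elements.
chunk_ids <- function(ids, chunk_size = 100) {
  unname(split(ids, ceiling(seq_along(ids) / chunk_size)))
}

# Hypothetical helper: split multi-record GenBank flat-file text into
# one string per record, using the "//" terminator line.
split_gb_records <- function(gb_text) {
  records <- strsplit(gb_text, "\n//\n", fixed = TRUE)[[1]]
  records <- trimws(records)
  records[nzchar(records)]
}

# Hypothetical wrapper around the network calls (defined, not run here).
# Assumes rettype = "gb" and retmode = "text" return flat-file text.
fetch_gb_records <- function(ids, chunk_size = 100) {
  chunks <- chunk_ids(ids, chunk_size)
  texts <- vapply(chunks, function(chunk) {
    rentrez::entrez_fetch(db = "nuccore", id = chunk,
                          rettype = "gb", retmode = "text")
  }, character(1))
  unlist(lapply(texts, split_gb_records))
}

# Offline check of the two pure helpers with made-up input.
chunk_ids(c("a", "b", "c", "d", "e"), chunk_size = 2)
split_gb_records("LOCUS A\n//\n\nLOCUS B\n//\n")
```

Each element returned by fetch_gb_records() is one record's text, which can then be searched for the field of interest.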

trilisser commented 1 year ago

Thank you very much for your help! I'll explore your link.