Bioconductor / AnnotationHub

Client for the Bioconductor AnnotationHub web resource

Non-descriptive error #27

Closed. BaylieRW closed this issue 2 years ago

BaylieRW commented 2 years ago

> myhub = AnnotationHub()
snapshotDate(): 2021-05-18
> getInfoOnIds(myhub, "AH72154")
       myhub_id fetch_id                     title rdataclass status biocversion rdatadateadded rdatadateremoved
288111  AH72154    78900 org.Salmo_salar.eg.sqlite      OrgDb Public         3.9     2019-05-02               NA
       file_size
288111 161341440
> myhub[["AH72154"]]
Error: Public

Hiya, the db is present as can be seen above, but I'm not sure what this error message means?

lshep commented 2 years ago

Sorry, yes, I need to improve the error messages for org packages. OrgDb packages are updated every release, so the OrgDb you are trying to access is likely too old for your version of R/Bioconductor. If we query for your species, there are indeed more recent versions with more accurate information:

> query(myhub, "org.Salmo")
AnnotationHub with 9 records
# snapshotDate(): 2021-09-23
# $dataprovider: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/
# $species: Salmo tshawytscha, Salmo trutta, Salmo salar, Salmo nerka, Salmo...
# $rdataclass: OrgDb
# additional mcols(): taxonomyid, genome, description,
#   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
#   rdatapath, sourceurl, sourcetype 
# retrieve records with, e.g., 'object[["AH93861"]]' 

            title                          
  AH93861 | org.Salmo_mykiss.eg.sqlite     
  AH93874 | org.Salmo_kisatch.eg.sqlite    
  AH93875 | org.Salmo_trutta.eg.sqlite     
  AH93881 | org.Salmo_salar.eg.sqlite      
  AH93888 | org.Salmo_tshawytscha.eg.sqlite
  AH93896 | org.Salmo_namaycush.eg.sqlite  
  AH93905 | org.Salmo_alpinus.eg.sqlite    
  AH93910 | org.Salmo_nerka.eg.sqlite      
  AH93913 | org.Salmo_keta.eg.sqlite       

BaylieRW commented 2 years ago

Thank you! I was just trying to run a script provided in a 2021 paper to replicate the methods, and I didn't consider that the analysis was probably carried out well before 2021. My fault!



famfigueiredo commented 9 months ago

Can I piggyback off of this issue? I am currently working with Atlantic salmon, and I did some functional analysis last year in February based off of the OrgDb record that was available at the time. I am repeating the analysis now with a different record (AH111638), and I am getting very different results in terms of number of GO terms picked up in Over Representation Analysis.

Am I able to see if this more recent record has replaced the old one? I do not remember the reference for the old one, nor did I write it down anywhere since I used to create the object by doing sasa <- query(ah, c('OrgDb', 'Salmo salar'))[[1]].

lshep commented 9 months ago

We replace OrgDbs every release so that they carry updated information. OrgDbs are closely tied to the Bioconductor release version and the R version. You can tell when a resource was added from the rdatadateadded field in the query output:

> query(ah, c('OrgDb', 'Salmo salar'))
AnnotationHub with 1 record
# snapshotDate(): 2023-10-05
# names(): AH111638
# $dataprovider: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/
# $species: Salmo salar
# $rdataclass: OrgDb
# $rdatadateadded: 2023-04-24
# $title: org.Salmo_salar.eg.sqlite
# $description: NCBI gene ID based annotations about Salmo salar
# $taxonomyid: 8030
# $genome: NCBI genomes
# $sourcetype: NCBI/UniProt
# $sourceurl: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/, ftp://ftp.uniprot.org/p...
# $sourcesize: NA
# $tags: c("NCBI", "Gene", "Annotation") 
# retrieve record with 'object[["AH111638"]]' 
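
When a query returns several records, the same rdatadateadded field can be read programmatically instead of from the printed summary. The following is a minimal sketch, assuming the metadata has been pulled out with mcols() as shown in the commented usage; `newest_first` is a hypothetical helper name, not part of AnnotationHub.

```r
# Sort hub metadata so the most recently added record comes first.
# `meta` is a data frame with AH IDs as row names and the columns
# `title` and `rdatadateadded`. `newest_first` is a hypothetical helper.
newest_first <- function(meta) {
  out <- data.frame(
    id = rownames(meta),
    title = meta$title,
    rdatadateadded = as.Date(meta$rdatadateadded),
    stringsAsFactors = FALSE
  )
  out[order(out$rdatadateadded, decreasing = TRUE), , drop = FALSE]
}

# Usage with a live hub (needs network access and the AnnotationHub package):
# library(AnnotationHub)
# ah   <- AnnotationHub()
# hits <- query(ah, c("OrgDb", "Salmo salar"))
# meta <- as.data.frame(mcols(hits)[, c("title", "rdatadateadded")])
# newest_first(meta)
```

Note that a query only lists records visible for the current snapshot, so older OrgDb versions may not appear here even though getInfoOnIds() can still report them.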

To replicate the analysis you would have to use the same versions of R and Bioconductor that were used at the time, likely Bioconductor 3.16:

> temp = ah[["AH107424"]]
Error: AH107424 is an OrgDb resource.
  orgDb resources are generated for specific biocversions.
  Requested resource works with biocversion: 3.16
  To find a resource appropriate for your biocversion try the following query:
      query(ah,'org.Salmo_salar.eg.sqlite')

As you can see, the error message for OrgDb resources has also been updated to be more descriptive: it now reports which Bioconductor version the requested resource works with, which should help in replicating the findings.
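
Since `query(...)[[1]]` does not leave a trace of which record was loaded, one option is to write the resource ID and versions to a file at load time. The sketch below is only an illustration under stated assumptions: `make_provenance`, `load_orgdb_with_provenance` and the file name `orgdb_provenance.csv` are hypothetical, while `query()`, `snapshotDate()` and `BiocManager::version()` are existing functions. Calls are namespaced so that defining the helpers does not require attaching the packages.

```r
# Build a one-row provenance record for a hub resource.
# `make_provenance` is a hypothetical helper, not part of AnnotationHub.
make_provenance <- function(resource_id, snapshot_date, bioc_version) {
  data.frame(
    resource_id   = resource_id,
    snapshot_date = as.character(snapshot_date),
    bioc_version  = as.character(bioc_version),
    r_version     = R.version.string,
    stringsAsFactors = FALSE
  )
}

# Load the first OrgDb matching `species` and record what was loaded.
# Needs network access plus the AnnotationHub and BiocManager packages.
# The default log file name is an assumption chosen for this example.
load_orgdb_with_provenance <- function(species,
                                       log_file = "orgdb_provenance.csv") {
  ah   <- AnnotationHub::AnnotationHub()
  hits <- AnnotationHub::query(ah, c("OrgDb", species))
  if (length(hits) == 0) {
    stop("No OrgDb record found for species: ", species)
  }
  id   <- names(hits)[1]  # AH identifier of the first match
  prov <- make_provenance(id,
                          AnnotationHub::snapshotDate(ah),
                          BiocManager::version())
  write.csv(prov, log_file, row.names = FALSE)
  list(orgdb = hits[[id]], provenance = prov)
}

# Usage:
# res  <- load_orgdb_with_provenance("Salmo salar")
# sasa <- res$orgdb
# res$provenance
```

Keeping the ID alongside the Bioconductor and R versions makes it possible to tell later which OrgDb a set of results came from, even after the hub has replaced that record.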