Hi @ellalalalalalala
I think for NCBI annotations you might be better off using an OrgDb. It will usually be the most current build, so using this will get you hg38, which it looks like you want. You can see from the record below that the data is current as of Sept 2021.
library(AnnotationHub)
ah <- AnnotationHub()
# Search the hub for human OrgDb records
query(ah, c("Homo sapiens", "OrgDb"))
AnnotationHub with 1 record
# snapshotDate(): 2021-10-20
# names(): AH95959
# $dataprovider: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/
# $species: Homo sapiens
# $rdataclass: OrgDb
# $rdatadateadded: 2021-10-08
# $title: org.Hs.eg.db.sqlite
# $description: NCBI gene ID based annotations about Homo sapiens
# $taxonomyid: 9606
# $genome: NCBI genomes
# $sourcetype: NCBI/ensembl
# $sourceurl: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/, ftp://ftp.ensembl.org/pub/current_fasta
# $sourcesize: NA
# $tags: c("NCBI", "Gene", "Annotation")
# retrieve record with 'object[["AH95959"]]'
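In case it is useful, here is a rough sketch of how a tx2gene table could be built from that OrgDb. It is untested against your data and rests on a few assumptions: `build_tx2gene` and `strip_tx_version` are hypothetical helper names, Entrez gene IDs are used as the gene-level identifier, and the RefSeq accessions stored in the OrgDb are assumed to carry no version suffix (so "NM_000546" rather than "NM_000546.6").

```r
# Sketch only: build a transcript-to-gene table from an OrgDb object.
# `strip_tx_version` and `build_tx2gene` are hypothetical helper names.

# Drop a trailing version suffix from RefSeq accessions,
# e.g. "NM_000546.6" becomes "NM_000546".
strip_tx_version <- function(ids) {
  sub("\\.[0-9]+$", "", ids)
}

build_tx2gene <- function(orgdb) {
  # All RefSeq accessions known to the OrgDb (RNA and protein).
  refseq_ids <- AnnotationDbi::keys(orgdb, keytype = "REFSEQ")

  # Keep RNA accessions only (NM_, NR_, XM_, XR_); NP_/XP_ are proteins.
  rna_ids <- grep("^[NX][MR]_", refseq_ids, value = TRUE)

  # Map each RefSeq transcript accession to its Entrez gene ID.
  tx2gene <- AnnotationDbi::select(
    orgdb,
    keys = rna_ids,
    columns = "ENTREZID",
    keytype = "REFSEQ"
  )

  # Transcript ID in column 1, gene ID in column 2.
  tx2gene[, c("REFSEQ", "ENTREZID")]
}

# Possible usage (downloads the record on first use):
# orgdb   <- ah[["AH95959"]]
# tx2gene <- build_tx2gene(orgdb)
```

If the transcript names in your Salmon output include version suffixes, they will not match the unversioned accessions directly. You could either strip them with something like `strip_tx_version()`, or pass `ignoreTxVersion = TRUE` to `tximport()`. It is also worth checking how many of your Salmon transcripts end up without a match in the table.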
Hope this helps!
Hi,
Thanks a lot for the fantastic workshop on DGE analyses. I really enjoyed it and learned a lot. :)
I am now trying to run analyses with my own data. For previous SNP analyses, and now for the Salmon quantification, I used the NCBI RefSeq Transcripts FASTA (https://www.ncbi.nlm.nih.gov/genome/guide/human/). I am therefore trying to build my tx2gene annotation file from the NCBI annotation. Would you have a recommendation for which ah$dataprovider to query? Is there anything else I should adapt or keep an eye on compared to the presented workflow using ensembldb?
Thanks a lot in advance for your help. :)
Best wishes, Ella