Closed cying111 closed 4 years ago
To be truthful, I'm actually surprised it worked before. The query function searches the mcols of a hub object, and these do not normally include the dispatchclass, which, judging from your metadata file, would be the only field that explicitly says "FaFile". I would suggest changing the search term to "FASTA", which should still give you the record you want.
> query(hub, c("NanoporeRNA", "GRCh38"))
ExperimentHub with 7 records
# snapshotDate(): 2020-10-02
# $dataprovider: SGNex
# $species: Homo sapiens
# $rdataclass: vector
# additional mcols(): taxonomyid, genome, description,
# coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
# rdatapath, sourceurl, sourcetype
# retrieve records with, e.g., 'object[["EH3808"]]'
title
EH3808 | K562_directcDNA_replicate1
EH3809 | K562_directcDNA_replicate4
EH3810 | K562_directRNA_replicate6
EH3811 | MCF7_directcDNA_replicate1
EH3812 | MCF7_directcDNA_replicate3
EH3813 | MCF7_directRNA_replicate4
EH3814 | Hs_GRCh38_chr22_1_25409234_fasta
You can see the columns that are queried with mcols:
mcols(query(hub, c("NanoporeRNA", "GRCh38")))
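To check which field actually holds the term you are searching for, you could list the column names and look at sourcetype directly. This is only a minimal sketch: it assumes the ExperimentHub package is installed and the hub is reachable, and the variable names `hub` and `res` are just for illustration.

```r
library(ExperimentHub)

hub <- ExperimentHub()
res <- query(hub, c("NanoporeRNA", "GRCh38"))

# Names of the metadata columns returned by mcols()
colnames(mcols(res))

# The sourcetype value of each matching record
mcols(res)$sourcetype
```

If "FaFile" does not appear in any of these columns, a query that includes it has nothing to match against.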
As mentioned, changing "FaFile" to "FASTA" should give you the desired result. If this is used inside your package code, I would recommend retrieving the record by its EH ID instead, so that you know exactly which file you are getting rather than relying on queries.
> hub['EH3814']
ExperimentHub with 1 record
# snapshotDate(): 2020-10-02
# names(): EH3814
# package(): NanoporeRNASeq
# $dataprovider: SGNex
# $species: Homo sapiens
# $rdataclass: vector
# $rdatadateadded: 2020-10-02
# $title: Hs_GRCh38_chr22_1_25409234_fasta
# $description: Sequences of region chr22 1 to 25409234 in human GRCh38 DNA ...
# $taxonomyid: 9606
# $genome: GRCh38
# $sourcetype: FASTA
# $sourceurl: https://github.com/GoekeLab/sg-nex-data
# $sourcesize: NA
# $tags: c("ExperimentHub", "RNASeqData", "SequencingData")
# retrieve record with 'object[["EH3814"]]'
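Put together, the two options could look roughly like this. It is a sketch rather than tested package code: it assumes the hub is reachable and that EH3814 is still the ID of the chr22 FASTA record, as in the output above. The variable name `fasta_hits` is only for illustration.

```r
library(ExperimentHub)

hub <- ExperimentHub()

# Option 1: search on the sourcetype value instead of the dispatch class
fasta_hits <- query(hub, c("NanoporeRNA", "GRCh38", "FASTA"))
fasta_hits

# Option 2: retrieve the resource directly by its ExperimentHub ID.
# Double brackets download (and cache) the resource; single brackets
# only subset the hub metadata.
genomeSequence <- hub[["EH3814"]]
```

Option 2 does not depend on which metadata columns query() searches, so it should be less likely to break between releases.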
Cheers,
Hi, for our package NanoporeRNASeq, we uploaded this FaFile to ExperimentHub, and extracting it with the following line worked fine with R 4.0.0 and R 4.0.2:
genomeSequence <- query(ExperimentHub(), c("NanoporeRNA", "GRCh38", "FaFile"))
However, when I use R 4.0.3 to do the same thing:
genomeSequence <- query(ExperimentHub(), c("NanoporeRNA", "GRCh38", "FaFile"))
it seems to return an empty record. Does anyone know why? Thank you, Ying