Closed: kokyriakidis closed this issue 3 years ago
Hi @kokyriakidis, I'm working on relaxing the requirement for gene identifiers (e.g. ENSG00000000003) and gene symbols (e.g. TSPAN6) to be defined in the rowRanges of the DESeqDataSet. Typically I define the genome in the bcbioRNASeq() call, which then fetches the genome annotations via AnnotationHub internally using the makeGRangesFromEnsembl() function.
For example:
library(bcbioRNASeq)
bcb <- bcbioRNASeq(
    organism = "Homo sapiens",
    genomeBuild = "GRCh38",
    ensemblRelease = 100L
)
Internally, this function hands off to makeGRangesFromEnsembl():
library(basejump)
gr <- makeGRangesFromEnsembl(
    organism = "Homo sapiens",
    genomeBuild = "GRCh38",
    release = 100L
)
class(gr)
## [1] "GRanges"
## attr(,"package")
## [1] "GenomicRanges"
The identifiers and names are defined in the mcols of the GRanges object:
> mcols(gr)[["geneID"]]
character-Rle of length 68008 with 68008 runs
Lengths: 1 1 ... 1
Values : "ENSG00000000003" "ENSG00000000005" ... "LRG_999"
> mcols(gr)[["geneName"]]
character-Rle of length 68008 with 68001 runs
Lengths: 1 1 1 ... 1 1
Values : "TSPAN6" "TNMD" "DPM1" ... "CCND3" "CIC"
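If it helps, here is a minimal sketch of how one might check that a GRanges object carries both columns before using it as rowRanges. It assumes the column names geneID and geneName shown above. The toy object and its coordinates are made up for illustration and only stand in for the real makeGRangesFromEnsembl() output. The requiredCols and gene2symbol names are hypothetical, not part of any package API.

```r
library(GenomicRanges)

# Toy GRanges standing in for the makeGRangesFromEnsembl() return value.
# The coordinates are placeholders, not real annotations.
gr <- GRanges(
    seqnames = c("X", "X"),
    ranges = IRanges(start = c(100L, 500L), width = 50L),
    geneID = c("ENSG00000000003", "ENSG00000000005"),
    geneName = c("TSPAN6", "TNMD")
)
names(gr) <- mcols(gr)[["geneID"]]

# Fail early if either metadata column is absent.
requiredCols <- c("geneID", "geneName")
missingCols <- setdiff(requiredCols, colnames(mcols(gr)))
if (length(missingCols) > 0L) {
    stop("Missing required columns in mcols: ", toString(missingCols))
}

# Named lookup from gene identifier to gene symbol.
# as.character() also decodes Rle columns such as those shown above.
gene2symbol <- setNames(
    as.character(mcols(gr)[["geneName"]]),
    as.character(mcols(gr)[["geneID"]])
)
gene2symbol[["ENSG00000000003"]]
```

The same check should work on the rowRanges of a DESeqDataSet, since that is also a GRanges with mcols.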
Resolved in DESeqAnalysis 0.3.12 update. Thanks for posting this!
Hello @mjsteinbaugh
I get this error. How can I fix this?
I use the following command:
EDIT:
I had to use these lines of code in order to get it to work.
This was not the case for the other bcbio runs I tried.