Hi @cmirzayi - sorry for the delay.
This might be a misconception of what the function is supposed to do. From the specification in the vignette:
"More specifically, subsetting BugSigDB signatures by an EFO term then involves subsetting the Condition column to all descendants of that term in the EFO ontology and that are present in the Condition column"
Now what are the descendants of bipolar disorder in the EFO ontology?
> efo$children[["MONDO:0004985"]]
[1] "EFO:0009963" "EFO:0009964"
> efo$name[c("EFO:0009963","EFO:0009964")]
EFO:0009963 EFO:0009964
"bipolar I disorder" "bipolar II disorder"
Are any of the descendants present in the Condition column?
> c("bipolar I disorder", "bipolar II disorder") %in% dat$Condition
[1] FALSE FALSE
So far, so correct. One could argue that subsetting should also include all signatures associated with the term itself, but in this case it is more straightforward to subset directly via:
> dat.bpd <- subset(dat, Condition == "bipolar disorder")
> dim(dat.bpd)
[1] 18 48
The most common use case I would think for subsetByOntology would be to choose the ontology term and all of its descendants. How is that use case supported? I also find it unintuitive that subsetByOntology would return only descendants, without even an argument option to include the term itself.
Yes it would make sense to add that option.
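To make the two behaviors discussed above concrete, here is a minimal base-R sketch on toy data. It is not the bugsigdbr implementation: `toy_onto`, `toy_dat`, `collect_descendants()`, `subset_by_term()` and the `include_term` argument are all hypothetical names made up for illustration, and the toy ontology only mimics the `children` and `name` fields shown earlier in this thread.

```r
# Toy stand-in for an ontology object (an assumption, not the real EFO):
# direct children per term id, plus a human-readable name per id.
toy_onto <- list(
  children = list(
    "MONDO:0004985" = c("EFO:0009963", "EFO:0009964"),
    "EFO:0009963"   = character(0),
    "EFO:0009964"   = character(0)
  ),
  name = c(
    "MONDO:0004985" = "bipolar disorder",
    "EFO:0009963"   = "bipolar I disorder",
    "EFO:0009964"   = "bipolar II disorder"
  )
)

# Hypothetical helper: walk the children lists to collect all descendants.
collect_descendants <- function(onto, id) {
  kids <- onto$children[[id]]
  if (length(kids) == 0) return(character(0))
  unique(c(kids, unlist(lapply(kids, collect_descendants, onto = onto))))
}

# Hypothetical subsetting function with an option to include the term itself.
subset_by_term <- function(df, column, term, onto, include_term = TRUE) {
  id <- names(onto$name)[onto$name == term]
  stopifnot(length(id) == 1)
  ids <- collect_descendants(onto, id)
  if (include_term) ids <- c(id, ids)
  labels <- unname(onto$name[ids])
  df[df[[column]] %in% labels, , drop = FALSE]
}

# Toy signature table: only the parent term occurs in the Condition column.
toy_dat <- data.frame(
  Study = c("S1", "S2", "S3"),
  Condition = c("bipolar disorder", "bipolar disorder", "unipolar depression"),
  stringsAsFactors = FALSE
)

only_desc <- subset_by_term(toy_dat, "Condition", "bipolar disorder",
                            toy_onto, include_term = FALSE)
with_term <- subset_by_term(toy_dat, "Condition", "bipolar disorder",
                            toy_onto, include_term = TRUE)

nrow(only_desc)  # 0: no descendant label occurs in Condition
nrow(with_term)  # 2: the term itself is matched
```

With descendants only, the toy result is empty for the same reason as in the report: neither "bipolar I disorder" nor "bipolar II disorder" occurs in the column. Including the term itself picks up the rows annotated with the parent term.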
The term itself is now included in the query as of bugsigdbr v1.7.3. Available from GitHub and Bioconductor devel only for the moment.
> library(bugsigdbr)
> dat <- bugsigdbr::importBugSigDB()
Using cached version from 2023-04-26 22:22:42
> efo <- bugsigdbr::getOntology("efo")
Loading required namespace: ontologyIndex
Using cached version from 2022-09-06 23:05:35
> dat.bpd <- bugsigdbr::subsetByOntology(dat, column = "Condition", "bipolar disorder", efo)
> dim(dat.bpd)
[1] 18 48
> table(dat.bpd$Condition)
bipolar disorder
18
Nice!!
Awesome thank you Ludwig!
It appears that subsetByOntology() does not return terms with the designation MONDO, though these terms are present in the EFO. The EFO includes many terms with the prefix MONDO, such as MONDO:0004985 (bipolar disorder), which is distinct from the EFO terms "bipolar I disorder" and "bipolar II disorder" since it is inclusive of both these terms. BugSigDB articles, when curated, accept MONDO terms for curation.
Reproducible example:
An empty object is returned when subsetting by bipolar disorder, despite confirming it is present in both dat and efo:
This behavior is not observed when subsetting by a similar condition--unipolar depression--which is an EFO-prefixed term:
Session Info