Closed. johnbradley closed this 1 year ago.
For example, femur decreased length and femur elongated are mutually exclusive. Using the current code, these two are determined to have weak_exclusivity:
femur <- get_phenotypes(entity="femur")
femur_decreased_length <- femur[femur$label == "femur decreased length", 'id']
femur_elongated <- femur[femur$label == "femur elongated", 'id']
mutually_exclusive(c(femur_elongated, femur_decreased_length), progress_bar = FALSE)$dataframe$mutual_exclusivity
[1] weak_exclusivity
If we know the qualities "decreased length" and "elongated" are opposites, this should return strong_exclusivity for the two phenotypes (and for any other phenotype pair "X decreased length" and "X elongated", where X is an anatomical element, e.g., an Uberon term).
@wdahdul @pmabee @uyedaj I am tagging you here so you can review and comment.
@johnbradley I slightly edited the last sentence in your example.
To supply the opposite qualities to the mutually_exclusive() function, how about a data frame with two columns (quality.a and quality.b)? This way a user could pass in a list of opposite qualities.
As a simple example, elongated is the opposite of decreased length, so we could create a data frame like so:
elongated_iri <- "http://purl.obolibrary.org/obo/PATO_0001154"
decreased_length_iri <- "http://purl.obolibrary.org/obo/PATO_0000574"
quality_opposites <- data.frame(
quality.a = c(elongated_iri),
quality.b = c(decreased_length_iri)
)
This data frame would be passed to mutually_exclusive() like so:
> result <- mutually_exclusive(phenotypes_to_compare, quality_opposites = quality_opposites)
> result$dataframe$mutual_exclusivity
[1] strong_exclusivity
5 Levels: strong_compatibility < weak_compatibility < inconclusive_evidence < ... < strong_exclusivity
If you had additional opposite qualities, you could just add more rows to the quality_opposites data frame:
quality_opposites <- data.frame(
quality.a = c(elongated_iri, chronic_iri, aerobic_iri),
quality.b = c(decreased_length_iri, acute_iri, anaerobic_iri)
)
Yes, I agree in general. More specifically, we should only require that these two columns be present, not, for example, that they be the only columns. (If someone were to maintain such a table by hand, they would likely want to include labels as additional columns so they can more easily remember what's in the table. We shouldn't force them to massage the table each time before using it as input here. In the same vein, we should auto-trim the IRIs to remove leading and trailing spaces, because if someone keeps this in Excel, extraneous spaces will creep in sooner or later and appear in the CSV export.)
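To make the suggestion above concrete, here is a minimal sketch of how the input table might be checked and cleaned before use. The helper name normalize_quality_opposites() is hypothetical and not part of the package; it only assumes the two column names proposed in this thread (quality.a and quality.b), tolerates any extra columns, and trims whitespace with base R's trimws().

```r
# Hypothetical helper (an assumption for illustration, not the package's API):
# validate and clean a user-supplied table of opposite qualities.
normalize_quality_opposites <- function(quality_opposites) {
  required <- c("quality.a", "quality.b")
  missing_cols <- setdiff(required, colnames(quality_opposites))
  if (length(missing_cols) > 0) {
    stop("quality_opposites is missing column(s): ",
         paste(missing_cols, collapse = ", "))
  }
  # Trim leading and trailing whitespace; as.character() also handles factors.
  for (col in required) {
    quality_opposites[[col]] <- trimws(as.character(quality_opposites[[col]]))
  }
  # Extra columns (e.g. human-readable labels) are returned untouched.
  quality_opposites
}

# Example: an extra label column and stray spaces, as might come from a CSV export.
raw <- data.frame(
  quality.a = " http://purl.obolibrary.org/obo/PATO_0001154",
  quality.b = "http://purl.obolibrary.org/obo/PATO_0000574 ",
  label.a = "elongated",
  stringsAsFactors = FALSE
)
cleaned <- normalize_quality_opposites(raw)
```

With something like this, a hand-maintained table with label columns could be passed in directly, and only the two required columns would be inspected.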
Enhance the mutual exclusion functionality to allow supplying opposite qualities. This data should be used in determining the exclusivity type (strong_compatibility, weak_compatibility, inconclusive_evidence, weak_exclusivity, strong_exclusivity).
This will build on the work done for #237
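One detail the enhancement would probably need is an order-independent lookup, since a pair of opposites can be listed with either quality in either column. The following is only a sketch under that assumption; are_opposite_qualities() is a hypothetical helper, not the actual implementation, and it says nothing about how the result would be combined with the existing exclusivity logic.

```r
# Hypothetical helper (an assumption for illustration): TRUE if the pair
# (q1, q2) appears as a row of quality_opposites in either orientation.
are_opposite_qualities <- function(q1, q2, quality_opposites) {
  a <- quality_opposites$quality.a
  b <- quality_opposites$quality.b
  any((a == q1 & b == q2) | (a == q2 & b == q1))
}

elongated_iri <- "http://purl.obolibrary.org/obo/PATO_0001154"
decreased_length_iri <- "http://purl.obolibrary.org/obo/PATO_0000574"
quality_opposites <- data.frame(
  quality.a = elongated_iri,
  quality.b = decreased_length_iri,
  stringsAsFactors = FALSE
)

# The lookup gives the same answer regardless of argument order.
forward <- are_opposite_qualities(elongated_iri, decreased_length_iri, quality_opposites)
reverse <- are_opposite_qualities(decreased_length_iri, elongated_iri, quality_opposites)
# A quality is not treated as its own opposite unless the table says so.
same <- are_opposite_qualities(elongated_iri, elongated_iri, quality_opposites)
```

Checking both orientations means users would not have to list each pair twice in the table.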