llrs closed this 4 years ago
In your first attempt to modify the set and convert it to a fuzzy set, you need to assign the modified ecoli_set to be able to use the fuzzy values (or pipe it). The TidySet is an S4 class, not R6, so modifications are not made in place.
ecoli_sets %>%
  mutate(
    fuzzy = case_when(sets == "GL" ~ 0.2, # Add fuzzy values
                      sets == "CF" ~ 0.8)) %>%
  set_size()
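To illustrate why the assignment (or the pipe) matters, here is a minimal sketch of R's copy-on-modify behavior with an S4 object. `ToySet` and `set_fuzzy()` are hypothetical stand-ins made up for this example; they are not part of BaseSet.

```r
library(methods)

# Hypothetical stand-in for a TidySet: an S4 class with a single slot.
setClass("ToySet", slots = c(fuzzy = "numeric"))

# Hypothetical helper: it modifies its own copy and returns it,
# so the caller's object is left untouched.
set_fuzzy <- function(object, value) {
  object@fuzzy <- value
  object
}

toy <- new("ToySet", fuzzy = c(1, 1))

invisible(set_fuzzy(toy, c(0.2, 0.2)))  # result is discarded
unchanged <- toy@fuzzy                  # still c(1, 1)

toy <- set_fuzzy(toy, c(0.2, 0.2))      # result is assigned
changed <- toy@fuzzy                    # now c(0.2, 0.2)
```

An R6 object, by contrast, has reference semantics, so a method could change it without reassignment.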
Yes, GL has a probability of 0.04 of having two genes, so the glycolysis pathway probably doesn't have any genes. I think this is different from cardinality: cardinality is a single number for a set, while here I return several values for a single set.
I'm not sure I follow on this cardinality. For example, the {sets} package mentioned provides cardinality as the number of elements in a set, regardless of the fuzzy value. However, I will provide a new method to calculate cardinality that accepts a fuzzy logic function (`lengths` as in {sets}, `sum` as you found, or other functions).
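As a rough sketch of what a pluggable cardinality could look like, the helper below applies a user-supplied function to the membership values of one set. `fuzzy_cardinality()` is a hypothetical name used only for illustration; it is not the BaseSet or {sets} API. The membership values are the ones from this thread (two genes at 0.2 for GL, three genes at 0.8 for CF), and base R's `length` stands in for the element count.

```r
# Hypothetical helper, not part of BaseSet or {sets}: apply a
# user-supplied function to the membership values of a single set.
fuzzy_cardinality <- function(membership, FUN = sum) {
  stopifnot(is.numeric(membership),
            all(membership >= 0 & membership <= 1))
  FUN(membership)
}

gl <- c(0.2, 0.2)       # glycolysis: two genes
cf <- c(0.8, 0.8, 0.8)  # carbon fixation: three genes

gl_count <- fuzzy_cardinality(gl, length)  # counts elements, ignores membership
gl_sum   <- fuzzy_cardinality(gl, sum)     # sum of membership values
cf_count <- fuzzy_cardinality(cf, length)
cf_sum   <- fuzzy_cardinality(cf, sum)
```

With `length` the fuzzy and non-fuzzy sets have the same cardinality, while `sum` shrinks the cardinality as membership values drop.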
Created a toy example of E. coli.
Checked the union/intersect/negation functions - worked as expected
(Optional change) Could change the parameter `set = c("GL", "CL")` to the alternative `sets = ...` and match the TidySet headers `elements`, `sets`, `fuzzy`.
Fuzzy sets - some weird behavior
Noticed some weird behavior in the `set_size` function. Wait ... the fuzzy size shouldn't match the not-fuzzy size ... I'll create the same fuzzy set following the instructions in the README.md and check its size.
Any explanation appreciated. Looking at the table, the fuzzy values are being treated as probabilities. The last line of the table shows that GL (glycolysis) has 2 genes with a probability of 0.04, which is equal to 0.2 x 0.2. So is the biological interpretation that "gene:1" is classified as “Glycolysis” 0.2 (20%) of the time, and ditto for "gene:2"? Ergo the most likely outcome is that “Glycolysis” has 0 genes, with a probability of 0.64 (0.8 x 0.8), and “Carbon fixation” has 3 genes, with a probability of 0.512 (0.8 x 0.8 x 0.8).
Hmm, I'll think about the interpretation of that probability column, but it seems to match the example in the Fuzzy vignette. I searched for papers on fuzzy-set size/cardinality that used probabilities, but all searches redirected me to a sum of fuzzy values (0.2 + 0.2). Adding a description of how the probability column is calculated would be sufficient documentation to avoid confusion.
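The numbers quoted above can be reproduced with a small base R sketch, assuming each fuzzy value is an independent probability that the gene belongs to the set. Under that assumption the set size follows a Poisson binomial distribution. `size_distribution()` is a hypothetical helper written for this illustration, and it is not claimed to be how `set_size` is implemented.

```r
# Probability of each possible size of one set, assuming independent
# membership probabilities (a Poisson binomial distribution).
size_distribution <- function(membership) {
  probs <- 1  # with no elements, size 0 has probability 1
  for (p in membership) {
    # Each element is either absent (size unchanged) or present (size + 1).
    probs <- c(probs * (1 - p), 0) + c(0, probs * p)
  }
  data.frame(size = seq_along(probs) - 1, probability = probs)
}

gl_sizes <- size_distribution(c(0.2, 0.2))       # glycolysis
cf_sizes <- size_distribution(c(0.8, 0.8, 0.8))  # carbon fixation
```

For GL this gives probabilities 0.64, 0.32 and 0.04 for sizes 0, 1 and 2. For CF the probability of size 3 is 0.512, which matches the values discussed in the thread.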