FelixTheStudent / cellpypes

Cell type pipes for R
GNU General Public License v3.0

Function: group_classes (OR logic for rules) #8

Open FelixTheStudent opened 3 years ago

FelixTheStudent commented 3 years ago

Currently all rules are combined with AND, but there might be cases where OR is useful.

Motivation

Use-case 1: Markers a1 and a2 might overlap well but imperfectly, and both mark cell type A. If A has subtypes, we may want a subtype's parent to be either a1 OR a2. Introducing a group called "A" as a kind of virtual class (with no rules of its own) would help here.

Use-case 2: for plotting, we might want to have all myeloid cells in the same colors. Introducing the "myeloid" class could achieve this:

obj %>%
  rule("mono", "CD14", .1) %>%
  rule("DC",   "CD1C", .1) %>%
  group_classes("myeloid"=c("mono","DC")) %>%
  rule("T",    "CD3E", .1) %>%
  plot_classes(c("myeloid", "T"))

Since mono and DC have no common parent, this cannot be achieved without the group_classes function.

How to implement

obj$classes currently has "class" and "parent" columns. parent is either a class name or the special keyword "..root..". I propose to create a second keyword, "..group..", and the slot obj$class_groups; this could be a named list (the name is the group name, the elements are the classes belonging to the group).
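To make the proposed layout concrete, here is a minimal base-R sketch. It is not cellpypes' actual internals: the toy `obj` list, the `is_class` matrix (standing in for already-evaluated rules), and the helper `in_group` are all hypothetical, and the "..group.." keyword is deliberately left out because the proposal does not yet say where it would be stored.

```r
# Hypothetical stand-in for a cellpypes object; only the slots discussed above.
obj <- list(
  classes = data.frame(
    class  = c("mono", "DC", "T"),
    parent = c("..root..", "..root..", "..root.."),
    stringsAsFactors = FALSE
  ),
  # Proposed slot: named list, name = group name, elements = member classes.
  class_groups = list(myeloid = c("mono", "DC"))
)

# Toy per-cell class membership (one row per cell, one column per class),
# standing in for the result of evaluating the rules.
is_class <- cbind(
  mono = c(TRUE,  FALSE, FALSE, FALSE),
  DC   = c(FALSE, TRUE,  FALSE, FALSE),
  T    = c(FALSE, FALSE, TRUE,  FALSE)
)

# A cell is in the group if it is in ANY member class (OR logic).
in_group <- function(obj, is_class, group) {
  members <- obj$class_groups[[group]]
  rowSums(is_class[, members, drop = FALSE]) > 0
}

myeloid <- in_group(obj, is_class, "myeloid")
```

With this layout a group needs no rules of its own; its membership is derived from its member classes whenever it is requested.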

FelixTheStudent commented 2 years ago

I thought more about the API. I think the most intuitive is a group argument in the rule function: obj %>% rule("mono", "CD14", ">", 1e-4, group="myeloid").

More thoughts:

This feature might be fun to implement! I'll save it for a rainy day, and then I'll be like git checkout -b group and coffee!

FelixTheStudent commented 2 years ago

Don't forget this check:

if(any(group %in% obj$classes$class)) stop(group, ": You created a class AND a group for this cell type. That's ambiguous, please double-check your rules!")

FelixTheStudent commented 2 years ago

One more requirement:

It has to be compatible with grouping existing labels, see #16.

That probably means an argument to "rule" alone does not cut it, and I'll need a separate function, right? Something like this: class_group(obj, group="lymphocytes", classes=c("T","B"))
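A rough sketch of what such a function could look like, in base R. The name class_group and its arguments come from the comment above; the body, the obj$class_groups slot layout, and the toy obj are assumptions, not existing cellpypes code. It includes the ambiguity check mentioned earlier in this thread.

```r
# Hypothetical sketch of class_group(); slot names are assumptions.
class_group <- function(obj, group, classes) {
  # Refuse a group name that is already used by a rule-based class.
  if (any(group %in% obj$classes$class)) {
    stop(group, ": You created a class AND a group for this cell type. ",
         "That's ambiguous, please double-check your rules!")
  }
  # Create the slot on first use, then register the group's members.
  if (is.null(obj$class_groups)) obj$class_groups <- list()
  obj$class_groups[[group]] <- classes
  obj
}

# Toy object with two rule-based classes.
obj <- list(classes = data.frame(
  class  = c("T", "B"),
  parent = c("..root..", "..root.."),
  stringsAsFactors = FALSE
))

obj <- class_group(obj, group = "lymphocytes", classes = c("T", "B"))
```

The sketch does not verify that the member classes exist in obj$classes, since the members might be existing labels rather than rule-based classes (see #16).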

FelixTheStudent commented 2 years ago

I feel this might still be a relevant feature, and I should start collecting use-cases. Any ideas, anyone?