dynverse / dyno

Inferring, interpreting and visualising trajectories using a streamlined set of packages 🦕
https://dynverse.github.io/dyno

Correct usage of prior information with TI methods #52

Closed jma1991 closed 5 years ago

jma1991 commented 5 years ago

I have inferred a trajectory with dyno using slingshot between 3 groups of cells. Prior knowledge suggests that the trajectory should go from groups A -> B -> C, whereas slingshot has returned A -> B and A -> C (i.e. a bifurcation). According to the dyno guidelines, slingshot is able to make use of prior information (e.g. start/end states); however, whatever priors I provide, slingshot gives me the same trajectory. Is this expected behaviour or am I doing something wrong?

Below is an example of the commands I would use to run slingshot with a hypothetical dataset containing 3 cells per group:

dat <- add_prior_information(
  dataset = dat,
  start_id = c("A1", "A2", "A3"),
  end_id = c("C1", "C2", "C3"),
  groups_id = data.frame(cell_id = c("A1", "A2", "A3", "B1", "B2", "B3", "C1", "C2", "C3"),
                         group_id = c("A", "A", "A", "B", "B", "B", "C", "C", "C")),
  groups_network = data.frame(from = c("A", "B"), to = c("B", "C")),
  start_n = 1,
  end_n = 1
)

mod <- infer_trajectory(dat, "slingshot", give_priors = c("start_id", "end_id", "groups_id", "groups_network", "start_n", "end_n"), seed = 1701)

I am using the following package versions:

rcannood commented 5 years ago

Hi James! Thanks for your message. You brought to light a bug that was introduced in one of the latest versions. I have written a fix for it and am waiting for Travis to finish checking the package before merging it into the master branch.

rcannood commented 5 years ago

The problem should have been fixed. Could you update dynwrap to version 1.1.3 and try again?

jma1991 commented 5 years ago

Downloaded version 1.1.3 but it hasn't resolved my problem. This is how I specified the prior information in my dataset:

# Add prior information --------------------------------------------------------

dat <- add_prior_information(
  dataset = dat,
  start_id = colnames(sce)[sce$cell_type == "PSM"],
  end_id = colnames(sce)[sce$cell_type == "Neu"],
  groups_id = data.frame(cell_id = colnames(sce), group_id = sce$cell_type, stringsAsFactors = FALSE),
  groups_network = data.frame(from = c("PSM", "NMP", "NMP"), to = c("NMP", "Neu", "Som"), stringsAsFactors = FALSE)
)

# Inferring trajectories  ------------------------------------------------------

mod <- infer_trajectory(dat, "slingshot", give_priors = c("start_id", "end_id", "groups_id", "groups_network"), seed = 1701)

Also, when you specify data frames for groups_id and groups_network that contain factors, you get the following error:

> dat <- add_prior_information(
+     dataset = dat,
+     start_id = colnames(sce)[sce$cell_type == "PSM"],
+     end_id = colnames(sce)[sce$cell_type == "Neu"],
+     groups_id = data.frame(cell_id = colnames(sce), group_id = sce$cell_type),
+     groups_network = data.frame(from = c("PSM", "NMP", "NMP"), to = c("NMP", "Neu", "Som"))
+ )

Error: all(groups_id$group_id %in% c(groups_network$to, groups_network$from)) isn't true.

I think it occurs because when factors are concatenated (like in your check) they are changed into a numeric value:

> tmp <- data.frame(from = c("PSM", "NMP", "NMP"), to = c("NMP", "Neu", "Som"))
> c(tmp$from, tmp$to)
[1] 2 1 1 2 1 3
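
The behaviour above can be reproduced with base R alone. The following is a minimal sketch, not the actual dynwrap check: it assumes the columns are factors (set explicitly here with stringsAsFactors = TRUE, since newer R versions no longer create factors by default), and the variable names from_codes, network_groups and check_passes are made up for illustration. Whether c() on two factors returns integer codes depends on the R version, so the sketch inspects the codes with as.integer() instead.

```r
# stringsAsFactors = TRUE is set explicitly so the columns are factors
# regardless of the R version's default.
groups_network <- data.frame(
  from = c("PSM", "NMP", "NMP"),
  to = c("NMP", "Neu", "Som"),
  stringsAsFactors = TRUE
)

# Each factor column stores integer codes that index its own set of levels,
# so the same code can mean different groups in `from` and `to`.
from_codes <- as.integer(groups_network$from)

# Converting to character before concatenating keeps the group labels.
network_groups <- c(
  as.character(groups_network$from),
  as.character(groups_network$to)
)

# A membership check on the labels (hypothetical, mirroring the error message).
group_id <- factor(c("PSM", "NMP", "Neu", "Som"))
check_passes <- all(as.character(group_id) %in% network_groups)
```

Passing stringsAsFactors = FALSE to data.frame(), as in the working call earlier in this comment, avoids the conversion step altogether.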

rcannood commented 5 years ago

Error: all(groups_id$group_id %in% c(groups_network$to, groups_network$from)) isn't true.

Thanks, this should indeed produce a more useful error message, or none at all.

For the start_id and the end_id, you should provide just one cell, not all of them. If you use sample(., 1), does this solve your problem?

Slingshot can't use groups_id and groups_network, so providing these is unnecessary.
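
As a rough sketch of the two suggestions above, the snippet below picks a single start cell and a single end cell with sample(., 1). The vectors cell_ids and cell_type are hypothetical stand-ins for colnames(sce) and sce$cell_type. The dyno calls are left as comments because they need a wrapped dataset; they only reuse the arguments already shown in this thread, with groups_id and groups_network dropped.

```r
set.seed(1701)

# Hypothetical stand-ins for colnames(sce) and sce$cell_type.
cell_ids <- c("A1", "A2", "A3", "B1", "B2", "B3", "C1", "C2", "C3")
cell_type <- c("PSM", "PSM", "PSM", "NMP", "NMP", "NMP", "Neu", "Neu", "Neu")

# Pick exactly one start cell and one end cell instead of passing all of them.
start_id <- sample(cell_ids[cell_type == "PSM"], 1)
end_id <- sample(cell_ids[cell_type == "Neu"], 1)

# With a wrapped dataset `dat`, the calls from this thread would then become:
# dat <- add_prior_information(dataset = dat, start_id = start_id, end_id = end_id)
# mod <- infer_trajectory(dat, "slingshot", give_priors = c("start_id", "end_id"), seed = 1701)
```

Cell names are character strings here, so sample() draws from the candidate cells themselves rather than from an integer range.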

rcannood commented 5 years ago

Hey James,

I'm assuming this issue is solved. If not, feel free to reply to this issue again and I'll reopen it.

Robrecht