Hello @majorkazer,
I recommend updating dyno and all its dependencies. I made a few changes to dynwrap so that it now prints which prior information is being used:
devtools::install_github("dynverse/dyno", force = TRUE, dependencies = TRUE)
Do the explanations below answer your questions?
Kind regards, Robrecht
> Is there a key to know which methods require what data a priori?
Yes, there is, although it could be presented more clearly. The dynmethods::methods object contains information on all TI methods currently implemented in dyno. However, its input and output columns are list columns, which need to be processed first to make them legible.
library(dyno)
library(tidyverse)
data("methods", package = "dynmethods")
methods %>% select(name, input, output)
# A tibble: 57 x 3
   name                 input      output
   <chr>                <list>     <list>
 1 Angle                <list [2]> <list [2]>
 2 CALISTA              <list [2]> <list [2]>
 3 CellRouter           <list [2]> <list [2]>
 4 CellTrails           <list [2]> <list [2]>
 5 cellTree with gibbs  <list [3]> <list [2]>
 6 cellTree with maptpx <list [3]> <list [2]>
 7 cellTree with vem    <list [3]> <list [2]>
 8 Component 1          <list [2]> <list [2]>
 9 DPT                  <list [3]> <list [2]>
10 ElPiGraph cycle      <list [2]> <list [2]>
# ... with 47 more rows
A method can mark prior information either as 'required' or as 'optional'. I extract this information as follows:
methods2 <-
  methods %>%
  mutate(
    required = map_chr(input, ~ paste0(.$required, collapse = ", ")),
    optional = map_chr(input, ~ paste0(.$optional, collapse = ", "))
  ) %>%
  select(id, name, required, optional)
methods2
# A tibble: 57 x 4
   id              name                 required         optional
   <chr>           <chr>                <chr>            <chr>
 1 angle           Angle                expression       ""
 2 calista         CALISTA              expression       ""
 3 cellrouter      CellRouter           counts, start_id ""
 4 celltrails      CellTrails           expression       ""
 5 celltree_gibbs  cellTree with gibbs  expression       start_id, groups_id
 6 celltree_maptpx cellTree with maptpx expression       start_id, groups_id
 7 celltree_vem    cellTree with vem    expression       start_id, groups_id
 8 comp1           Component 1          expression       ""
 9 dpt             DPT                  expression       start_id, features_id
10 elpicycle       ElPiGraph cycle      expression       ""
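To make the flattening step easier to follow without installing anything, here is a minimal base R sketch of the same idea. The `input` list below is a hand-written stand-in for the list column, with three entries copied from the table above; it assumes each entry holds `required` and `optional` character vectors, as the `.$required` and `.$optional` accessors suggest. The real object may carry more fields.

```r
# Toy stand-in for the `input` list column of dynmethods::methods.
# The entries mirror three rows of the table above (assumed structure).
input <- list(
  angle      = list(required = "expression", optional = character(0)),
  cellrouter = list(required = c("counts", "start_id"), optional = character(0)),
  dpt        = list(required = "expression", optional = c("start_id", "features_id"))
)

# Collapse each character vector into one comma-separated string.
# This is the base R counterpart of
#   map_chr(input, ~ paste0(.$required, collapse = ", "))
collapse_field <- function(entry, field) paste0(entry[[field]], collapse = ", ")

flat <- data.frame(
  id = names(input),
  required = vapply(input, collapse_field, character(1), field = "required"),
  optional = vapply(input, collapse_field, character(1), field = "optional"),
  row.names = NULL,
  stringsAsFactors = FALSE
)

print(flat)
```

An empty vector collapses to an empty string, which is why methods without optional priors show `""` in the table.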
> It is unclear at what step you add these to the model.
You can add prior information using dynwrap, before the trajectory inference. I will be using an example dataset from the SCORPIUS package for this:
data("ginhoux", package = "SCORPIUS")
dataset <-
  wrap_data(
    id = "ginhoux",
    cell_ids = rownames(ginhoux$expression)
  ) %>%
  add_expression(
    # no counts data is available, so approximate it by inverting log2(x + 1)
    counts = round(2 ^ ginhoux$expression - 1),
    expression = ginhoux$expression
  ) %>%
  add_prior_information(
    start_id = "SRR1558845"
  )
You can check which methods require a start cell as follows:
methods2 %>% filter(grepl("start_id", required))
# A tibble: 9 x 4
  id         name       required                                optional
  <chr>      <chr>      <chr>                                   <chr>
1 cellrouter CellRouter counts, start_id                        ""
2 fateid     FateID     expression, end_id, start_id, groups_id ""
3 paga       PAGA       counts, start_id                        groups_id
4 scoup      SCOUP      expression, groups_id, start_id, end_n  ""
5 slicer     SLICER     expression, start_id                    features_id, end_id
6 topslam    topslam    expression, start_id                    ""
7 urd        URD        counts, start_id                        ""
8 wanderlust Wanderlust counts, start_id                        features_id
9 wishbone   Wishbone   counts, start_id                        features_id, end_n
For example, we can run PAGA to infer a trajectory:
traj1 <- infer_trajectory(dataset = dataset, method = ti_paga(), verbose = TRUE)
plot_dimred(traj1)
Executing 'paga' on 'ginhoux'
With parameters: list()
And inputs: counts, start_id
...
Alternatively, you could use one of the methods that can optionally use a given piece of prior information:
methods2 %>% filter(grepl("start_id", optional))
# A tibble: 8 x 4
  id                  name                 required          optional
  <chr>               <chr>                <chr>             <chr>
1 celltree_gibbs      cellTree with gibbs  expression        start_id, groups_id
2 celltree_maptpx     cellTree with maptpx expression        start_id, groups_id
3 celltree_vem        cellTree with vem    expression        start_id, groups_id
4 dpt                 DPT                  expression        start_id, features_id
5 merlot              MERLoT               expression, end_n start_id
6 projected_dpt       Projected DPT        expression        start_id, features_id
7 projected_slingshot Projected Slingshot  counts            start_id, end_id
8 slingshot           Slingshot            counts            start_id, end_id
In this case, optional prior information is not passed to the method by default. You have to request it explicitly with give_priors:
traj2 <- infer_trajectory(dataset = dataset, method = ti_slingshot(), verbose = TRUE)
plot_dimred(traj2)
Executing 'slingshot' on 'ginhoux'
With parameters: list()
And inputs: counts
...
traj3 <- infer_trajectory(dataset = dataset, method = ti_slingshot(), give_priors = "start_id", verbose = TRUE)
plot_dimred(traj3)
Executing 'slingshot' on 'ginhoux'
With parameters: list()
And inputs: counts, start_id
...
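As a rough way to answer "which methods can I run with what I have?", the required and optional columns can also be compared against the inputs your dataset provides. The sketch below uses base R and a hand-copied subset of the tables above; `method_inputs`, `missing_inputs` and `available` are hypothetical names for illustration and are not part of dyno.

```r
# Required and optional inputs for a few methods, copied from the tables above.
method_inputs <- list(
  paga      = list(required = c("counts", "start_id"), optional = "groups_id"),
  slingshot = list(required = "counts", optional = c("start_id", "end_id")),
  merlot    = list(required = c("expression", "end_n"), optional = "start_id"),
  fateid    = list(
    required = c("expression", "end_id", "start_id", "groups_id"),
    optional = character(0)
  )
)

# Hypothetical helper: required inputs that the dataset does not provide.
missing_inputs <- function(method, available) {
  setdiff(method$required, available)
}

# Inputs provided by the example dataset: counts, expression and a start cell.
available <- c("counts", "expression", "start_id")

# Methods whose required inputs are all available.
runnable <- names(Filter(
  function(m) length(missing_inputs(m, available)) == 0,
  method_inputs
))

# For runnable methods, the optional priors that could be passed via give_priors.
usable_optional <- lapply(
  method_inputs[runnable],
  function(m) intersect(m$optional, available)
)

print(runnable)
print(usable_optional)
```

Comparing whole names with `setdiff` and `intersect` also avoids partial matches that a substring search such as `grepl` could produce if two prior names ever overlapped.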
Dear rcannood,
Thank you for the detailed response! These notes are quite helpful.
I am having another issue with hdf5. Would you like me to start a new issue?
Sam
Glad I could be of help :)
Perhaps it would indeed be best to start a new issue, in case someone else has a similar issue at some point.
I'd like to select some marker genes from my dataset to run trajectory inference on. Is there any way to do this after wrapping? My workaround so far has been to just select the marker columns from my counts and expression matrices prior to wrapping.
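For reference, the workaround described above might look like the following base R sketch. The matrices, gene names and marker list are made up for illustration, and it assumes cells are in rows and genes in columns, as in the `cell_ids = rownames(...)` example earlier in the thread.

```r
# Toy matrices: cells in rows, genes in columns (hypothetical names and values).
set.seed(1)
counts <- matrix(
  rpois(20, lambda = 5),
  nrow = 4,
  dimnames = list(paste0("cell", 1:4), paste0("gene", 1:5))
)
expression <- log2(counts + 1)

# Hypothetical marker list; gene9 is deliberately absent from the data.
markers <- c("gene2", "gene4", "gene9")

# Keep only markers that exist, so the subsetting does not fail on unknown names.
markers_present <- intersect(markers, colnames(counts))

# drop = FALSE keeps the result a matrix even if a single marker remains.
counts_sub <- counts[, markers_present, drop = FALSE]
expression_sub <- expression[, markers_present, drop = FALSE]

print(dim(counts_sub))
```

The subsetted matrices could then be passed to `wrap_data()` and `add_expression()` in the same way as the full matrices.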
Given that several of the methods available through dyno require root cells, or can make use of time point data, it is unclear at what step you add these to the model. Should these always be added after running infer_trajectories? Is there a key to know which methods require what data a priori?