HectorRDB / condiments

Trajectory inference across multiple conditions with condiments: differential topology, progression, differentiation, and expression
https://hectorrdb.github.io/condiments/

Question on tutorial "Analysis of the fibrosis dataset" #19

Closed france-hub closed 1 year ago

france-hub commented 1 year ago

Hello,

Thanks for all of your packages for trajectory inference; they help a lot! I am writing not to report an actual issue but to ask a question about one of your tutorials.

I have a T-cell dataset, and running Slingshot on it identifies two main lineages. This is the piece of code I used:

sceCD8 <- as.SingleCellExperiment(CD8, assay = "RNA")
sds <- slingshot(sceCD8, clusterLabels = clusterLabels, 
                  allow.breaks = TRUE, stretch = 2, reducedDim = "UMAP", start.clus = "Naive") 
sds <- SlingshotDataSet(sds)

Then I would like to plot the two lineages I obtain on the same UMAP. To do so, I followed the tutorial https://hectorrdb.github.io/condimentsPaper/articles/Fibrosis.html and ran:

df <- bind_cols(
  as.data.frame(reducedDim(sds, "UMAP")),
  slingPseudotime(sds) %>% as.data.frame() %>%
    dplyr::rename_with(paste0, "_pst", .cols = everything()),
  slingCurveWeights(sds) %>% as.data.frame(),
) %>%
  mutate(Lineage1_pst = if_else(is.na(Lineage1_pst), 0, Lineage1_pst),
         Lineage2_pst = if_else(is.na(Lineage2_pst), 0, Lineage2_pst),
         pst = if_else(Lineage1 > Lineage2, Lineage1_pst, Lineage2_pst),
         pst = max(pst) - pst)

Now, if I plot according to the two lineages:

# Color cells by pseudotime along lineage 1
ggplot(df, aes(UMAP_1, UMAP_2)) +
  geom_point(aes(color = Lineage1_pst),
             alpha = 0.5) +
  scale_colour_viridis_c() +
  theme_minimal() + labs(colour = "Pseudotime")

# Color cells by pseudotime along lineage 2
ggplot(df, aes(UMAP_1, UMAP_2)) +
  geom_point(aes(color = Lineage2_pst),
             alpha = 0.5) +
  scale_colour_viridis_c() +
  theme_minimal() + labs(colour = "Pseudotime")

I get: [image: lin]

And it makes sense. However if I do:

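# `curves` is not defined above; presumably it comes from
# slingCurves(sds, as.df = TRUE), as in the tutorial.
ggplot(df, aes(x = UMAP_1, y = UMAP_2)) +
  geom_point(size = .7, aes(col = pst)) +
  scale_color_viridis_c() +
  labs(col = "Pseudotime") +
  geom_path(data = curves %>% arrange(Order),
            aes(group = Lineage), col = "black", arrow = arrow(), lineend = "round", size = 1.5) +
  # Apply the complete theme first so it does not reset the legend settings
  theme_minimal() +
  theme(legend.position = c(.15, .35),
        legend.background = element_blank())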

The pseudotime seems to be inverted: [image: traj]

The pseudotime is "correct" and makes sense biologically if I exclude the last line of your code: pst = max(pst) - pst

Can you help me? I don't fully understand what this line of code does, or whether I can safely exclude it in my case.

Thanks, and apologies for the long post.

Best, Francesco

HectorRDB commented 1 year ago

Hi @france-hub, no problem. In the specific setting of the fibrosis dataset, we don't have cells diverging from one initial state into two, but cells converging from two states into one. Slingshot is not designed to handle this by default, so I simply reversed the pseudotimes.
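To make the reversal concrete, here is a minimal sketch on made-up pseudotime values (the vector is purely illustrative and not taken from the fibrosis data). Subtracting each value from the maximum flips the ordering, so the cell that was latest becomes the earliest:

```r
# Hypothetical pseudotime values for four cells (illustration only)
pst <- c(0, 2.5, 4, 10)

# Same operation as the last line of the mutate() call: reverse the pseudotime
pst_rev <- max(pst) - pst
pst_rev          # values: 10, 7.5, 6, 0

# The ordering of the cells is exactly reversed
order(pst)       # 1 2 3 4
order(pst_rev)   # 4 3 2 1
```

If the trajectory already starts at the correct root (for example, via `start.clus`), this step would flip a pseudotime that was already oriented correctly.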

Note that this should have no impact on the results, since all functions and tests are agnostic to affine transformations of the pseudotimes (reversing them with max(pst) - pst is one such transformation).

In your case, you can indeed delete that line.
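As a rough illustration of this invariance, the sketch below compares simulated pseudotimes from two hypothetical conditions before and after reversal. A two-sample Kolmogorov-Smirnov test from base R is used purely as a stand-in for a test on pseudotime distributions; it is an assumption for illustration, not a claim about what condiments runs internally, and all data are simulated.

```r
set.seed(1)

# Simulated pseudotimes for two hypothetical conditions (not real data)
pst_a <- rexp(200, rate = 1)
pst_b <- rexp(200, rate = 0.7)

# Reverse both with a shared maximum, mirroring pst = max(pst) - pst
pst_max <- max(c(pst_a, pst_b))
rev_a <- pst_max - pst_a
rev_b <- pst_max - pst_b

# Compare the two conditions before and after the reversal
ks_orig <- ks.test(pst_a, pst_b)
ks_rev  <- ks.test(rev_a, rev_b)

# The test statistic is unchanged by the reversal (up to floating point)
same_stat <- isTRUE(all.equal(unname(ks_orig$statistic),
                              unname(ks_rev$statistic)))
same_stat
```

The statistic measures the largest gap between the two empirical distributions, and that gap does not depend on the direction or scale of the pseudotime axis.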

france-hub commented 1 year ago

Thank you very much!