DillonHammill / CytoExploreR

Interactive Cytometry Data Analysis

FEATURE DEMO: Colour Points Based Expression of Third Parameter #64

Closed DillonHammill closed 3 years ago

DillonHammill commented 4 years ago

CytoExploreR v1.0.8 comes with the ability to colour points in a 2-D scatter plot based on expression of a third parameter. This feature is commonly used for identifying well-separated populations in a dimension-reduced space, but with some additional modifications it can be a powerful tool to locate populations in any 2-D space without any prior gating. For this demonstration we will use the ungated Activation dataset. Below are the steps required to prepare the data:

# Required packages
library(CytoExploreRData)
library(CytoExploreR)

# Activation GatingSet 
gs <- GatingSet(Activation)

# Compensation
gs <- cyto_compensate(gs)

# Transformations
gs <- cyto_transform(gs)

cyto_plot()

The first thing to note is that the channels have all been appropriately transformed. This is important as it will affect the resulting colour gradient for the third parameter. Let's say that we want to ensure that our initial FSC-A/SSC-A gate encompasses all the T Cells (Va2+ Cells). Normally we would look for a nicely formed lymphocyte population in FSC-A/SSC-A without even considering marker expression. Instead, we can now tell cyto_plot() to colour the points based on expression of Va2 so that we can see where this population lies in the FSC-A/SSC-A parameters. To do this, we simply supply the name of the marker/channel to the point_col argument:

cyto_plot(gs[1:4],
          parent = "root",
          channels = c("FSC-A", "SSC-A"),
          point_col = "Va2")

forward-gating

cyto_gate_draw()

Pretty cool, right? Now it is clear where our T Cells are, as they form a nicely defined cluster. We can also see our Dead Cells, which form a broader population to the left. As you can see, cyto_plot() plots events with the highest expression last so that these events are easy to locate. Since cyto_gate_draw() accepts all cyto_plot() arguments, we can use this information when we gate:

cyto_gate_draw(gs,
               parent = "root",
               alias = "Cells",
               channels = c("FSC-A", "SSC-A"),
               point_col = "Va2",
               gatingTemplate = "Activation-gatingTemplate.csv")

forward_gating
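The plotting order mentioned above (highest expression drawn last) can be illustrated with a small base R sketch. This is not how cyto_plot() is implemented internally; the simulated data, the blue-to-red placeholder colours and every variable name here are assumptions made purely for illustration.

```r
set.seed(1)

# Simulated events: two scatter parameters and a third "marker" value
n <- 500
fsc <- rnorm(n, mean = 100, sd = 20)
ssc <- rnorm(n, mean = 50, sd = 10)
marker <- runif(n)

# Sort so that the highest-expressing events are drawn last (on top)
ord <- order(marker)

# Bin marker values into 50 colours on a placeholder blue-to-red scale
pal <- colorRampPalette(c("blue", "red"))(50)
cols <- pal[cut(marker, breaks = 50, labels = FALSE)]

# Draw the points in order of increasing marker expression
plot(fsc[ord], ssc[ord], col = cols[ord], pch = 16,
     xlab = "FSC-A", ylab = "SSC-A")
```

Without the reordering step, bright events can end up hidden underneath dim ones in dense regions of the plot.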

Caveats

The colour scale depends on the total range of the data supplied to cyto_plot() in the indicated channel. The lower end of the range is assigned the first colour in point_col_scale and the upper end is assigned the last colour in point_col_scale. This works perfectly if you supply data that contains both negative and positive events in every parameter, but if you don't, the colour scale will look weird. For example, if we plot the CD8 T Cells and try to colour the points based on CD8 expression, the lowest expression in that population will be coloured blue and the highest coloured red. This will make it look like you have some CD8- CD8 T Cells, which is obviously not the case. Similarly, if we plot samples without OVA antigen and try to colour the points based on CD69 expression, it will look like we have some CD69+ cells.
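To make the caveat concrete, here is a minimal base R sketch of how the same values receive different colours depending on the limits used for the scale. The helper map_to_colours(), the colour names and the CD8 values are all hypothetical and are not taken from CytoExploreR.

```r
# Hypothetical helper (not part of CytoExploreR): assign each value a colour
# according to its position within `limits`
map_to_colours <- function(x,
                           limits = range(x),
                           scale = c("blue", "green", "yellow", "red"),
                           n = 50) {
  pal <- colorRampPalette(scale)(n)
  # Rescale to [0, 1] relative to the limits, clamping values outside them
  pos <- (x - limits[1]) / (limits[2] - limits[1])
  pos <- pmin(pmax(pos, 0), 1)
  pal[1 + floor(pos * (n - 1))]
}

# Made-up transformed CD8 values for an already gated CD8+ population
cd8_pos <- c(2.8, 3.0, 3.4, 3.9)

# Limits taken from the subset itself: the dimmest CD8+ event is pure blue
uncalibrated <- map_to_colours(cd8_pos)

# Limits covering negative and positive events (assumed to be 0 to 4 here)
calibrated <- map_to_colours(cd8_pos, limits = c(0, 4))
```

With the subset-derived limits the dimmest CD8+ event sits at the very bottom of the scale, whereas with the wider limits every event in the population falls in the upper part of the scale.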

cyto_calibrate()

To overcome this issue I have written the cyto_calibrate() function, which can be used prior to plotting to tell cyto_plot() the full range of values for each parameter. All you need to do is supply enough samples to encompass the full range for each parameter. Supplying all the samples is overkill; we could just supply an unactivated and an activated sample. cyto_calibrate() will compute the axes ranges for each parameter, and this information will be used in any subsequent cyto_plot() calls to ensure accurate colour scales when only a subset of the samples is supplied. cyto_calibrate() computes the full ranges by default but can also compute lower and upper quantiles to exclude outliers by setting type = "quantile".

cyto_calibrate(gs,
               parent = "root",
               type = "quantile")
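The difference between full-range and quantile-based limits can be sketched in base R as follows. The simulated values and the 1%/99% probabilities are assumptions for illustration only; the quantiles that cyto_calibrate() actually uses are not stated in this thread.

```r
set.seed(42)

# Simulated transformed expression values plus two extreme outliers
expr <- c(rnorm(1000, mean = 2, sd = 0.5), -10, 25)

# Full range: the limits are dictated by the two outliers
full_limits <- range(expr)

# Quantile-based limits: the 1% and 99% probabilities are an assumption
quantile_limits <- quantile(expr, probs = c(0.01, 0.99), names = FALSE)

print(full_limits)
print(quantile_limits)
```

When a few outliers stretch the full range, most events are squeezed into a narrow band of the colour scale; quantile-based limits avoid that at the cost of clipping the most extreme events.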

cyto_calibrate_reset()

If you need to adjust the calibration settings, you can simply make a new call to cyto_calibrate() or reset the calibration settings by calling cyto_calibrate_reset().

# Reset calibration settings
cyto_calibrate_reset()

gringer commented 4 years ago

Can you make this default to a viridis colour map? Red/green maps are not very accessible to colourblind people.

https://cran.r-project.org/web/packages/viridis/vignettes/intro-to-viridis.html

DillonHammill commented 4 years ago

@gringer, the default point_col_scale was set very early in the development process (even before CytoRSuite existed) and has been kept purely for historical reasons. Unfortunately, this is also the default colour scale in a variety of widely adopted software for cytometry data analysis (including FlowJo, Cytobank and FCS Express). Fortunately, it is easy to change the colour palette through the point_col_scale argument:

library(viridis)
cyto_plot(gs[1:4],
          parent = "root",
          channels = c("FSC-A", "SSC-A"),
          point_col = "Va2",
          point_col_scale = viridis(10))

viridis

You can also set your own custom defaults for cyto_plot() using cyto_plot_theme(). For example, if you want to use the viridis colour palette for all plotting in CytoExploreR but don't want to have to type it out every time:

cyto_plot_theme(point_col_scale = viridis(10))
cyto_plot(gs[1:4],
          parent = "root",
          channels = c("FSC-A", "SSC-A"))

viridis-2

Any call to cyto_plot() after the cyto_plot_theme() call will inherit your custom theme. Take a look at cyto_plot_theme_args() to see which other arguments are supported for custom themes.

The viridis colour palette does display 2D density nicely, but I think it lacks resolution when the points are added in a layered fashion. This is where some extra colours come in handy. I will play around with some colour palettes to see if I can find something that works well for both purposes.

DillonHammill commented 4 years ago

Not sure if this defeats the purpose of coming up with a more visually appealing colour palette, but here is a nice palette that includes a mixture of the viridis palettes which performs really well in all of my testing:

# Viridis palettes (requires the viridis package)
library(viridis)
viridis_pal <- viridis(10, option = "D")
plasma_pal <- viridis(10, option = "C")

# Remove overlapping colours
viridis_pal <- viridis_pal[-10]
plasma_pal <- plasma_pal[-c(1:4)]

# Combine palettes
custom_pal <- c(viridis_pal, rev(plasma_pal))

# Set custom theme
cyto_plot_theme(point_col_scale = custom_pal)

image

2D density

# 2D density
cyto_plot(gs[1:4],
          parent = "root",
          channels = c("FSC-A", "SSC-A"))

viridis-3

# 2D density
cyto_plot(gs[1:4],
          parent = "T Cells",
          channels = c("CD4", "CD8"))

viridis-4

Expression level

These plots were generated without proper calibration, so it is likely that we can achieve even better resolution.

# Expression level
cyto_plot(gs[1:4],
          parent = "root",
          channels = c("FSC-A", "SSC-A"),
          point_col = "Va2")

viridis-5

# Expression level
cyto_plot(gs[1:4],
          parent = "root",
          channels = c("FSC-A", "SSC-A"),
          point_col = "CD8")

viridis-6

If this palette does improve visualization of the data for users and there is sufficient interest, I would be happy to set it as the new default for cyto_plot() in CytoExploreR v1.0.9.

DillonHammill commented 3 years ago

I have made some major improvements to this feature in CytoExploreR version 2.0.0 (coming soon).

You can now run cyto_calibrate() on a per-channel basis to set different calibration settings for each channel - these will automatically be inherited by cyto_plot() when required.

I also added a new key to display the colour scale annotated with the relevant marker. Here is a sneak peek of this (awesome!) feature: image

DillonHammill commented 3 years ago

Adding a comment about improved support for the above features in CytoExploreR v2.0.0.

cyto_calibrate() and cyto_calibrate_reset() have been renamed to cyto_plot_calibrate() and cyto_plot_calibrate_reset() respectively to join the cyto_plot family of functions.

cyto_plot_calibrate() also receives some substantial updates:

simonuoft commented 2 years ago

Hi, the point_col feature is awesome, especially for checking the level of marker expression in different clusters after dimensionality reduction. I was wondering which function displays the legend, i.e. the color gradient with values, to get an idea of the level of expression in terms of absolute values. It looks like it's feasible since you did it in the plot above, but I can't find the function. Thanks in advance.

DillonHammill commented 2 years ago

@simonuoft, this feature has been added to version 2.0.0 of CytoExploreR which will be released soon. The scale is turned on by default so you don't need to fiddle with any cyto_plot() arguments - unless you want to remove it. Here is a preview: image

simonuoft commented 2 years ago

🤯, that looks awesome. Can't wait for the release of version 2.0.0 to make beautiful plots with my CyTOF data ^^. Thanks for the quick reply.