ImmuneDynamics / Spectre

A computational toolkit in R for the integration, exploration, and analysis of high-dimensional single-cell cytometry and imaging data.
https://immunedynamics.github.io/spectre/
MIT License

how to repel labels in make.colour.plot and make sure its not cut off #197

Open hannbann19 opened 2 months ago

hannbann19 commented 2 months ago

hi there, i've been trying to use ggrepel to stop labels from overlapping, but unfortunately i keep getting errors saying there are too many overlaps. i set max.overlaps = Inf, but it didn't generate any plot. is there a way to avoid overlapping text? thanks in advance!

this has been my code.


`make.colour.plot(tmp_down, "UMAP_X", "UMAP_Y", "Population", col.type = 'factor', plot.width = 35, add.label = TRUE, plot.height = 35, dot.size = 3, title = "General Leukocyte Panel UMAP") +
geom_label_repel(data = tmp_down, mapping = aes(label = "Population"))`

SamGG commented 2 months ago

Try giving the plot more room with + xlim(c(-15, 20)) + ylim(c(-15, 20)), change force (> 1) or force_pull (could be negative), or make the labels smaller (size = 1). In the end, this is quite tricky, IMHO.

hannbann19 commented 2 months ago

thanks for the reply, i'm a tad confused regarding force_pull. what does it do? i haven't seen an argument for label size within make.colour.plot

edit: forgot to mention i tried +xlim(c(-15,20))+ylim(c(-15,20)) but it squished everything into the centre and caused more overcrowding

SamGG commented 2 months ago

force_pull is an attraction force to the point, but when negative it acts as a repulsion.

force: Force of repulsion between overlapping text labels. Defaults to 1.

force_pull: Force of attraction between a text label and its corresponding data point. Defaults to 1.

size could be added to geom_label_repel().

xlim: that's why I said it's tricky; try with one of the force parameters.

you can try the nudge_x/nudge_y parameters: they move the labels apart from the points, but that does not solve the crowding in my hands.
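As a rough illustration of the parameters mentioned above, here is a minimal sketch on a made-up table of twelve crowded labels. The `toy` data frame and every value chosen are assumptions for demonstration only, not settings tuned for the data in this thread; `size`, `force`, `force_pull`, `max.overlaps`, `min.segment.length` and `seed` are arguments of ggrepel's `geom_label_repel()`.

```r
library(ggplot2)
library(ggrepel)

# Hypothetical toy data: one row per label, deliberately crowded together
set.seed(1)
toy <- data.frame(
  x = rnorm(12, sd = 0.3),
  y = rnorm(12, sd = 0.3),
  label = paste("Population", 1:12)
)

p <- ggplot(toy, aes(x, y, label = label)) +
  geom_point() +
  geom_label_repel(
    size = 3,               # smaller label text
    force = 2,              # stronger repulsion between overlapping labels
    force_pull = 0.5,       # weaker attraction of each label to its point
    max.overlaps = Inf,     # never drop a label because of overlaps
    min.segment.length = 0, # always draw the connecting segment
    seed = 42               # make the label layout reproducible
  )
print(p)
```

With only a handful of rows the repulsion has room to work; the same settings are unlikely to help if the layer receives one row per cell.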

hannbann19 commented 2 months ago

the overlapping of text is my main issue. when i add geom_label_repel, it says i have too many unlabelled points. i think for some reason, instead of taking the centroids, it's taken all of my data points (about 48,000 of them). could you guide me on where to go from here? many thanks


> q<- make.colour.plot(tmp_down, "UMAP_X", "UMAP_Y", "Population", col.type = 'factor', plot.width = 35, add.label=TRUE, plot.height =35, dot.size=1,title = "General Leukocyte Panel UMAP") 
> q+geom_text_repel()
Error in `geom_text_repel()`:
! Problem while setting up geom.
ℹ Error occurred in the 4th layer.
Caused by error in `compute_geom_1()`:
! `geom_text_repel()` requires the following missing aesthetics: label
Run `rlang::last_trace()` to see where the error occurred.
> q+geom_text_repel(aes(label="Population"))
Warning message:
ggrepel: 48727 unlabeled data points (too many overlaps). Consider increasing max.overlaps

SamGG commented 2 months ago

The make.colour.plot function plots all the cells. From your initial plot, I thought it plotted cluster names only. So the errors you show are expected: we cannot label 50k points. The code you are trying to mimic is below, and your aim is to replace geom_label() with geom_label_repel(). I think Thomas has already tried this. https://github.com/ImmuneDynamics/Spectre/blob/fc072a92258252130e059fc2221b3f17e8b906fc/R/make.colour.plot.R#L441-L468
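To see why every cell ended up with a label, note that a label layer added to the plot gets one row per cell, and that `aes(label = "Population")` maps the literal string rather than the column. A minimal sketch, assuming a hypothetical two-row `cells` table in place of tmp_down and plain `geom_text()` in place of the ggrepel geom:

```r
library(ggplot2)

# Hypothetical two-cell table standing in for tmp_down
cells <- data.frame(
  UMAP_X = c(0, 1),
  UMAP_Y = c(0, 1),
  Population = c("B cells", "T cells")
)

base <- ggplot(cells, aes(UMAP_X, UMAP_Y))

# Quoted: every row is labelled with the literal string "Population"
quoted <- base + geom_text(aes(label = "Population"))

# Unquoted: every row is labelled with its value in the Population column
unquoted <- base + geom_text(aes(label = Population))

# layer_data() returns the data a layer will actually draw, one row per cell
layer_data(quoted)$label
layer_data(unquoted)$label
```

Either way the layer still has as many rows as there are cells, which is why the labels need their own one-row-per-population table.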

So, the code should look like below. Hope this helps, but Tom or Givanna will answer better than me.

# run as usual, without plotting labels
library(ggplot2) # for geom_point(), geom_label() and guides()
q <- make.colour.plot(tmp_down, "UMAP_X", "UMAP_Y", "Population", col.type = 'factor', plot.width = 35, add.label = FALSE, plot.height = 35, dot.size = 1, title = "General Leukocyte Panel UMAP")
# prepare the hack
dat <- tmp_down
x.axis <- "UMAP_X"
y.axis <- "UMAP_Y"
col.axis <- "Population"
# execute: one centroid (median position) per population
labels <- sort(unique(dat[[col.axis]]))
centroidsDf <- data.frame(
  centroidX = tapply(dat[[x.axis]], dat[[col.axis]], median),
  centroidY = tapply(dat[[y.axis]], dat[[col.axis]], median),
  centroidCol = labels)
## Add labels
p <- q + geom_point(data = centroidsDf, aes(x = centroidX, y = centroidY), col = "black", size = 2)
p <- p + geom_label(data = centroidsDf, hjust = 0, aes(x = centroidX, y = centroidY, label = centroidCol, alpha = 0.5), col = "black", fontface = "bold")
p <- p + guides(alpha = "none")
print(p)

hannbann19 commented 2 months ago

Thank you so much! That was just what I was looking for, and it worked perfectly. I have one more question about a minor detail: since the centroid labels are plotted separately from the UMAP "Population" clusters, there's no way for me to match the colour of the labels to the legend, except manually. Is that correct?
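One possible approach, sketched here on a plain ggplot2 scatter rather than on the output of make.colour.plot, is to map the label colour to the same column as the points so that both layers share one colour scale. The `cells` table, its three populations and the ggrepel settings are all made up for illustration. Whether this carries over unchanged to a plot returned by make.colour.plot depends on how that function sets up its colour scale, which is an untested assumption here.

```r
library(ggplot2)
library(ggrepel)

# Hypothetical stand-in for tmp_down: 300 cells in three made-up populations
set.seed(1)
cells <- data.frame(
  UMAP_X = c(rnorm(100, 0), rnorm(100, 3), rnorm(100, 1.5)),
  UMAP_Y = c(rnorm(100, 0), rnorm(100, 0), rnorm(100, 2.5)),
  Population = rep(c("B cells", "Monocytes", "T cells"), each = 100)
)

# One row per population: the median position, as in the centroid code above
centroids <- aggregate(cbind(UMAP_X, UMAP_Y) ~ Population,
                       data = cells, FUN = median)

p <- ggplot(cells, aes(UMAP_X, UMAP_Y, colour = Population)) +
  geom_point(size = 1) +
  # The label layer inherits x, y and colour, so it uses the points' scale
  geom_label_repel(
    data = centroids,
    aes(label = Population),
    fontface = "bold",
    min.segment.length = 0, # always draw the connecting segment
    seed = 42,              # make the label layout reproducible
    show.legend = FALSE     # keep the label glyph out of the legend keys
  )
print(p)
```

Because the labels are drawn from the three-row `centroids` table, ggrepel only has three labels to place, and their colours follow the legend without any manual matching.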