KasperSkytte / ampvis2

Tools for visualising microbial community amplicon data
https://kasperskytte.github.io/ampvis2/
GNU General Public License v3.0

Ampvis_rarecurve overlap label #101

Closed dikiprawisuda closed 3 years ago

dikiprawisuda commented 3 years ago

Dear developers,

I have successfully made a rarefaction curve from my data with amp_rarecurve. However, when I added labels with ggrepel::geom_label_repel, the result had labels all over the background. I think this is because a label is drawn for every row of object[[data]]. Is there a better way to label each line of the rarecurve? The provided legend is not clear enough.

Thank you for all the help.

KasperSkytte commented 3 years ago

Hi there

Please provide a reproducible example.

dikiprawisuda commented 3 years ago

Hi! I finally succeeded in making a reprex(), after numerous errors along the way. I want to label each individual line by SampleID, something like this image

So I added the following code

Instead, it returns this: [image: reprex_ampvis]

Below is the reprex. Let me know if it is incomplete, as it seems to have been interrupted unintentionally.

library(reprex)
library(ggplot2)
library(ggrepel)
library(ampvis2)

data("AalborgWWTPs")
amp_rarecurve(AalborgWWTPs, facet_by = "Plant") + 
  geom_label_repel(aes(label=SampleID))

Created on 2020-08-28 by the reprex package (v0.3.0)

KasperSkytte commented 3 years ago

Hi there

Thank you for the reprex, it makes it much easier for me to help you. The reason for the many labels is that each line is made up of multiple points, one for every stepsize reads, which is 1000 by default. So you would get 40 labels for the same line if that particular sample had 40,000 reads. A simple solution is to keep only the last row in the data for each sample, done here with data.table. I guess you can take it from here to adjust the position of the labels. You probably only need the y position, so you can set the x aesthetic to Inf to place the labels at the far right.

library(ampvis2)
#> Loading required package: ggplot2
library(magrittr)
library(data.table)

#generate a rarecurve plot
plot <- AalborgWWTPs %>% 
  amp_subset_samples(Year == 2014 & Period == "Summer") %>% 
  amp_rarecurve(facet_by = "Plant")
#> 64 samples and 5655 OTUs have been filtered 
#> Before: 67 samples and 9430 OTUs
#> After: 3 samples and 3775 OTUs

#add labels by extracting last row from data (same as the last point on x) for each group (SampleID)
#to avoid multiple labels for each point on each line (default one per 1000 reads)
head(plot$data)
#>      SampleID        Plant       Date Year Period   Species Reads
#> 1 16SAMP-3913 Aalborg East 2014-07-03 2014 Summer    1.0000     1
#> 2 16SAMP-3913 Aalborg East 2014-07-03 2014 Summer  434.6421  1001
#> 3 16SAMP-3913 Aalborg East 2014-07-03 2014 Summer  660.1347  2001
#> 4 16SAMP-3913 Aalborg East 2014-07-03 2014 Summer  824.2978  3001
#> 5 16SAMP-3913 Aalborg East 2014-07-03 2014 Summer  955.7880  4001
#> 6 16SAMP-3913 Aalborg East 2014-07-03 2014 Summer 1066.5047  5001
labelsData <- data.table(plot$data)[,.SD[.N],by = SampleID]
labelsData
#>       SampleID        Plant       Date Year Period Species Reads
#> 1: 16SAMP-3913 Aalborg East 2014-07-03 2014 Summer    1976 21472
#> 2: 16SAMP-3941 Aalborg West 2014-08-18 2014 Summer    2017 22009
#> 3: 16SAMP-4603 Aalborg East 2014-08-19 2014 Summer    2325 26929
plot + 
  ggrepel::geom_label_repel(data = labelsData,
             aes(label = SampleID,
                 x = Inf))

Created on 2020-09-01 by the reprex package (v0.3.0)
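
For readers who do not use data.table, the same "keep the last row per sample" step can be sketched in base R. The data frame below is a made-up stand-in for plot$data (the values are illustrative, not real amp_rarecurve output), and the name labels_data is only an assumption for this example.

```r
# Toy stand-in for plot$data: one row per rarefaction step per sample.
# All values are invented for illustration.
curve <- data.frame(
  SampleID = rep(c("A", "B"), times = c(4, 3)),
  Reads    = c(1, 1001, 2001, 2500, 1, 1001, 1800),
  Species  = c(1, 40, 60, 65, 1, 35, 50)
)

# Sort by sample and read depth so the deepest step comes last within each sample
curve <- curve[order(curve$SampleID, curve$Reads), ]

# Keep only the last row of each SampleID (one label position per line)
labels_data <- curve[!duplicated(curve$SampleID, fromLast = TRUE), ]
labels_data
```

The resulting data frame has one row per sample and can be passed as the data argument of the label layer in the same way as labelsData above.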

dikiprawisuda commented 3 years ago

Hi, thank you, it works! For my data, though, I need to leave out x = Inf, because it attaches the labels to invisible points on the right-hand edge of the facet frame. Other than that it works!

Thank you very much!
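
A minimal sketch of the variant without x = Inf, for reference. It uses a toy ggplot2 line plot with invented data instead of amp_rarecurve, and assumes that x and y are mapped in the main ggplot() call so that the label layer inherits them. Each label is then anchored at the last real point of its line.

```r
library(ggplot2)
library(ggrepel)

# Toy rarefaction-style data, invented for illustration
curve <- data.frame(
  SampleID = rep(c("A", "B"), times = c(4, 3)),
  Reads    = c(1, 1001, 2001, 2500, 1, 1001, 1800),
  Species  = c(1, 40, 60, 65, 1, 35, 50)
)

# One row per sample: the deepest rarefaction step
curve <- curve[order(curve$SampleID, curve$Reads), ]
labels_data <- curve[!duplicated(curve$SampleID, fromLast = TRUE), ]

# No x = Inf here: the label layer inherits x = Reads and y = Species,
# so each label sits at the end point of its own line
p <- ggplot(curve, aes(x = Reads, y = Species, group = SampleID)) +
  geom_line() +
  geom_label_repel(data = labels_data, aes(label = SampleID))
p
```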