mglev1n / locusplotr

Create Regional Association Plots
https://mglev1n.github.io/locusplotr

How to add the SNP ID in the plot? #7

Open Huiflorazhan opened 3 weeks ago

Huiflorazhan commented 3 weeks ago

Hi, the reference SNP ID (rsID) of the lead variant is not shown in the plot. Could you please help me with this? Thank you for your time!

mglev1n commented 3 weeks ago

Please provide a reproducible example

Huiflorazhan commented 3 weeks ago

library(locusplotr)
library(tidyverse)
library(ggplot2)
df1 <- read_tsv("../test_cis.tsv.gz")
p1 <- gg_locusplot(
    df1,
    lead_snp = "rs1902708",
    rsid = rsid,
    chrom = CHROM,
    pos = GENPOS,
    ref = REF,
    alt = ALT,
    p_value = P,
    genome_build = "GRCh38",
    population = "EUR",
    plot_genes = FALSE,
    plot_recombination = TRUE
    )
ggsave(filename = "../p1.png", plot = p1, width = 10, height = 5, dpi = 600)

Test file: test_cis.tsv.gz

mglev1n commented 3 weeks ago

Running your code gives a ggrepel warning:

ℹ Extracting LD for 10:96244776_C/T for the region 10:95994841-96494705
ℹ Extracting recombination rates for the region 10:95994776-96494776
Warning message:
Removed 1 row containing missing values or values outside the scale range (`geom_label_repel()`). 
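
To illustrate what this kind of warning means in general, here is a minimal sketch using only ggplot2 and ggrepel. The data frame `toy` and its values are made up for illustration and are not taken from test_cis.tsv.gz, and this is not a claim about what happens inside gg_locusplot. It only shows that a label whose y value falls outside the scale limits is censored to NA, which is the condition that produces a "Removed 1 row" warning when the plot is drawn.

```r
library(ggplot2)
library(ggrepel)

# Toy association data (hypothetical values, for illustration only)
toy <- data.frame(
  pos  = c(1, 2, 3, 4, 5),
  logp = c(1.2, 2.5, 8.0, 3.1, 0.7),
  rsid = c("rsA", "rsB", "rsLead", "rsC", "rsD")
)
lead <- toy[toy$rsid == "rsLead", ]

# Points for every variant, plus a repelled label for the lead variant only
base <- ggplot(toy, aes(pos, logp)) +
  geom_point() +
  geom_label_repel(data = lead, aes(label = rsid))

# Limits that exclude the lead variant: its y value is censored to NA,
# so the label row is dropped (with a warning) when the plot is drawn
clipped <- base + scale_y_continuous(limits = c(0, 5))

# Limits wide enough to contain the lead variant: the label is kept
widened <- base + scale_y_continuous(limits = c(0, 10))

# Inspect the built data of the label layer (layer 2)
label_y_clipped <- layer_data(clipped, 2)$y
label_y_widened <- layer_data(widened, 2)$y
```

Checking `label_y_clipped` and `label_y_widened` shows whether the label survived the scale limits without having to render the plot.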

When running the code without plotting recombination rates, the lead SNP is labeled correctly:

vroom::vroom("test_cis.tsv.gz") %>%
  locusplotr::gg_locusplot(
    lead_snp = "rs1902708",
    rsid = rsid,
    chrom = CHROM,
    pos = GENPOS,
    ref = REF,
    alt = ALT,
    p_value = P,
    genome_build = "GRCh38",
    population = "EUR",
    plot_genes = FALSE,
    plot_recombination = FALSE
  )


I think this is likely due to an issue with the secondary axis. A temporary workaround if you need to show the recombination rate would be manually adjusting the y-axis limit so that it doesn't cut off the variant label. Something like this:

vroom::vroom("test_cis.tsv.gz") %>%
  locusplotr::gg_locusplot(
    lead_snp = "rs1902708",
    rsid = rsid,
    chrom = CHROM,
    pos = GENPOS,
    ref = REF,
    alt = ALT,
    p_value = P,
    genome_build = "GRCh38",
    population = "EUR",
    plot_genes = FALSE,
    plot_recombination = TRUE
  ) +
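  # Widen the primary y-axis so the lead SNP label is not clipped, and
  # relabel the secondary axis so 0-20 on the primary axis reads as 0-100 cM/Mb.
  # scale_y_continuous() and sec_axis() require ggplot2 to be attached.
  scale_y_continuous(
    limits = c(0, 20),
    sec.axis = sec_axis(~ . / 20 * 100, name = "Recombination rate (cM/Mb)")
  )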


I will do some additional debugging to see how this could be resolved more durably.
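
For reference, the secondary-axis formula in the workaround above is a plain linear rescaling of the primary axis. The sketch below spells out that arithmetic. The helper name `to_recomb` is hypothetical, and the constants 20 and 100 are simply the ones used in the workaround; whether gg_locusplot scales the recombination line the same way internally is an assumption, not something confirmed in this thread.

```r
# Hypothetical helper mirroring the formula `~ . / 20 * 100` from the workaround:
# a value on the primary axis (0 to 20) is mapped to a secondary-axis
# label in cM/Mb (0 to 100)
to_recomb <- function(y) y / 20 * 100

# Primary-axis positions and the secondary-axis labels they receive
to_recomb(c(0, 5, 10, 20))  # 0 25 50 100
```

If a different upper limit is chosen for the primary axis, the divisor in the formula would presumably need to change with it so the two axes stay aligned.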

Huiflorazhan commented 3 weeks ago

Thank you for your time! I appreciate it!