Cloufield / gwaslab

A Python package for handling and visualizing GWAS summary statistics. https://cloufield.github.io/gwaslab/
GNU General Public License v3.0

Regional plot: unable to plot variants correctly. #76

Status: Closed (closed by sakuramodokich 7 months ago)

sakuramodokich commented 7 months ago

Hi, I have an issue when trying to make a regional plot with my own LD VCF file: the variants seem to be incorrectly plotted in the bottom-left corner. The call is:

mysumstats.plot_mqq(mode="r", build="38", region=(7,156538803,157538803), vcf_path=my_ld_vcf, save="locuszoom_chr7.pdf")

[image]

The log:

Tue Feb  6 17:25:13 2024 Start to plot manhattan/qq plot with the following basic settings:
Tue Feb  6 17:25:13 2024  -Genomic coordinates version: 38...
Tue Feb  6 17:25:13 2024  -Genome-wide significance level is set to 5e-08 ...
Tue Feb  6 17:25:13 2024  -Raw input contains xxxx variants...
Tue Feb  6 17:25:13 2024  -Plot layout mode is : r
Tue Feb  6 17:25:13 2024  -Region to plot : chr7:156538803-157538803.
Tue Feb  6 17:25:13 2024  -Checking prefix for chromosomes in vcf files...
Tue Feb  6 17:25:13 2024  -No prefix for chromosomes in the VCF files.
Tue Feb  6 17:25:15 2024  -Extract SNPs in region : chr7:156538803-157538803...
Tue Feb  6 17:25:16 2024  -Extract SNPs in specified regions: xxxx
Tue Feb  6 17:25:16 2024 Finished loading specified columns from the sumstats.
Tue Feb  6 17:25:16 2024 Start conversion and sanity check:
Tue Feb  6 17:25:16 2024  -Removed 0 variants with nan in CHR or POS column ...
Tue Feb  6 17:25:16 2024  -Removed 0 variants with CHR <=0...
Tue Feb  6 17:25:16 2024  -Removed 0 variants with nan in P column ...
Tue Feb  6 17:25:16 2024  -Sanity check after conversion: 0 variants with P value outside of (0,1] will be removed...
Tue Feb  6 17:25:16 2024  -Sumstats P values are being converted to -log10(P)...
Tue Feb  6 17:25:16 2024  -Sanity check: 0 na/inf/-inf variants will be removed...
Tue Feb  6 17:25:16 2024  -Maximum -log10(P) values is 2.988823010007124 .
Tue Feb  6 17:25:16 2024 Finished data conversion and sanity check.
Tue Feb  6 17:25:16 2024 Start to load reference genotype...
Tue Feb  6 17:25:16 2024  -reference vcf path : /path/to/my_vcf_file
Tue Feb  6 18:28:16 2024  -Retrieving index...
Tue Feb  6 18:28:16 2024  -Ref variants in the region: 3624
Tue Feb  6 18:28:16 2024  -Matching variants using POS, NEA, EA ...
Tue Feb  6 18:28:16 2024  -Lead SNP not found in reference...
Tue Feb  6 18:28:16 2024 Finished loading reference genotype successfully!
Tue Feb  6 18:28:16 2024 Start to create manhattan plot with xxx variants:
Tue Feb  6 18:28:16 2024  -Extracting lead variant...
Tue Feb  6 18:28:16 2024  -Loading gtf files from:default
INFO:root:Extracted GTF attributes: ['gene_id', 'gene_name', 'gene_biotype']
Tue Feb  6 18:29:15 2024  -plotting gene track..
Tue Feb  6 18:29:15 2024  -Finished plotting gene track..
Tue Feb  6 18:29:16 2024 Finished creating Manhattan plot successfully!
Tue Feb  6 18:29:16 2024  -Skip annotating
Tue Feb  6 18:29:16 2024 Saving plot:
Tue Feb  6 18:29:16 2024  -Saved to locuszoom_chr7.pdf successfully!

The LD VCF file was converted from an unphased bed file with:

plink2 --bfile my_bed_file --recode vcf --out my_vcf_file

Cloufield commented 7 months ago

Hi, would you please try again with the updated version of gwaslab (v3.4.38), or just save the plot as PNG? The problem seems to occur only when saving as PDF in old versions of gwaslab.

Cloufield commented 7 months ago

By the way, your VCF works fine here.

Cloufield commented 7 months ago

Sometimes it is due to an incompatible dpi. Adding figargs={"dpi":72} in .plot_mqq() might work.
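The two workarounds suggested so far (saving as PNG, or passing a dpi through figargs) can be sketched as keyword arguments for the plot_mqq call from the original report. This is only an illustration: the helper name regional_plot_kwargs is hypothetical and not part of gwaslab, and the VCF path is the placeholder from the log. The argument names themselves are the ones quoted in this thread.

```python
def regional_plot_kwargs(vcf_path, save_path, dpi=None):
    """Build keyword arguments for the plot_mqq call quoted in this thread.

    Hypothetical helper for illustration only; it is not part of gwaslab.
    """
    kwargs = {
        "mode": "r",                          # regional plot, as in the report
        "build": "38",                        # genome build from the report
        "region": (7, 156538803, 157538803),  # chr7 region from the report
        "vcf_path": vcf_path,                 # LD reference VCF
        "save": save_path,                    # output format follows the extension
    }
    if dpi is not None:
        # Suggested in this thread as something that "might work" for PDF output.
        kwargs["figargs"] = {"dpi": dpi}
    return kwargs


# Workaround 1: save as PNG instead of PDF.
png_kwargs = regional_plot_kwargs("/path/to/my_vcf_file", "locuszoom_chr7.png")

# Workaround 2: keep PDF output but set dpi to 72.
pdf_kwargs = regional_plot_kwargs("/path/to/my_vcf_file", "locuszoom_chr7.pdf", dpi=72)

# With a loaded sumstats object, the call would then be:
#   mysumstats.plot_mqq(**png_kwargs)
```

Building the arguments separately makes it easy to try both variants against the same region without retyping the call.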

sakuramodokich commented 7 months ago

Saving the plot as PNG solved the problem. Thank you!

sakuramodokich commented 7 months ago

By the way, I have already updated gwaslab to the latest version and set dpi to 72, but the issue persists. The only solution is to save the plot as PNG.

Cloufield commented 7 months ago

It might be related to the matplotlib version. Which version of matplotlib are you using now?

sakuramodokich commented 7 months ago

I am using matplotlib 3.8.0 in Python 3.9.17

Cloufield commented 7 months ago

Thanks! I replicated the error with matplotlib 3.8.2. It seems that the problem has occurred since matplotlib v3.7, while v3.6 works well. For now, if you want to save as PDF, you can downgrade matplotlib to 3.6.3. I will look into the error soon.
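As a rough sketch of that diagnosis, a script could check the installed matplotlib version and fall back to PNG when PDF output is likely to be affected. The function names here are hypothetical, and the 3.7 threshold is taken only from the tentative observation above ("it seems"); later gwaslab or matplotlib releases may behave differently.

```python
import re


def pdf_output_affected(mpl_version):
    """Return True if the version falls in the range reported as problematic here.

    Assumption from this thread only: v3.7 and later misplace variants in PDF
    regional plots, while v3.6 works.
    """
    match = re.match(r"(\d+)\.(\d+)", mpl_version)
    if match is None:
        raise ValueError(f"Unrecognized matplotlib version: {mpl_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (3, 7)


def choose_save_path(stem, mpl_version):
    """Pick a .png path when PDF output is suspected to be affected, else .pdf."""
    extension = ".png" if pdf_output_affected(mpl_version) else ".pdf"
    return stem + extension


# Report the choice for the installed matplotlib, if there is one.
try:
    import matplotlib
    installed_version = matplotlib.__version__
except ImportError:
    installed_version = None

if installed_version is not None:
    print(installed_version, "->", choose_save_path("locuszoom_chr7", installed_version))
```

The resulting path can be passed as the save argument of plot_mqq. Downgrading to matplotlib 3.6.3, as suggested above, remains the alternative when PDF output is required.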