jean997 / cause

R package for CAUSE
https://jean997.github.io/cause/

LD pruning #36

Closed giuliapontali closed 1 year ago

giuliapontali commented 1 year ago

Hi, I am running LD pruning:

X_clump <- X %>%
  rename(rsid = snp,
         pval = p1) %>% 
  ieugwasr::ld_clump(dat = .,
                     clump_r2 = r2_thresh,
                     clump_p = pval_thresh,
                     plink_bin = genetics.binaRies::get_plink_binary(), 
                     pop = "EUR")

This command never seems to finish, and I get the following messages:

API: public: http://gwas-api.mrcieu.ac.uk/
Please look at vignettes for options on running this locally if you need to run many instances of this command.
Clumping 1EyzIn, 2084557 variants, using EUR population reference
Server code: 503; Server is possibly experiencing traffic, trying again...
Server code: 503; Server is possibly experiencing traffic, trying again...
Server code: 503; Server is possibly experiencing traffic, trying again...
Server code: 503; Server is possibly experiencing traffic, trying again...
Server code: 503; Server is possibly experiencing traffic, trying again...
Server code: 503; Server is possibly experiencing traffic, trying again...
Server error: 503
Failed to retrieve results from server. See error status message in the returned object and contact the developers if the problem persists.
Removing 2084557 of 2084557 variants due to LD with other variants or absence from LD reference panel

How can I handle this? Thank you

jean997 commented 1 year ago

This means that the IEU server is busy so the function is not able to retrieve the LD reference information. You can download the LD reference to your computer and use it locally. See here for instructions: https://mrcieu.github.io/ieugwasr/articles/local_ld.html
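A minimal sketch of what local clumping can look like once the reference panel has been downloaded and unpacked. The wrapper name `clump_locally` and the path in the usage comment are hypothetical; `bfile` and `plink_bin` are the `ld_clump` arguments that point it at a local reference panel and PLINK binary instead of the API.

```r
# Hypothetical wrapper: clump against a local LD reference instead of the IEU API.
# `bfile_prefix` is the reference panel path without the .bed/.bim/.fam extension.
clump_locally <- function(dat, bfile_prefix, r2 = 0.01, p = 1e-3) {
  # Fail early if the reference panel files are not where we expect them.
  absent <- !file.exists(paste0(bfile_prefix, c(".bed", ".bim", ".fam")))
  if (any(absent)) {
    stop("LD reference files not found for prefix: ", bfile_prefix)
  }
  ieugwasr::ld_clump(
    dat = dat,               # needs columns `rsid` and `pval`
    clump_r2 = r2,
    clump_p = p,
    plink_bin = genetics.binaRies::get_plink_binary(),
    bfile = bfile_prefix
  )
}

# Usage (placeholder path, adjust to your download location):
# X_clump <- clump_locally(X, "/path/to/1kg.v3/EUR")
```

Checking for the reference files first gives a clearer error than letting PLINK fail on a wrong path.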

giuliapontali commented 1 year ago

Hi!

When I run ld_clump, I get this error:

Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
  cannot open file '\AppData\Local\Temp\RtmpAhgtu7\file27ec20805016.clumped': No such file or directory

It seems that the ".clumped" file cannot be written on my computer. Is there any way to solve it?

I would also like to ask whether the CAUSE package is up to date: when I follow the tutorial, the function gwas_merge does not accept pval_cols.

Thank you

jean997 commented 1 year ago

Were there any other messages before the error message you posted? Can you share the command you used that produced this error?

For pval_cols, this option was introduced in version 1.2.0.0320. Can you check your sessionInfo() and confirm you are using the latest version?
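As a quicker alternative to reading through the full sessionInfo() output, the installed version can be compared against the required one directly. This is a small sketch; the helper name `has_min_version` is made up for illustration.

```r
# Hypothetical helper: TRUE if `pkg` is installed at version `min_version` or later.
has_min_version <- function(pkg, min_version) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(FALSE)  # package is not installed at all
  }
  utils::packageVersion(pkg) >= package_version(min_version)
}

# FALSE here means cause is missing or older than the version with pval_cols.
has_min_version("cause", "1.2.0.0320")
```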

giuliapontali commented 1 year ago

The command that I used is:

## Running local LD operation
a <- tophits(id="ieu-a-2", clump=0)
b <- ld_clump(
  dplyr::tibble(rsid=a$rsid, pval=a$p, id=a$id)
) 
genetics.binaRies::get_plink_binary()

colnames(X)[1]=c("rsid")
colnames(X)[8]=c("pval")

ld_clump(dat=X, clump_r2=0.01, clump_p =1e-3, pop="EUR",
  plink_bin = genetics.binaRies::get_plink_binary(), # This does not work. Why?
  bfile = "C:\\Users\\Downloads\\1kg.v3\\1kg.v3\\EUR"
)

And the output is the following:

Clumping 2po6kc, 2043326 variants, using EUR population reference
PLINK v1.90b7 64-bit (16 Jan 2023)             www.cog-genomics.org/plink/1.9/
(C) 2005-2023 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to C:\Users\GPontali\AppData\Local\Temp\RtmpAhgtu7\file27ec53c76a89.log.
Options in effect:
  --bfile C:\Users\Downloads\1kg.v3\1kg.v3\EUR
  --clump C:\Users\AppData\Local\Temp\RtmpAhgtu7\file27ec53c76a89
  --clump-kb 10000
  --clump-p1 0.001
  --clump-r2 0.01
  --out C:\Users\AppData\Local\Temp\RtmpAhgtu7\file27ec53c76a89

32448 MB RAM detected; reserving 16224 MB for main workspace.
8550156 variants loaded from .bim file.
503 people (0 males, 0 females, 503 ambiguous) loaded from .fam.
Ambiguous sex IDs written to
C:\Users\AppData\Local\Temp\RtmpAhgtu7\file27ec53c76a89.nosex .
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 503 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is in [0.9999995, 1).
8550156 variants and 503 people pass filters and QC.
Note: No phenotypes present.
Warning: No significant --clump results.  Skipping.
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
  cannot open file 'C:\Users\AppData\Local\Temp\RtmpAhgtu7\file27ec53c76a89.clumped': No such file or directory

jean997 commented 1 year ago

It looks to me like the file isn't there because of the line

Warning: No significant --clump results.  Skipping.

Did you check that there are some variants with p < 1e-3 in the data? I can't run your code because I don't have your X dataframe, but that is what I would check first.
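A quick way to run that check, sketched on a toy data frame (the `X_toy` values are made up; substitute your own data frame and threshold):

```r
# Toy stand-in for X; in practice use your own data frame.
X_toy <- data.frame(
  rsid = c("rs1", "rs2", "rs3", "rs4"),
  pval = c(5e-8, 2e-4, 0.03, NA)
)
pval_thresh <- 1e-3

# The p-value column must be numeric for the comparison to be meaningful.
stopifnot(is.numeric(X_toy$pval))

# Number of variants below the clumping p-value threshold (NAs ignored).
n_sig <- sum(X_toy$pval < pval_thresh, na.rm = TRUE)
n_sig
```

If the count is zero, the "No significant --clump results" warning is expected. If it is not zero, the problem is more likely in how the variants are matched to the reference panel.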

giuliapontali commented 1 year ago

Yes, I have 10,697 SNPs with p < 1e-3. They also come from different chromosomes, so I would expect to get some SNPs back.

jean997 commented 1 year ago

If you share a subset of your dataframe that you know has some significant variants in it, I can take a look and try to find the problem. This is really an issue with PLINK via the ieugwasr package and is not related to CAUSE at all, so you might have better luck looking at those resources, but I will help if I can.

giuliapontali commented 1 year ago

I am sharing some of the variants from X: test.txt. Another person had the same problem, but the ieugwasr authors never replied.

Thanks a lot for your help

jean997 commented 1 year ago

I think the problem is that the SNPs in your data are identified in chrom-pos format rather than by rsids. According to the documentation, ld_clump requires rsids.
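One way to spot this before clumping is to test the identifier column against an rsid pattern. A minimal sketch with made-up identifiers:

```r
# Toy identifiers mixing rsids and chrom-pos style IDs (made-up values).
ids <- c("rs123", "1:12345", "rs4567", "chr2:98765")

# TRUE where the ID looks like an rsid ("rs" followed by digits only).
is_rsid <- grepl("^rs[0-9]+$", ids)

table(is_rsid)
ids[!is_rsid]   # IDs that would need mapping to rsids before clumping
```

If most or all entries fail the pattern, none of the variants can be matched to the reference panel by name, which would explain PLINK reporting no clump results.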

giuliapontali commented 1 year ago

Thanks a lot. I will get the rsids using the Dintor tool. My GWAS had only chr-pos information.

yaunlinmtemp commented 9 months ago

I solved this same problem. It may occur because your dataset's column names are not "rsid" and "pval". Rename the columns to these two names and the clumping process will work.
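A short sketch of that renaming on a toy data frame, assuming the original columns are called `snp` and `p1` as in the first post. Renaming by name rather than by position avoids relabelling the wrong column if the column order differs.

```r
# Toy data frame using the column names from the first post (`snp`, `p1`).
X_toy <- data.frame(snp = c("rs1", "rs2"), p1 = c(1e-5, 0.2))

# Rename by name, so the result does not depend on column order.
names(X_toy)[names(X_toy) == "snp"] <- "rsid"
names(X_toy)[names(X_toy) == "p1"] <- "pval"

# Confirm the columns ld_clump looks for are present.
stopifnot(all(c("rsid", "pval") %in% names(X_toy)))
```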
