neurogenomics / MAGMA_Celltyping

Find causal cell-types underlying complex trait genetics
https://neurogenomics.github.io/MAGMA_Celltyping

get_genomeLocFile error #100

Closed: malosreet closed this issue 2 years ago

malosreet commented 2 years ago

Hello!

I am just getting started with MAGMA_Celltyping and am not familiar with GWAS-related analysis. I am running into an error when using map_snps_to_genes.

First I used import_sumstats from MungeSumstats to get appropriately formatted GWAS summary statistics from several studies on Open GWAS. Then I am trying to run map_snps_to_genes but I get the following error on step 1 of the MAGMA analysis:

ERROR - reading gene location file: too few values on line 1:
        line: {

When I look inside the NCBI37.3.gene.loc file, this is what I see:

{
  "message": "Bad credentials",
  "documentation_url": "https://docs.github.com/rest"
}

I think that this is happening because the get_genomeLocFile function is calling get_data, which is calling pb_download to get the required file from the MAGMA_Celltyping repository. But, there is an authentication error and the download fails.
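
For anyone wanting to confirm the same diagnosis, checking the first line of the downloaded file is enough. This is only a minimal sketch: `looks_like_api_error` is a hypothetical helper (it is not part of MAGMA_Celltyping or piggyback), and the demo file simply reproduces the content shown above.

```r
# Hypothetical helper: TRUE if the file starts with "{", i.e. it looks like a
# JSON error body from the GitHub API rather than a gene location table.
looks_like_api_error <- function(path) {
    first_line <- readLines(path, n = 1, warn = FALSE)
    length(first_line) == 1 && startsWith(trimws(first_line), "{")
}

# Demo file reproducing the malformed NCBI37.3.gene.loc content from above.
demo_file <- tempfile(fileext = ".gene.loc")
writeLines(c("{",
             "  \"message\": \"Bad credentials\",",
             "  \"documentation_url\": \"https://docs.github.com/rest\"",
             "}"), demo_file)

looks_like_api_error(demo_file)
```

If the helper flags the file, deleting it before the next run avoids reusing the bad copy.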

Perhaps there is something simple I can change in my set-up to fix the problem.

Any advice is appreciated!

Malosree

bschilder commented 2 years ago

Thanks for letting me know about this @malosreet. Can you confirm that you have a stable internet connection? It's possible that there was a connection problem the first time you tried to download the file, and that the malformed file was then found and reused in subsequent runs. If you restart your R session (which will empty your tempdir()), does the problem persist?

That said, it's odd that it would write a file with the error message as the document (I'd expect it to simply be empty or not there at all if there was a problem). I'll try adding some checks to ensure the file contains what I expect it to contain (something like this):

[Screenshot 2022-02-10 at 15 18 46]

bschilder commented 2 years ago

I've just pushed some additional checks validating the input (build name) and output (genomeLocFile which should be a table), as well as unit tests. Hoping this might help pinpoint the issue.
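
As a rough illustration of what an output check of this kind could look like (this is not the code that was pushed), the sketch below tests whether a file parses as a table. `is_valid_gene_loc` is a hypothetical name, and the four-column minimum assumes the MAGMA gene location layout of gene ID, chromosome, start and stop. The rows in the demo are toy values, not real annotations.

```r
# Hypothetical validator: TRUE if the file parses as a whitespace-delimited
# table with at least `min_cols` columns and numeric start/stop columns.
is_valid_gene_loc <- function(path, min_cols = 4) {
    tbl <- tryCatch(
        utils::read.table(path, header = FALSE, stringsAsFactors = FALSE,
                          nrows = 100, fill = FALSE,
                          quote = "", comment.char = ""),
        error = function(e) NULL  # ragged or unreadable files end up here
    )
    if (is.null(tbl) || nrow(tbl) == 0 || ncol(tbl) < min_cols) {
        return(FALSE)
    }
    # Assumed layout: columns 3 and 4 hold the start and stop positions.
    is.numeric(tbl[[3]]) && is.numeric(tbl[[4]])
}

# Toy table in the assumed layout (made-up genes and coordinates).
good_file <- tempfile()
writeLines(c("1001\t1\t1000\t2000\t+\tGENE_A",
             "1002\tX\t3000\t4000\t-\tGENE_B"), good_file)

# File holding an API error message instead of a table.
bad_file <- tempfile()
writeLines(c("{", "  \"message\": \"Bad credentials\",", "}"), bad_file)

is_valid_gene_loc(good_file)
is_valid_gene_loc(bad_file)
```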

malosreet commented 2 years ago

Hi @bschilder,

Sorry about the duplicate issue, I think I accidentally posted twice.

I did check that I have a stable internet connection by running the following:

> curl::has_internet()
[1] TRUE

However, when I tried the following:

pb_download("iris2.tsv.gz", 
            repo = "cboettig/piggyback-tests",
            tag = "v0.0.1",
            dest = tempdir())

I got the same error file in my temporary directory. So it must have something to do with configuration at my end. Maybe it has to do with a firewall setting. I am not sure.
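
One thing that may be worth ruling out, although nothing in this thread confirms it: the GitHub API returns "Bad credentials" when a request carries a token it does not accept, so a stale token in the R environment could produce this message. The sketch below only lists which token variables are set. It assumes GITHUB_PAT and GITHUB_TOKEN are the relevant variable names, and `set_token_vars` is a hypothetical helper.

```r
# Assumption: these are the environment variables commonly used to hold a
# GitHub token in R sessions.
token_vars <- c("GITHUB_PAT", "GITHUB_TOKEN")

# Hypothetical helper: names of the token variables that are currently set
# to a non-empty value.
set_token_vars <- function(vars = token_vars) {
    vars[nzchar(Sys.getenv(vars))]
}

set_token_vars()
```

If a variable is listed, it can be cleared for the current session only with Sys.unsetenv() before retrying the download.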

Thanks for looking into it!

bschilder commented 2 years ago

Thanks for the extra info!

Ah, I see. I've not encountered that one before. If you bring it up with the piggyback developers, they might have an idea of what's going on there. Do let me know if the solution involves anything I can implement on my end, though!

Al-Murphy commented 2 years ago

This has come up with other users and seems to be a piggyback download issue for Windows users. @malosreet, what OS are you using? A workaround is to get a copy of the genomeLocFile manually and paste it into the working directory for MAGMA. @bschilder, are there details of where @malosreet can download these files from?

bschilder commented 2 years ago

Good to know, @Al-Murphy! All the files are stored as Assets under Releases here:

https://github.com/neurogenomics/MAGMA_Celltyping/releases

[Screenshot 2022-02-18 at 09 04 28]
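
For the manual workaround, a sketch along these lines may help. It relies on the standard GitHub URL pattern for release assets. `release_asset_url` is a hypothetical helper, the tag is a placeholder that has to be looked up on the Releases page, and the asset is assumed to carry the same name as the NCBI37.3.gene.loc file mentioned earlier in the thread.

```r
# Hypothetical helper: build the download URL of a GitHub release asset.
release_asset_url <- function(repo, tag, file) {
    sprintf("https://github.com/%s/releases/download/%s/%s", repo, tag, file)
}

# "<release-tag>" is a placeholder, not a real tag; the asset name is assumed.
asset_url <- release_asset_url(
    repo = "neurogenomics/MAGMA_Celltyping",
    tag  = "<release-tag>",
    file = "NCBI37.3.gene.loc"
)

# Not run here: needs a real tag and network access.
# utils::download.file(asset_url,
#                      destfile = file.path(getwd(), "NCBI37.3.gene.loc"),
#                      mode = "wb")
```

Downloading through the browser and copying the file into the MAGMA working directory achieves the same result without any R code.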