Cloufield / gwaslab

A Python package for handling and visualizing GWAS summary statistics. https://cloufield.github.io/gwaslab/
GNU General Public License v3.0

gl.download_ref(), not clear how to use it #10

Open Delvalle-beep opened 1 year ago

Delvalle-beep commented 1 year ago

Hello! Hope all is well. Could you explain in more detail how to use the reference file to annotate gene names with "GENENAME", please? Is it the same method for regional plots and Manhattan plots? The documentation wasn't very clear to me on this. I tried downloading the file and setting anno='GENENAME', but it didn't work.

Cloufield commented 1 year ago

Hi, if it is just for annotating "GENENAME", gwaslab will automatically download references (gtf) and process the files when you run it for the first time.

For regional plots, you usually need a VCF file for LD calculation. You can check the available references first.

Then use gl.download_ref() to download the VCF you need, for example gl.download_ref("1kg_eas_hg19").
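The download and path lookup can be wrapped in one small helper. This is only a sketch: fetch_reference is a hypothetical name, and it uses nothing beyond the two calls mentioned in this thread, gl.download_ref() and gl.get_path(). Whether a repeated call downloads the file again is not covered here.

```python
def fetch_reference(gl, name="1kg_eas_hg19"):
    """Download a gwaslab reference by keyword and return its local path.

    `gl` is the imported gwaslab module (import gwaslab as gl). It is passed
    in explicitly so this hypothetical helper has no import-time dependency.
    """
    gl.download_ref(name)      # download step described in the reply above
    return gl.get_path(name)   # local path of the downloaded file
```

After `import gwaslab as gl`, a call such as `vcf_path = fetch_reference(gl)` would give the value to pass as vcf_path.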

Finally, create the regional plot. Regional plots and Manhattan plots use the same files to annotate GENENAME. The path to the downloaded file can be obtained using gl.get_path("1kg_eas_hg19"):

mysumstats.plot_mqq(skip=2,
                    mode="r",
                    region=(7,126253550,128253550),
                    anno="GENENAME",
                    vcf_path=gl.get_path("1kg_eas_hg19"))
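When several regions need the same kind of plot, the call above can be put in a loop. A minimal sketch, assuming a sumstats object that exposes plot_mqq() with the arguments used above; plot_regions and its region check are hypothetical additions, not part of gwaslab.

```python
def plot_regions(sumstats, regions, vcf_path):
    """Draw one regional plot per (chromosome, start, end) tuple.

    Hypothetical helper: it only repeats the plot_mqq call shown above.
    """
    for chrom, start, end in regions:
        if start >= end:
            raise ValueError(f"empty region on chromosome {chrom}: {start}-{end}")
        sumstats.plot_mqq(
            skip=2,
            mode="r",
            region=(chrom, start, end),
            anno="GENENAME",    # passed as a string, exactly as in the call above
            vcf_path=vcf_path,
        )
```

For example, `plot_regions(mysumstats, [(7, 126253550, 128253550)], gl.get_path("1kg_eas_hg19"))` would reproduce the single plot above.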

Please let me know if you have more questions.

Delvalle-beep commented 1 year ago

When I put "GENENAME", the name of the gene simply does not appear.

Cloufield commented 1 year ago

Hi, it seems that there were no errors. Could you please show the command you used and also the log? These will help me pinpoint the issue.

Delvalle-beep commented 1 year ago

Just to close this out: I managed to solve it! I'm trying to automate plot generation for my study using your script (: For some reason I couldn't assign the gene name, because the script was automatically treating this variable as a boolean.
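The automation script is not shown in this thread, so the following is only a guess at the kind of coercion involved. In Python, bool("GENENAME") is True, so an option converted with bool loses the string before it reaches the plotting call. The sketch below keeps true/false text as booleans and passes any other text through unchanged; parse_anno and the --anno option are hypothetical names.

```python
import argparse

def parse_anno(text):
    """Turn 'true'/'false' into booleans; leave any other text as a string."""
    lowered = text.strip().lower()
    if lowered in {"true", "false"}:
        return lowered == "true"
    return text

parser = argparse.ArgumentParser()
# type=bool would map every non-empty string, including "GENENAME", to True.
parser.add_argument("--anno", type=parse_anno, default=False)

args = parser.parse_args(["--anno", "GENENAME"])
# args.anno is still the string here and can be passed on as anno=args.anno
```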