Cloufield / gwaslab

A Python package for handling and visualizing GWAS summary statistics. https://cloufield.github.io/gwaslab/
GNU General Public License v3.0

About adding highlight label in manhattan plot #1

Open hmutanqilong opened 2 years ago

hmutanqilong commented 2 years ago

Hi! This project is awesome, and I have learned a lot of genetics from your blog and your Zhihu posts. Recently I have been using this new tool to process GWAS summary data and make some plots. I have a suggestion: could you please add a parameter for highlighting custom regions with labels? Sometimes we want to highlight genes rather than SNPs in the Manhattan plot.

Cloufield commented 2 years ago

Hi, hmutanqilong! Indeed, as you pointed out, this parameter can be useful in some cases. I think I will add an option for highlighting based on one of the following: certain values in a pre-annotated column (like ["Gene1","Gene2"] in a "Gene" column), BED files (chr start end), or maybe a list of tuples that define the gene regions, like [("chr1",123,234),("chr2",123,345)]. Actually, I have implemented a lot of updates recently (but I haven't updated the documentation yet). I will think about this and implement your suggestion soon. Thank you so much for your advice!
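The tuple and BED options mentioned in this comment can be illustrated with a minimal, standard-library sketch. It is not gwaslab's implementation: the names `Region`, `in_any_region`, and `parse_bed_lines`, as well as the example variants, are hypothetical and only show how a list of (chr, start, end) regions could be used to pick the variants to highlight.

```python
from typing import Iterable, List, Tuple

# (chromosome, start, end), 1-based and inclusive. This layout is an assumption
# made for the sketch, not a format defined by gwaslab.
Region = Tuple[str, int, int]


def in_any_region(chrom: str, pos: int, regions: Iterable[Region]) -> bool:
    """Return True if (chrom, pos) falls inside at least one region."""
    return any(
        chrom == r_chrom and start <= pos <= end
        for r_chrom, start, end in regions
    )


def parse_bed_lines(lines: Iterable[str]) -> List[Region]:
    """Parse whitespace-separated 'chr start end' lines into regions.

    BED starts are 0-based and the interval is half-open, so the start is
    shifted by one to get 1-based inclusive coordinates.
    """
    regions = []
    for line in lines:
        fields = line.split()
        if len(fields) < 3 or fields[0].startswith("#"):
            continue  # skip blank, malformed, or comment lines
        regions.append((fields[0], int(fields[1]) + 1, int(fields[2])))
    return regions


# Hypothetical regions and variants (ID, chromosome, position).
regions = [("chr1", 123, 234), ("chr2", 123, 345)]
variants = [("rs1", "chr1", 150), ("rs2", "chr1", 500), ("rs3", "chr2", 345)]

highlighted = [
    snp for snp, chrom, pos in variants if in_any_region(chrom, pos, regions)
]
print(highlighted)
```

The resulting list of variant IDs is the kind of input a highlighting option would need, whichever of the three input forms is used to define the regions.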

hmutanqilong commented 2 years ago

I would also like a key-value form, such as {"SNP1": "GENE1", "SNP2": "GENE2"}

Cloufield commented 2 years ago

> I would also like a key-value form, such as {"SNP1": "GENE1", "SNP2": "GENE2"}

Oh, I see. You mean highlighting a region around "SNP1" and then annotating this region with "GENE1"?

hmutanqilong commented 2 years ago

Yeah, sometimes we need to label the lead SNPs with their nearest genes.
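As a rough illustration of what "label the lead SNPs with their nearest genes" means, here is a small standard-library sketch. The gene coordinates, SNP positions, and function names are all made up for the example, and a real annotation would draw gene intervals from a reference such as a GTF file rather than a hand-written list.

```python
from typing import Dict, List, Optional, Tuple

# (name, chromosome, start, end); hypothetical gene intervals for illustration.
Gene = Tuple[str, str, int, int]


def distance_to_gene(pos: int, start: int, end: int) -> int:
    """Distance from a position to a gene interval; 0 if the position is inside."""
    if pos < start:
        return start - pos
    if pos > end:
        return pos - end
    return 0


def nearest_gene(chrom: str, pos: int, genes: List[Gene]) -> Optional[str]:
    """Return the name of the closest gene on the same chromosome, if any."""
    candidates = [
        (distance_to_gene(pos, start, end), name)
        for name, g_chrom, start, end in genes
        if g_chrom == chrom
    ]
    return min(candidates)[1] if candidates else None


genes = [
    ("GENE1", "1", 1000, 2000),
    ("GENE2", "1", 5000, 6000),
    ("GENE3", "2", 100, 900),
]
lead_snps: Dict[str, Tuple[str, int]] = {
    "SNP1": ("1", 2500),
    "SNP2": ("1", 5500),
    "SNP3": ("3", 10),
}

# Key-value labels of the form {"SNP1": "GENE1", ...}, as suggested above.
labels = {
    snp: nearest_gene(chrom, pos, genes)
    for snp, (chrom, pos) in lead_snps.items()
}
print(labels)
```

A SNP on a chromosome with no listed genes gets `None`, so a caller can decide whether to skip the label or fall back to the SNP ID.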

EDISON2022W commented 1 year ago

Hi, may I know which option is for the gene annotation, please? I need to label the nearest genes associated with the significant SNPs. Thanks!

Cloufield commented 1 year ago

Hi, thanks for your question. I have already implemented this function (by integrating pyensembl), so gwaslab can now automatically annotate the nearest gene! I will probably update the pip package and the documentation tomorrow. I will let you know when I have finished.

EDISON2022W commented 1 year ago

Thanks and looking forward to it.

Cloufield commented 1 year ago

Hi, guys. You can now install the latest version of gwaslab with pip install gwaslab==3.3.0 and try annotating variants with gene names or anything else you want. Please see https://github.com/Cloufield/gwaslab/blob/main/examples/plot_mqq.ipynb . Check the parts on anno="GENENAME" and anno_alias (if you are using it for the first time, pyensembl might download some cache data first). I hope this update meets your needs. I will write more detailed documentation soon. Thanks!
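For readers who want a starting point before opening the notebook, the call might look roughly like the sketch below. Only `anno="GENENAME"`, `anno_alias`, and `anno_set` come from this thread; the wrapper name `plot_with_gene_labels`, the assumption that `anno_set` takes a list of variant IDs and `anno_alias` a mapping from variant ID to label, and the file and column names in the commented usage are assumptions that should be checked against the linked notebook.

```python
def plot_with_gene_labels(sumstats, snp_to_label=None, snps_to_annotate=None):
    """Call plot_mqq on a gwaslab Sumstats-like object with gene annotation.

    sumstats         : object exposing a plot_mqq(**kwargs) method
    snp_to_label     : optional dict of variant ID -> custom label (assumed
                       to be what anno_alias expects)
    snps_to_annotate : optional list of variant IDs (assumed to be what
                       anno_set expects)
    """
    kwargs = {"anno": "GENENAME"}  # ask for nearest-gene names as labels
    if snps_to_annotate:
        kwargs["anno_set"] = list(snps_to_annotate)
    if snp_to_label:
        kwargs["anno_alias"] = dict(snp_to_label)
    return sumstats.plot_mqq(**kwargs)


# Hypothetical usage (requires gwaslab; file and column names are placeholders):
#
#   import gwaslab as gl
#   mysumstats = gl.Sumstats("sumstats.tsv.gz", snpid="SNP", chrom="CHR",
#                            pos="POS", p="P")
#   plot_with_gene_labels(mysumstats, snp_to_label={"SNP1": "GENE1"})
```

Keeping the keyword arguments in one small wrapper makes it easy to switch between automatic gene names and hand-picked labels without repeating the plotting call.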

EDISON2022W commented 1 year ago

Thanks and will try it.

EDISON2022W commented 1 year ago

Got an error when fetching and parsing ensembl_hg38_gtf

error: pyo3_runtime.PanicException: Unwrapped panic from Python code

Can I provide the gene list for the plotting?

EDISON2022W commented 1 year ago

Just found anno_set and anno_alias work well for the plotting.

Cloufield commented 1 year ago

> Got an error when fetching and parsing ensembl_hg38_gtf
>
> error: pyo3_runtime.PanicException: Unwrapped panic from Python code
>
> Can I provide the gene list for the plotting?

Hi there,

I just found that this might be an issue related to pyensembl: https://github.com/openvax/pyensembl/issues/279#issue-1568341249 You could check whether pip install gtfparse==1.3.0 fixes the error (it fixed the error on my device).

I will look into the cause. Thanks.

iskandr commented 1 year ago

The latest PyEnsembl (2.2.5) should also fix this -- sorry for introducing the error in the first place.
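To see whether an environment already has one of the two fixes mentioned in this thread (gtfparse pinned to 1.3.0, or PyEnsembl 2.2.5 or later), a quick check along these lines may help. It is only a sketch: the helper names are hypothetical, the version parsing is deliberately simple, and it encodes nothing beyond what the comments above state.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '2.2.5' into (2, 2, 5), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def known_fix_applied(pyensembl_version: Optional[str],
                      gtfparse_version: Optional[str]) -> bool:
    """True if either workaround reported in this thread is in place."""
    if pyensembl_version and parse_version(pyensembl_version) >= (2, 2, 5):
        return True
    if gtfparse_version and parse_version(gtfparse_version) == (1, 3, 0):
        return True
    return False


print(known_fix_applied(installed_version("pyensembl"),
                        installed_version("gtfparse")))
```

If this prints `False`, upgrading pyensembl or pinning gtfparse as described above would be the next thing to try.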

Cloufield commented 1 year ago

Thanks a lot for the updates! I will revise the requirements.