Closed jielab closed 5 months ago
detect_ref.py. I recommend starting there.
conf.symlink_to_cache_dir to make a symlink at https://github.com/statgen/pheweb/blob/76f0d0e32ae72e51bc4b259ce4b16edfd653601a/pheweb/load/download_rsids.py#L25 instead of copying the file. Maybe send a PR?
Thanks! Can you please let me know where I could find detect_ref.py?
See https://github.com/statgen/pheweb/tree/master/pheweb/load/detect_ref.py in the pheweb repository (under https://github.com/statgen/pheweb/tree/master/pheweb/load ). Or run detect-ref at the command line.
Hi, guys:
I really like pheweb. After simply running a few short commands such as `pip3 install pheweb` and `pheweb phenolist glob --star-is-phenocode`, everything works magically on my own laptop, and I can explore those cool tables and figures even while I am on a flight.
However, I do have a couple of feature-requests/wonderings. It would be super cool if you guys could agree that addressing some of these might be useful to the broad pheweb community.
The GitHub documentation says that _It needs a column for the reference allele (which must always match the bases on the reference genome that you specified with hg_build_number) and a column for the alternate allele_. I don't know why this is a must, since nobody uses pheweb to run GWAS meta-analysis or two-sample MR, the kinds of analysis where alignment of alleles is needed. These days, GWAS files downloaded from different sources usually have their own ways of specifying effect/non-effect (or reference/alternative, or A1/A2) alleles. If we really must align the alleles in GWAS files to the reference genome, how can we do it correctly and efficiently without going through some complicated GATK procedure? The documentation also says "If you have a MARKER_ID column like 1:234_C/G, that's okay too." Once I have such a MARKER_ID column in my GWAS files, do I still need to split it into separate columns of chr, pos, ref, alt, because those are required columns?
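For readers with the same question, here is a minimal sketch of the two operations discussed above: splitting an ID shaped like `1:234_C/G` into chr, pos, ref and alt, and orienting an effect/other allele pair against a known reference base. None of this is pheweb code. The names `Variant`, `parse_marker_id` and `orient_to_reference` are hypothetical, the reference base is assumed to be looked up elsewhere (for example from a FASTA file), and the sketch assumes that the effect size should end up reported for the alt allele. It does not handle strand flips.

```python
import re
from typing import NamedTuple, Optional, Tuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


# Matches IDs shaped like "1:234_C/G", with an optional "chr" prefix.
_MARKER_RE = re.compile(r"^(?:chr)?([^:]+):(\d+)_([ACGT]+)/([ACGT]+)$", re.IGNORECASE)


def parse_marker_id(marker_id: str) -> Variant:
    """Split a MARKER_ID such as '1:234_C/G' into chrom, pos, ref, alt."""
    m = _MARKER_RE.match(marker_id.strip())
    if m is None:
        raise ValueError(f"unrecognized MARKER_ID: {marker_id!r}")
    chrom, pos, ref, alt = m.groups()
    return Variant(chrom, int(pos), ref.upper(), alt.upper())


def orient_to_reference(
    effect_allele: str, other_allele: str, beta: float, ref_base: str
) -> Optional[Tuple[str, str, float]]:
    """Return (ref, alt, beta) with ref equal to the reference base.

    Assumption: beta is given for effect_allele and should be reported
    for alt. Returns None when neither allele matches the reference
    (possible strand flip or wrong genome build), which needs manual review.
    """
    a1, a2, ref_base = effect_allele.upper(), other_allele.upper(), ref_base.upper()
    if a2 == ref_base:
        return a2, a1, beta    # effect allele is already the alt allele
    if a1 == ref_base:
        return a1, a2, -beta   # effect allele is the reference: swap and flip the sign
    return None
```

Flipping the sign of beta when the alleles are swapped keeps the effect direction consistent. Allele frequencies, if present, would need the matching `1 - af` adjustment.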
I really like the fact that pheweb does NOT require rsid for input GWAS. Instead, it can generate new GWAS files with rsid appended, stored at generated-by-pheweb/pheno_gz/. How can I run this add_rsid module as a standalone script/command? The log shows that https://resources.pheweb.org/rsids-v154-hg38.tsv.gz is downloaded onto my computer when I run pheweb. Is there a way to prevent this file from being downloaded again each time I run it? Or can I specify the path of this file in config.py instead of having it at the default location?
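One way to avoid repeated downloads, in line with the symlink suggestion made in this thread, is to keep a single shared copy of the rsids file and link it into each project. The helper below is only a sketch under that assumption: `link_or_copy` is a hypothetical name, not a pheweb function, and where pheweb actually expects the file is not specified here, so both paths are placeholders for the caller to supply.

```python
import os
import shutil
from pathlib import Path


def link_or_copy(shared_file: Path, dest: Path) -> str:
    """Make `dest` available from an already-downloaded `shared_file`.

    Returns "exists" if dest is already usable, "symlink" if a symlink
    was created, or "copy" if symlinks are unavailable on this system.
    """
    if dest.exists():
        return "exists"            # nothing to do, no new download or copy
    dest.parent.mkdir(parents=True, exist_ok=True)
    if dest.is_symlink():
        dest.unlink()              # remove a dangling symlink before relinking
    try:
        os.symlink(shared_file.resolve(), dest)
        return "symlink"
    except OSError:
        # e.g. a filesystem or platform that does not allow symlinks
        shutil.copy2(shared_file, dest)
        return "copy"
```

Linking costs no extra disk space, which matters for a file of this size. The copy fallback only keeps the helper usable where symlinks are not permitted.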
Let's say that I have 100 GWAS that I would like to process and display in pheweb. The positions in some of them are based on hg18 while others are based on hg38. Is there a way to specify the `hg_build_number =` option twice, once for the files with hg18 positions and once for those with hg38 positions? If not, and I have to liftOver all GWAS to the same hg build first, is there an easy way to do it? I know that I could use the liftOver tool, but these days each GWAS file usually has over 10 million rows...
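For the liftOver route, a common pattern is to stream the positions out as BED lines that carry a row index, run the UCSC liftOver tool on that file, and join the lifted positions back by index. The sketch below covers only the conversion steps and assumes single-base variants given as (chrom, 1-based position). The function names are hypothetical, and nothing here is part of pheweb.

```python
from typing import Iterable, Iterator, Tuple


def to_bed_lines(rows: Iterable[Tuple[str, int]]) -> Iterator[str]:
    """Yield one BED line per (chrom, 1-based pos) without loading everything.

    BED uses 0-based half-open intervals, so position p becomes [p-1, p).
    The 4th column stores the row index so lifted rows can be joined back.
    """
    for i, (chrom, pos) in enumerate(rows):
        if not chrom.startswith("chr"):
            chrom = "chr" + chrom   # UCSC chain files use "chr"-prefixed names
        yield f"{chrom}\t{pos - 1}\t{pos}\t{i}"


def parse_lifted_line(line: str) -> Tuple[int, str, int]:
    """Return (row index, chrom without 'chr', 1-based pos) from a lifted BED line."""
    chrom, _start, end, name = line.rstrip("\n").split("\t")[:4]
    bare_chrom = chrom[3:] if chrom.startswith("chr") else chrom
    return int(name), bare_chrom, int(end)


# Typical invocation of the UCSC tool between the two steps
# (file names are placeholders):
#   liftOver input.bed map.chain lifted.bed unmapped.bed
```

Because both functions work line by line, memory use stays flat even for files with tens of millions of rows. Rows that end up in the unmapped file have to be dropped or handled separately, and alleles may still need re-checking against the new reference after lifting.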
That's all I got.
Thank you very much & best regards, Jie