Shicheng-Guo closed this issue 1 year ago
I haven't thought of any good way to do it. Do you have a suggestion?
Okay, I found a good solution for my own case. All my summary statistics are from the UKB-GWS dataset, so I can create my own `sites-unannotated.tsv` rather than having `pheweb sites` generate it. `pheweb sites` would loop over 2000+ summary statistics (WGS-GWAS) files, which takes quite a long time.
I am wondering what will happen if my own `sites-unannotated.tsv` has some extra SNPs, since REGENIE ignores/removes SNPs whose MAF is below the threshold.
Shicheng
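To get a feel for how many extra SNPs are involved, the hand-built site list could be compared against the variants that one REGENIE output actually contains. This is only a rough sketch under assumptions: the site list is taken to be tab-separated with a `chrom`, `pos`, `ref`, `alt` header, the REGENIE file is taken to be whitespace-delimited with `CHROM`, `GENPOS`, `ALLELE0`, `ALLELE1` columns, and `read_variants` / `extra_sites` are hypothetical helper names, not part of pheweb. It only counts the extra sites; it says nothing about how pheweb handles them.

```python
def read_variants(handle, columns, sep=None):
    """Collect (chrom, pos, ref, alt) tuples from a delimited file with a header.

    sep=None splits on any whitespace; sep="\t" splits on tabs only.
    """
    header = handle.readline().rstrip("\n").split(sep)
    idx = [header.index(name) for name in columns]
    variants = set()
    for line in handle:
        fields = line.rstrip("\n").split(sep)
        if len(fields) <= max(idx):
            continue  # skip blank or truncated lines
        chrom, pos, ref, alt = (fields[i] for i in idx)
        variants.add((chrom, int(pos), ref, alt))
    return variants


def extra_sites(sites_handle, regenie_handle):
    """Return sites listed in the site file but absent from the REGENIE output."""
    # Assumed column names; adjust if your files use different headers.
    sites = read_variants(sites_handle, ("chrom", "pos", "ref", "alt"), sep="\t")
    tested = read_variants(regenie_handle, ("CHROM", "GENPOS", "ALLELE0", "ALLELE1"))
    return sites - tested
```

Both arguments are open file handles, so it can be called as `extra_sites(open(sites_path), open(regenie_path))` with whatever paths apply locally.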
Interesting. It looks like all the files in `/generated-by-pheweb/sites` are binary files, which was not the case in older pheweb versions.
-rw-r--r-- 1 sguo2 jan100 104K Sep 6 21:39 cpras-rsids.sqlite3
-rw-r--r-- 1 sguo2 jan100 18K Sep 6 21:39 sites-rsids.tsv
-rw-r--r-- 1 sguo2 jan100 13K Sep 6 21:39 sites-unannotated.tsv
-rw-r--r-- 1 sguo2 jan100 19K Sep 6 21:39 sites.tsv
In some older pheweb versions, `sites-unannotated.tsv` is a plain-text file, which I think we can prepare with our own script.
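A script along those lines might look like the sketch below. It rests on several assumptions that should be checked against the pheweb version in use: that `sites-unannotated.tsv` is a tab-separated file with a `chrom`, `pos`, `ref`, `alt` header sorted by chromosome and position, that the input is a whitespace-delimited REGENIE result with `CHROM`, `GENPOS`, `ALLELE0`, `ALLELE1` columns, and that `ALLELE0` maps to `ref`. `build_sites` and `CHROM_ORDER` are hypothetical names, and chromosome labels are passed through unchanged.

```python
# Assumed chromosome ordering: 1..22, then X, Y, MT; anything else sorts last.
CHROM_ORDER = {c: i for i, c in enumerate([str(n) for n in range(1, 23)] + ["X", "Y", "MT"])}


def build_sites(regenie_handle, out_handle):
    """Write a deduplicated, sorted chrom/pos/ref/alt table; return the row count."""
    header = regenie_handle.readline().split()
    idx = [header.index(name) for name in ("CHROM", "GENPOS", "ALLELE0", "ALLELE1")]

    variants = set()
    for line in regenie_handle:
        fields = line.split()
        if len(fields) <= max(idx):
            continue  # skip blank or truncated lines
        chrom, pos, ref, alt = (fields[i] for i in idx)
        variants.add((chrom, int(pos), ref, alt))

    def sort_key(v):
        chrom, pos, ref, alt = v
        return (CHROM_ORDER.get(chrom, len(CHROM_ORDER)), chrom, pos, ref, alt)

    out_handle.write("chrom\tpos\tref\talt\n")
    ordered = sorted(variants, key=sort_key)
    for chrom, pos, ref, alt in ordered:
        out_handle.write(f"{chrom}\t{pos}\t{ref}\t{alt}\n")
    return len(ordered)
```

Because every phenotype comes from the same variant set, a single unfiltered variant list would in principle be enough as input, instead of looping over all 2000+ files.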
We can also try to move this step onto a Slurm cluster to make it faster.