Closed swvanderlaan closed 10 months ago
Happy to help create the datasets if you can tell me exactly how to do this.
Hi, thanks for the suggestion! The raw 1KG dataset was processed using the code described at https://cloufield.github.io/gwaslab/Reference/#1000-genome-projecthg19. You can easily prepare the dataset for your purpose. I will update the datasets for the populations you mentioned soon (previously the datasets were hosted on Dropbox with limited storage, and we recently upgraded the storage).
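For a custom set of populations, the starting point is a sample list. Below is a minimal sketch, assuming the usual 1KG phase 3 panel layout (tab-separated columns `sample`, `pop`, `super_pop`, `gender`). The function name `select_samples` and the example rows are made up for illustration and are not part of gwaslab or of the real panel file.

```python
import csv
import io


def select_samples(panel_text, super_pops):
    """Return sample IDs whose super_pop is in super_pops, in file order."""
    wanted = set(super_pops)
    reader = csv.DictReader(io.StringIO(panel_text), delimiter="\t")
    return [row["sample"] for row in reader if row["super_pop"] in wanted]


# Made-up rows in the panel layout (not real 1KG sample assignments).
example_panel = (
    "sample\tpop\tsuper_pop\tgender\n"
    "S0001\tGBR\tEUR\tmale\n"
    "S0002\tCHB\tEAS\tfemale\n"
    "S0003\tYRI\tAFR\tfemale\n"
    "S0004\tJPT\tEAS\tmale\n"
)

# Union of two super-populations, e.g. for a combined EUR + EAS reference.
eur_eas = select_samples(example_panel, {"EUR", "EAS"})
print("\n".join(eur_eas))
```

The printed IDs, one per line, could be saved to a file and passed to a sample-subsetting step (for example `bcftools view -S`) before running the rest of the documented processing.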
Thanks. I am making those files now, as per the instructions. It does take a while... :-)
Ok. I got them all, except for the PAN/ALL dataset, i.e. the one including all 1000G variants to use as a reference when doing a trans-ancestry analysis. It takes up a lot of intermediate/temporary space for the last step (merging), which I don't have at the moment. Did you happen to upload that one, by any chance?
Hi, sorry for the late reply. I have updated the reference datasets to include all populations and the PAN datasets. (Indeed, merging the datasets took a very long time and a lot of space...) Since the Dropbox links have changed, I have also updated the parsing. To download the new datasets, please update to v3.4.24.
Please note that the PAN datasets are large (>10GB).
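To check whether an installed copy is already at the version mentioned above, something like the following could be used. It is only a sketch: the helper names `version_tuple` and `needs_update` are made up here, and it assumes the package is installed under the distribution name `gwaslab`.

```python
import re
from importlib.metadata import PackageNotFoundError, version

REQUIRED = "3.4.24"  # version named in this thread


def version_tuple(text):
    """Turn 'v3.4.24' or '3.4.24' into (3, 4, 24), using the first three numbers."""
    return tuple(int(n) for n in re.findall(r"\d+", text)[:3])


def needs_update(installed, required=REQUIRED):
    """True if the installed version is older than the required one."""
    return version_tuple(installed) < version_tuple(required)


try:
    installed = version("gwaslab")  # assumed distribution name
    status = "needs an update" if needs_update(installed) else "is recent enough"
    print(f"gwaslab {installed} {status}")
except PackageNotFoundError:
    print("gwaslab is not installed")
```

If an update is needed, `pip install --upgrade gwaslab` should bring in the newer release.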
Oh wow! Thanks a lot!
Is it possible to also add the other ancestral groups to the download: SAS, AFR, AMR, and ALL or PAN for all the populations (handy for GWAS in multi-ancestry populations)? And/or a function that would make it possible to combine a given set of downloaded references (e.g. EUR and EAS) into one reference for a GWAS with a specific number of ancestral groups?