Open jielab opened 3 days ago
How did you originally create this "pfile chrXY"?
I originally used gfetch to download the UKB raw genotype data and then used plink2 --bed --bim --fam --make-bed to generate the ChrXY data.
Best regards, Jie
Does the original .bim file have chrY or chromosome numbered 24? Both represent chromosome Y.
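One quick way to check is to tally the first column of the .bim file, which holds the chromosome code. The snippet below is only a minimal sketch: the function name `count_chromosome_codes` and the file name in the usage comment are illustrative assumptions, not part of any tool mentioned in this thread. In PLINK's numeric coding, 24 is Y and 25 is XY (the pseudo-autosomal region), so a file containing only XY or 25 holds no Y-specific variants.

```python
from collections import Counter


def count_chromosome_codes(bim_path):
    """Tally the chromosome code (first column) of a PLINK .bim file."""
    counts = Counter()
    with open(bim_path) as handle:
        for line in handle:
            fields = line.split()
            if fields:  # skip blank lines
                counts[fields[0]] += 1
    return counts


# Hypothetical usage (the file name is an assumption for illustration):
# for code, n in count_chromosome_codes("chrXY.bim").most_common():
#     print(code, n)
```

If the tally shows only XY or 25 and no Y or 24, the file carries pseudo-autosomal variants rather than Y-specific ones.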
Please see the screenshot below.
The genotyped data has only 1357 variants, and all are labelled XY. The imputed data has 45907 variants; all except 1 are labelled XY.
Best regards, Jie
I don't have access to UKBB, so I am not sure how gfetch works. Could you show the command you used to download the data?
The UKBB genotype data is NOT downloadable anymore. gfetch is simply a UKBB-provided tool for downloading large data files, much like FTP.
Anyway, the full list of SNPs (including those for chromosome X, XY) were downloadable from this link https://biobank.ndph.ox.ac.uk/showcase/ukb/auxdata/ukb_snp_bim.tar, as written on this page https://biobank.ndph.ox.ac.uk/showcase/refer.cgi?id=1963.
For your convenience, I also posted chrY.bim and chrXY.bim on my GitHub: https://github.com/jielab/001/tree/master/jie.
Can you please take a look at this SNP file and kindly let me know if Yhaplo could infer phylogeny from this data?
Thanks!
Jie
The tar archive contains ukb_snp_chrY_v2.bim, which has only 691 Y SNPs. I am not sure that is enough.
But they also have chrXY. Also, the imputed data has tens of thousands of SNPs for ChrX and ChrXY.
It would be great if you guys are interested in applying Yhaplo to UKBB data.
Best regards, Jie
Dear David:
Please refer to my post at PLINK users group: https://groups.google.com/g/plink2-users/c/Xvt895jb48w
It seems that there is no real ChrY data in the UK Biobank genotype dataset, and therefore there is no way to run the Yhaplo program on it, correct?
I ran the following command on the UK Biobank ChrXY male data anyway: yhaplo -i chrXY-males.vcf.gz -o jie.
There is NO error message, and I got the output files as listed at this link https://github.com/jielab/001/tree/master/jie.
I was expecting to get a phylogenetic tree dataset for all the male samples that have genotype data, but none of the output .awk files included such information.
So, please kindly advise how to run yhaplo on UK Biobank genetic data.
Thank you & best regards, Jie