pjgreer / ukb-rap-tools

Scripts and workflows for use analyzing UK Biobank data from the DNANexus Research Analysis Platform

analysis #13

Closed TrumanZYX closed 11 months ago

TrumanZYX commented 11 months ago

So should I just follow the previous plan for the analysis?

pjgreer commented 11 months ago

It is still feasible, but since the TOPMED data is on the GRCh38 reference build, you also need to run the liftover scripts from GTfile_liftover, just as if you were analyzing the WES data.

It has been a while since I ran REGENIE on the WES data, but I do not think the ID mismatch is a problem, provided both datasets are on the same reference build. If it is an issue, I would add the following two flags to the last liftover script so that the variant IDs match the format of the TOPMED IDs. These flags were used in the TOPMED QC script (gwas_topmed_plink/11a-gwas-s2-topmed-qc-filter.sh):

--set-missing-var-ids @:#:'\$r':'\$a' --new-id-max-allele-len 99 truncate
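
To make the ID format concrete, here is a minimal Python sketch of what that template is expected to produce. It is not PLINK code, and the function names are hypothetical. It assumes the usual PLINK 2 template meaning (@ is the chromosome code, # is the base-pair position, $r and $a are the reference and alternate alleles), that "truncate" keeps the first 99 characters of a longer allele, and that only variants whose ID is missing (".") get a new ID.

```python
# Illustrative sketch only; names are hypothetical and not part of PLINK or this repo.
MISSING_ID = "."  # assumed placeholder for a missing variant ID


def make_variant_id(chrom, pos, ref, alt, max_allele_len=99):
    """Build an ID shaped like the @:#:$r:$a template.

    Alleles longer than max_allele_len are cut to their first
    max_allele_len characters (assumed 'truncate' behaviour).
    """
    ref_part = ref[:max_allele_len]
    alt_part = alt[:max_allele_len]
    return f"{chrom}:{pos}:{ref_part}:{alt_part}"


def fill_missing_id(current_id, chrom, pos, ref, alt, max_allele_len=99):
    """Return a templated ID only when the current ID is missing."""
    if current_id == MISSING_ID:
        return make_variant_id(chrom, pos, ref, alt, max_allele_len)
    return current_id


if __name__ == "__main__":
    # A SNP with a missing ID gets a chrom:pos:ref:alt style ID.
    print(fill_missing_id(".", "1", 12345, "A", "G"))
    # A variant that already has an ID is left unchanged.
    print(fill_missing_id("rs123", "1", 12345, "A", "G"))
```

If the two datasets still fail to match after this step, the chromosome naming (for example "1" versus "chr1") would be the first thing to compare, since it is part of the ID.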