choishingwan / PRS-Tutorial

A tutorial on how to run basic polygenic risk score analysis
MIT License

Correction for removing duplicated SNPs. #23

Closed MCorentin closed 3 years ago

MCorentin commented 3 years ago

The method to remove duplicated SNPs with grep has two issues:

Changed "grep -vf" to "awk '!seen[$3]++'", which prints a line only if its key $3 (column 3 = SNP ID) has not been seen before.
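As a rough illustration of how the awk filter behaves, here is a minimal sketch on a toy file. The file contents, the SNP IDs, and the variable names `tmp` and `dedup` are made up for this example and are not taken from the tutorial data.

```shell
# Toy input (hypothetical data): column 3 holds the SNP ID, and rs2 appears twice.
tmp=$(mktemp)
printf '%s\n' \
  'CHR BP SNP P' \
  '1 100 rs1 0.5' \
  '1 200 rs2 0.1' \
  '1 200 rs2 0.3' \
  '1 300 rs22 0.9' > "$tmp"

# seen[$3]++ evaluates to 0 (false) the first time an ID appears, so
# !seen[$3]++ is true only for the first occurrence, and awk prints that line.
dedup=$(awk '!seen[$3]++' "$tmp")
printf '%s\n' "$dedup"

rm -f "$tmp"
```

On this toy input the header, rs1, the first rs2 line, and rs22 are printed. Note that the first occurrence of a duplicated ID is kept rather than every copy being dropped, and that the array key is the whole field, so rs22 is treated as distinct from rs2.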

MCorentin commented 3 years ago

Yes, the way you wrote it is clearer. There are two typos, though: "aawk" instead of "awk", and $1 instead of $3 (to be consistent with the explanation below).

choishingwan commented 3 years ago

Thanks, that's fixed now.