While updating 1000 Genomes data, I discovered that map-using-rsid maps two different rsids to the same rsid. For example:
rs566297287 rs368250985
rs534967437 rs368250985
Because the order of operations for updating the plink files is delete rsids, update rsids, coord update, then chromosome update, we get an error after updating the bim file with the updated rsids txt file:
Error: Duplicate ID 'rs368250985'.
Possible solution: keep track of the rsids being updated, and if an rsid would be updated to a target rsid that has already been seen, delete it (keep one copy).
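A minimal sketch of that idea, assuming the update list is available as (old rsid, new rsid) pairs. The function name `split_duplicate_targets` and the in-memory list are hypothetical and not part of the existing tool. It keeps the first pair seen for each target rsid and collects the other old rsids so they can be added to the delete list, which already runs before the rsid update.

```python
def split_duplicate_targets(pairs):
    """Split (old_rsid, new_rsid) pairs into pairs to keep and old rsids to delete.

    The first pair seen for each new rsid is kept; any later pair that maps
    to the same new rsid has its old rsid reported for deletion instead.
    """
    seen_targets = set()
    kept = []
    to_delete = []
    for old_id, new_id in pairs:
        if new_id in seen_targets:
            # Another rsid is already being updated to this target.
            to_delete.append(old_id)
        else:
            seen_targets.add(new_id)
            kept.append((old_id, new_id))
    return kept, to_delete


# The two pairs from the example above.
pairs = [
    ("rs566297287", "rs368250985"),
    ("rs534967437", "rs368250985"),
]
kept, to_delete = split_duplicate_targets(pairs)
print(kept)       # pairs that are safe to write to the update file
print(to_delete)  # old rsids to append to the delete list
```

This only catches collisions within the update list itself. It does not check whether a target rsid is already present, unchanged, in the bim file, which would need a separate pass.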