What
When VCF files are normalized, a variant can end up with multiple IDs concatenated with a semicolon (;). This leads to problems with the #Uploaded_variation column in VEP. This PR regenerates the IDs so that each variant gets a unique ID.
Related to Issue #76, specifically this comment: https://github.com/PMBio/deeprvat/issues/76#issuecomment-2139008285
Testing
Tested locally on example data
Test scenarios
Run pipeline locally
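For reviewers, the ID regeneration described under What can be illustrated with a minimal sketch. This is not the code in this PR. The function name `regenerate_ids` and the `CHROM:POS:REF:ALT` ID scheme are assumptions chosen for illustration, and the sketch assumes multi-allelic sites were already split during normalization so that each record carries a single ALT allele.

```python
import io


def regenerate_ids(lines):
    """Yield VCF lines with the ID column replaced by CHROM:POS:REF:ALT.

    Hypothetical helper for illustration only; the ID scheme is an assumption.
    """
    for line in lines:
        # Header lines and blank lines are passed through unchanged.
        if line.startswith("#") or not line.strip():
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _old_id, ref, alt = fields[:5]
        # Overwrite any semicolon-joined IDs with a single derived ID.
        fields[2] = f"{chrom}:{pos}:{ref}:{alt}"
        yield "\t".join(fields) + "\n"


# Two normalized records that share the same semicolon-joined ID.
example = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t100\trs1;rs2\tA\tG\t.\t.\t.\n"
    "chr1\t100\trs1;rs2\tA\tT\t.\t.\t.\n"
)
result = list(regenerate_ids(io.StringIO(example)))
print("".join(result), end="")
```

With IDs derived this way, the two records above no longer share `rs1;rs2`. The IDs are unique only as long as the file contains no duplicate records. On real data the same effect is commonly achieved with `bcftools annotate --set-id`.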