gymreklab / GangSTR

A tool for profiling long STRs from short reads
GNU General Public License v2.0

Key Error from REF and ALT fields when running DumpSTR on GangSTR vcf file #81

Open BonnieCSE opened 4 years ago

BonnieCSE commented 4 years ago

DumpSTR throws KeyError: 'csf1poatct' when run on a GangSTR VCF file (Platinum Genomes pedigree). I noticed the error was due to strings such as 'csf1poatct', 'd7s820tatc', and 'd8s1179tcta' in the REF and ALT fields at chr5 pos 149455887, chr7 pos 83789542, and chr8 pos 125907115.
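For anyone trying to locate records like these in their own file, here is a minimal sketch that scans VCF lines for REF/ALT alleles containing characters other than nucleotide letters. It is not part of DumpSTR or GangSTR: the function name `find_suspect_alleles` and the allowed alphabet (A, C, G, T, N in either case) are assumptions made for illustration, and the example record is made up to resemble the one reported above.

```python
import re

# Assumption: an ordinary sequence allele uses only these letters (either case).
VALID_ALLELE = re.compile(r"^[ACGTNacgtn]+$")


def find_suspect_alleles(vcf_lines):
    """Return (chrom, pos, [bad alleles]) for records with non-nucleotide REF/ALT."""
    suspects = []
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a complete VCF record
        chrom, pos, ref, alt = fields[0], fields[1], fields[3], fields[4]
        alleles = [ref] + alt.split(",")
        bad = [
            a for a in alleles
            # "." (missing) and symbolic alleles such as <DEL> are left alone.
            if a != "." and not a.startswith("<") and not VALID_ALLELE.match(a)
        ]
        if bad:
            suspects.append((chrom, int(pos), bad))
    return suspects


# Made-up records for illustration only; the first mimics the reported chr5 line.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr5\t149455887\t.\tcsf1poatct\t.\t.\t.\t.",
    "chr1\t1000\t.\tacacac\tacac,acacacac\t.\t.\t.",
]
print(find_suspect_alleles(example))
```

For a compressed file, the lines could be read with `gzip.open(path, "rt")` and passed to the same function.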

nmmsv commented 4 years ago

Hi Bonnie, did we figure this out? I remember you mentioning it before, but I forgot whether it was addressed.

BonnieCSE commented 4 years ago

Hi Nima, I don't think we've figured it out yet. I think the KeyErrors are related to CODIS markers? -Bonnie

nmmsv commented 4 years ago

Oh cool, I remember now. If you have the BED file that produced this issue, can you attach it here? That way I can reproduce the error and take it from there!

BonnieCSE commented 4 years ago

Is it ok if I send the VCF file? I can't seem to find the BED file.

chr5_subset.vcf is a subset of chr5 (it contains one of the 3 lines that cause the error), and dumpSTR throws the error when it is used as the input VCF. plat_merged_errors.vcf contains all 3 of the lines that cause dumpSTR to throw the KeyError.

chr5_subset.vcf.gz plat_merged_errors.vcf.gz

nmmsv commented 4 years ago

Awesome, thanks! I'll check them out later.