SigProfilerMatrixGenerator creates mutational matrices for all types of somatic mutations. It allows restricting the generated mutations to specific parts of the genome (e.g., the exome or regions from a custom BED file). The tool integrates with other SigProfiler tools.
BSD 2-Clause "Simplified" License
101 stars, 37 forks
mysterious hyphens when processing INDELs from ICGC data #159
Hello,
Why are hyphens added to ref and mut when the other functions don't do anything similar? This breaks downstream because the hyphens are added again in MutationMatrixGenerator.py (lines 1176-1179), and you can then get a KeyError at line 1617, revcompl(type_sequence), because the '-' character is not in the revcompl map.
I fixed this by commenting out the lines in convert_input_to_simple_files.py, but can someone explain whether this will have unintended consequences?
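To make the failure mode concrete, here is a minimal sketch of what I think is happening. It assumes the reverse-complement helper is a plain dictionary lookup over nucleotide characters; `COMPLEMENT`, `revcompl_sketch`, and `strip_placeholder` are hypothetical stand-ins I wrote for illustration, not the library's actual code.

```python
# Hypothetical stand-in for a dictionary-based reverse complement;
# this is NOT the code in MutationMatrixGenerator.py.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def revcompl_sketch(seq):
    """Reverse-complement by dictionary lookup.

    Raises KeyError for any character missing from COMPLEMENT, such as '-'.
    """
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def strip_placeholder(allele):
    """Drop '-' placeholders so only nucleotide characters reach the lookup."""
    return allele.replace("-", "")


# An allele that still carries a hyphen placeholder triggers the KeyError.
try:
    revcompl_sketch("-AT")
    failed = False
except KeyError:
    failed = True

print("KeyError raised:", failed)

# Once the placeholder is removed, the lookup succeeds.
print(revcompl_sketch(strip_placeholder("-AT")))
```

If the real helper works like this, any sequence that still contains a hyphen when it reaches the lookup will fail, which would be consistent with the KeyError I am seeing.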
Thanks for reaching out again about the issue you encountered with ICGC input files. It would be a great help if you could please provide an input file to reproduce the issue you identified. Thanks!
https://github.com/AlexandrovLab/SigProfilerMatrixGenerator/blob/f945199230a4fc0671d90a7873b079930a84d227/SigProfilerMatrixGenerator/scripts/convert_input_to_simple_files.py#L332C10-L332C10
thanks, Marc