Closed by ArtPoon 10 years ago
Looks like a frameshift slipped into the sample consensus for RT at the remap stage. Investigating.
This sample has a single-bp deletion of RT 2861A in nearly all reads covering this interval, which leads to a frameshift. This is based on the preliminary map, so it seems to be real. Going to the raw data, grepping for this interval with the deletion returns 14,878 reads; grepping for it without the deletion returns nothing.
Proposed solution: add deletions to the consensus sequence in pileup_to_conseq() in remap.py, then cull codon deletions (3 gaps in a row) before returning the sequence.
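A rough sketch of that idea, not the actual remap.py code. The input format (one Counter of base/gap counts per reference position) and both function names are hypothetical stand-ins for whatever pileup_to_conseq() really parses. It also assumes that gaps other than an exact run of three stay in the sequence as '-', which the proposal above does not specify.

```python
import re
from collections import Counter


def consensus_with_deletions(columns):
    """Pick the most frequent symbol per position; '-' counts as a deletion.

    `columns` is a hypothetical simplified pileup: one Counter per reference
    position, mapping a base (A/C/G/T) or '-' to the number of reads.
    """
    return ''.join(col.most_common(1)[0][0] for col in columns)


def cull_codon_deletions(conseq):
    """Remove runs of exactly three gaps (an in-frame codon deletion).

    The lookarounds make sure the run is not part of a longer gap, so a
    single-base deletion or a 4-gap run is left untouched.
    """
    return re.sub(r'(?<!-)---(?!-)', '', conseq)


if __name__ == '__main__':
    cols = [Counter({'A': 5}), Counter({'-': 4, 'C': 1}), Counter({'G': 5})]
    print(cull_codon_deletions(consensus_with_deletions(cols)))
```

Using lookarounds instead of capture groups means the replacement string needs no group references at all.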
Regex substitution fails when no groups are matched:
2014-08-27 14:36:40.812886 - [ERROR] conseqs[refname] = pileup_to_conseq(f, consensus_q_cutoff)
2014-08-27 14:36:40.813419 - [ERROR] File "./remap.py", line 397, in pileup_to_conseq
2014-08-27 14:36:40.813650 - [ERROR] conseq = re.sub(pat, r'\g<1>\g<3>', conseq)
2014-08-27 14:36:40.813864 - [ERROR] File "/usr/local/lib/python2.7/re.py", line 151, in sub
2014-08-27 14:36:40.814067 - [ERROR] return _compile(pattern, flags).sub(repl, string, count)
2014-08-27 14:36:40.814278 - [ERROR] File "/usr/local/lib/python2.7/re.py", line 275, in filter
2014-08-27 14:36:40.814474 - [ERROR] return sre_parse.expand_template(template, match)
2014-08-27 14:36:40.814669 - [ERROR] File "/usr/local/lib/python2.7/sre_parse.py", line 789, in expand_template
2014-08-27 14:36:40.814879 - [ERROR] raise error, "invalid group reference"
2014-08-27 14:36:40.815084 - [ERROR] sre_constants.error: invalid group reference
Forgot to add capture groups for the [ACGT] bases enclosing the codon deletion. Fixed in next commit.
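For reference, a minimal reproduction of this class of error and the fix. The two patterns are guesses at the shape of `pat`, which is not shown in this thread; only the replacement template r'\g<1>\g<3>' is taken from the traceback. On Python 3 the same mistake also raises re.error, though the message and the point at which it is raised may differ from the 2.7 traceback above.

```python
import re

# Hypothetical patterns; the real `pat` in remap.py is not shown here.
PAT_NO_GROUPS = r'[ACGT]---[ACGT]'          # no capture groups
PAT_WITH_GROUPS = r'([ACGT])(---)([ACGT])'  # groups 1 and 3 are the flanking bases

TEMPLATE = r'\g<1>\g<3>'  # replacement template from the traceback


def cull(conseq, pat):
    """Replace base + codon deletion + base with just the two bases."""
    return re.sub(pat, TEMPLATE, conseq)


if __name__ == '__main__':
    try:
        cull('ACG---TTA', PAT_NO_GROUPS)
    except re.error as e:
        # The template refers to groups the pattern does not define.
        print('failed:', e)

    print(cull('ACG---TTA', PAT_WITH_GROUPS))
```

One caveat with the grouped form: because each match consumes the flanking bases, two codon deletions separated by a single base (e.g. A---C---G) would only have the first one removed in one pass.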
Looks like another failed alignment: