thomasvangurp / epiGBS

Code for working with epiGBS data
MIT License

Spiking pattern in read coverage methylation_calling.py #26

Closed MaartenPostuma closed 4 years ago

MaartenPostuma commented 4 years ago

The following line of code in methylation_calling.py reverses the digits of the read counts (i.e. 12 becomes 21, 13 becomes 31, etc.), because the whole observation string is reversed with obs[::-1] before the counts are parsed:

[[out_line.append(int(obs.split(',')[0])) for nt, obs in zip('TGCA', obs[::-1].split(':'))
          if nt in 'GA'] for obs in split_line[5:]]

This causes the spikes in coverage at 11, 21, 31, etc. It can be fixed by reversing the extracted count back before converting it to an integer:

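        for obs in split_line[5:]:
            # obs[::-1] reverses the nucleotide order, but also the digits of every count
            for nt, rev_obs in zip('TGCA', obs[::-1].split(':')):
                if nt in 'GA':
                    # reverse the digits back before converting to int
                    out_line.append(int(rev_obs.split(',')[0][::-1]))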
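
To make the effect easy to reproduce, here is a minimal sketch that runs the original and the fixed parsing on a single observation string. The sample string and the function names are made up for illustration. The only assumption about the format is what the code above implies: four colon-separated entries in A:C:G:T order, each made of comma-separated counts. The third function is a possible alternative, not code from the repository, that reverses the order of the entries instead of the characters of the string.

```python
def counts_buggy(obs):
    # Original behaviour: reversing the string also reverses each count's digits.
    return [int(field.split(',')[0])
            for nt, field in zip('TGCA', obs[::-1].split(':'))
            if nt in 'GA']


def counts_fixed(obs):
    # Proposed fix: reverse the extracted count back before int().
    return [int(field.split(',')[0][::-1])
            for nt, field in zip('TGCA', obs[::-1].split(':'))
            if nt in 'GA']


def counts_split_first(obs):
    # Hypothetical alternative: split first, then reverse the list of entries,
    # so the digits are never reversed. The last comma-separated field of each
    # entry is the one the code above ends up reading.
    return [int(field.split(',')[-1])
            for nt, field in zip('TGCA', reversed(obs.split(':')))
            if nt in 'GA']


# Made-up observation string in A:C:G:T order (illustrative only).
sample = "5,12:0,0:2,13:1,30"

print(counts_buggy(sample))        # [31, 21]
print(counts_fixed(sample))        # [13, 12]
print(counts_split_first(sample))  # [13, 12]
```

Single-digit counts are unaffected by the bug, which is consistent with the spikes only showing up from 11 upwards.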