PacificBiosciences / kineticsTools

Tools for detecting DNA modifications from single molecule, real-time sequencing data
19 stars 21 forks source link

the default value of the argument "maxCoverage" #51

Closed PengNi closed 7 years ago

PengNi commented 7 years ago

Hi all, I was wondering if the default value of the argument "maxCoverage" in ipdSummary.py line 285 is set properly?

The default value of "maxCoverage" is set to be -1.

self.parser.add_argument('--maxCoverage',
                                 dest='maxCoverage',
                                 type=int, default=-1,
                                 help='Maximum coverage to use at each site')

However, if this argument is set to -1 by default, in the following codes in KineticWorker.py line 501,

        # If the user has specified a maximum coverage level to use, enforce it here -- just take the first n reads
        if self.options.maxCoverage is not None:
            maxCov = self.options.maxCoverage
            for x in views:
                d = x['data']
                d = d[0:maxCov]
                x['data'] = d

the d = d[0:maxCov] which becomes d=d[0:-1], will not assign all n values in d but only the first n-1 values.

So Is it reasonable?

PengNi commented 7 years ago

Can somebody help? @nlhepler @dalexander @natechols

nlhepler commented 7 years ago

Hi @PengNi,

You appear to have uncovered a bug! Indeed, we would be losing a single read of coverage using the default values, which I am sure is unintended.

I will draft a patch to fix this, in the meantime please understand that the loss of a single read is unlikely to have a dramatic effect on the outcome.

PengNi commented 7 years ago

@nlhepler Thanks, got that.