hoffmangroup / segway

Application for semi-automated genomic annotation.
http://segway.hoffmanlab.org/
GNU General Public License v2.0

--split-sequences specifies number of GMTK frames and not number of base pairs #56

Closed by EricR86 8 years ago

EricR86 commented 8 years ago

Original report (BitBucket issue) by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).


Currently, in the --help output, --split-sequences is described as follows:

    -S SIZE, --split-sequences=SIZE
                        split up sequences that are larger than SIZE bp
                        (default 2000000)

Here "bp" stands for "base pairs". However, in the code --split-sequences actually specifies the number of GMTK frames, which is not equivalent to the number of base pairs when the resolution is larger than 1:

In segway/run.py line 656, where the options are copied over to Python attributes:

                        ("split_sequences", "max_frames"),

and in segway/run.py line 1231:

        directives["CARD_FRAMEINDEX"] = self.max_frames
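
To make the mismatch concrete, here is a minimal sketch of the arithmetic. It assumes that one GMTK frame covers `resolution` base pairs, which is what the description above implies. The function names are hypothetical and are not part of Segway.

```python
def frames_to_base_pairs(num_frames: int, resolution: int) -> int:
    """Base pairs spanned by num_frames frames.

    Assumption: each frame covers exactly `resolution` base pairs.
    """
    return num_frames * resolution


def base_pairs_to_frames(num_bp: int, resolution: int) -> int:
    """Frames needed to cover num_bp base pairs, rounding up."""
    return -(-num_bp // resolution)  # ceiling division on integers


# Default value taken from the --help text quoted in this report.
DEFAULT_SPLIT_SIZE = 2000000

# At resolution 1 the two units coincide; at resolution 10 a limit of
# 2000000 frames allows sequences ten times longer than 2000000 bp.
span_at_res_1 = frames_to_base_pairs(DEFAULT_SPLIT_SIZE, 1)
span_at_res_10 = frames_to_base_pairs(DEFAULT_SPLIT_SIZE, 10)
```

Under this assumption, a user who reads the help text and expects sequences to be split at 2000000 bp would instead get splits at 20000000 bp when running with a resolution of 10.
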

EricR86 commented 8 years ago

Original comment by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).


I should also note that the default for --split-sequences is defined by MAX_FRAMES:

        group.add_option("-S", "--split-sequences", metavar="SIZE",
                         default=MAX_FRAMES, type=int,
                         help="split up sequences that are larger than SIZE "
                         "bp (default %s)" % MAX_FRAMES)

EricR86 commented 8 years ago

Original comment by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).


Fixed in pull request #44