hoffmangroup / segway

Application for semi-automated genomic annotation.
http://segway.hoffmanlab.org/
GNU General Public License v2.0

Segway should get the resolution used for training for identification #63

Open EricR86 opened 8 years ago

EricR86 commented 8 years ago

Original report (BitBucket issue) by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).


Resolution should only need to be specified for training, and identify should not depend on it being specified again. The resolution used for identification needs to match the one used for training, so it should be recovered from the training directory.

EricR86 commented 8 years ago

Original comment by Michael Hoffman (Bitbucket: hoffman, GitHub: michaelmhoffman).


Is this not stored in train.tab? It should be stored there and read back from there.

EricR86 commented 8 years ago

Original comment by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).


@rcwchan I believe you were the first one to get this issue of mismatching resolutions and GMTK complaining about the cardinality of the presence data. Can you figure out how to reproduce this and post it here?

EricR86 commented 8 years ago

Original comment by Rachel Chan (Bitbucket: rcwchan).


@ericr86 I can't seem to reproduce it with simplesemisupervised or simpleseg: Segway doesn't complain when I specify different resolutions for training and identification (which was how I hit the bug last time). At that time, I did not know that I didn't have to specify the resolution when identifying, and I accidentally put res=10 instead of res=1 or something.

Not needing to re-specify the resolution is not mentioned in the documentation either.

From debugging the code, it looks like if a different resolution is specified for identification, it overwrites the one inferred from training. Should there be a check in place?
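
A minimal sketch of what such a check could look like, written as a standalone illustration rather than actual Segway code. It assumes that train.tab is a two-column, tab-separated key/value file and that the resolution is stored under a `resolution` key; both are assumptions, and the names `read_trained_resolution`, `resolve_resolution`, and `ResolutionMismatchError` are hypothetical.

```python
import csv
import io


class ResolutionMismatchError(ValueError):
    """Hypothetical error: identify resolution differs from training."""


def read_trained_resolution(tab_file):
    """Return the resolution recorded in a key/value tab file, or None.

    Assumes one "key<TAB>value" pair per line (an assumption about
    the layout of train.tab, not a documented format).
    """
    for row in csv.reader(tab_file, delimiter="\t"):
        if len(row) >= 2 and row[0] == "resolution":
            return int(row[1])
    return None


def resolve_resolution(trained, specified):
    """Pick the resolution to use for identification.

    - Nothing specified at identify time: fall back to the trained value.
    - Specified and equal to the trained value (or no trained value
      is recorded): use the specified value.
    - Specified but different from the trained value: raise instead of
      silently overwriting the trained value.
    """
    if specified is None:
        return trained
    if trained is not None and specified != trained:
        raise ResolutionMismatchError(
            f"identify resolution {specified} does not match "
            f"training resolution {trained}"
        )
    return specified


# Usage with an in-memory stand-in for train.tab (contents are made up).
example_tab = io.StringIO("num_segs\t4\nresolution\t10\n")
trained = read_trained_resolution(example_tab)
resolution = resolve_resolution(trained, None)  # falls back to the trained value
```

Raising an error is only one option; warning and then using the trained value would also avoid the silent overwrite described above.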

EricR86 commented 8 years ago

Original comment by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).


Yes, this sounds like the bug that's currently in place.