hoffmangroup / segway

Application for semi-automated genomic annotation.
http://segway.hoffmanlab.org/
GNU General Public License v2.0

Allow GMTK to use NAN for observations #96

Open EricR86 opened 7 years ago

EricR86 commented 7 years ago

Original report (BitBucket issue) by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).


Segway should avoid filling in unnecessary 0s for missing observations and should instead allow GMTK to use NaN for observations by enabling the necessary option.

EricR86 commented 7 years ago

Original comment by Rachel Chan (Bitbucket: rcwchan).


I think the filling in with 0s is done to facilitate the downsample_add function in observations.py; i.e., [3 3 NaN NaN NaN] at resolution 5 should be downsampled to [3]. So Segway fills in the NaNs with 0s, sums the window ([3 3 0 0 0] gives 6), then divides by the number of non-missing data points (2) and also weights the result by this number. So it becomes [6]/2 = [3] with weight 2. To incorporate this change we'd have to change the downsample_add function too, I think.
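
To make the arithmetic above concrete, here is a minimal NumPy sketch of the zero-fill downsampling as described in this comment, next to a NaN-aware variant that never writes zeros into the data. This is not Segway's actual downsample_add; the function names, signatures, and the choice to pad a trailing partial window with NaN are all assumptions made for illustration.

```python
import numpy as np


def _to_windows(values, resolution):
    """Reshape a 1-D track into rows of `resolution` positions.

    Hypothetical helper: a trailing partial window is padded with NaN.
    """
    values = np.asarray(values, dtype=float)
    pad = (-len(values)) % resolution
    if pad:
        values = np.concatenate([values, np.full(pad, np.nan)])
    return values.reshape(-1, resolution)


def _mean_with_weight(sums, counts):
    """Divide window sums by non-missing counts; empty windows stay NaN."""
    means = np.full(len(counts), np.nan)
    np.divide(sums, counts, out=means, where=counts > 0)
    return means, counts


def downsample_zero_fill(values, resolution):
    """Zero-fill approach: replace NaN with 0, sum, divide by the count."""
    windows = _to_windows(values, resolution)
    present = ~np.isnan(windows)
    filled = np.where(present, windows, 0.0)  # NaN -> 0 before summing
    return _mean_with_weight(filled.sum(axis=1), present.sum(axis=1))


def downsample_nan_aware(values, resolution):
    """NaN-aware approach: np.nansum skips NaN, so no zero fill is needed."""
    windows = _to_windows(values, resolution)
    counts = np.sum(~np.isnan(windows), axis=1)
    return _mean_with_weight(np.nansum(windows, axis=1), counts)


if __name__ == "__main__":
    track = [3, 3, np.nan, np.nan, np.nan]
    print(downsample_zero_fill(track, 5))
    print(downsample_nan_aware(track, 5))
```

In this sketch both variants return the same means and weights; the difference is that the NaN-aware one leaves a fully missing window as NaN with weight 0 without ever storing a 0 in the observations, which is the property the issue asks for.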