Closed bobular closed 3 years ago
@bobular Is this still an issue? If so, how do you suggest we fix? Looks like if, for non-integer numbers, we (i.e. either Java code or loading team) adjust the min and max to be lower/higher respectively by 0.00001, we will pick up the extra values? If this is the solution, at what stage do you think we should perform the adjustment? Also tagging @steve-fischer-200 since he may have an opinion.
@ryanrdoherty why is leftmost binStart not equal to rangeMin?
As I commented on https://epvb.slack.com/archives/C01894F7P89/p1632869876261800?thread_ts=1632851456.237200&cid=C01894F7P89 - I think this is not a problem with the distribution service. I think there's still an issue with the rounding of variable annotations rangeMin
and rangeMax
This ticket may need revisiting https://github.com/VEuPathDB/EdaDataService/issues/76
Current plan is to see if the improvements to the subsetting service (integer refinement and float -> double conversions) solve this. Will comment again for QA when these changes are merged to main.
Just checked this on QA. Looks like the recent improvements to floating-point number handling have fixed this issue. We now see all records that have values when selecting the whole graph, and the range (lower bound) is correct.
As @ryanrdoherty said, it looks good now. Closing ticket.
I know we've had similar discussions w.r.t. to
variable.rangeMin
(see below) but I think this is a separate one.Here's how to reproduce the issue:
variable.rangeMin
(-2.987527924) which then gives the correct total of 14,241 - (yay, thanks to Jay and co fixing this issue)So the issue here is that the histogram returned by
distribution
is missing some counts, most likely due to the rounding of the binStart