Closed JoeCohen closed 2 weeks ago
@JoeCohen These values are getting rounded in the database.
mysql> update locations set east = -179.61345 where id = 732613413;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> select east from locations where id = 732613413;
+----------+
| east |
+----------+
| -179.613 |
+----------+
1 row in set (0.00 sec)
Is there reason to believe this is new behavior?
@mo-nathan Thanks for the rounding information! That's helpful. I have no reason to believe the db behavior is new. I'm probably doing something that -- for the first time -- depends on more precision. I'll see if I can fix that.
Note that there's rounding happening in the JS too, which eventually populates the DB.
Consider making an app-wide Rails constant for the number of decimal places we round to. Once we establish that, I can send that to the JS for consistency.
@nimmolo: app-wide rounding constant seems like a good idea. I think it should be in a different PR because (a) I don't want to stack too much stuff into this one; (b) I don't know how to do it or where to put it. Probably it should be derived from the db. I'm guessing (based on experimentation) that it's 4 digits to the right of decimal point for positive numbers, 3 for negative numbers. I also don't understand whether the db rounds it or just truncates it.
If the db were doing the rounding, i'm pretty sure it would be in our schema. It's not the db doing this though - it's our Location-saving methods that round the float.
I'm not aware of any code that rounds negative numbers differently - that's worth checking on. I'm also not aware of any truncation!
But there is more than one place in the code that rounds lat/lngs, so let's not forget to fix this error-prone situation. (I guess we could also change the schema through a migration - maybe the db is the place to do this.)
If we do a constant that is something I would put in a config with our other MO constants.
Let's get advice from @mo-nathan - he may have a preference.
Per your communication on Slack, I went ahead and changed to contains_point
on my PR, which I hope to merge today.
@nimmolo What evidence do you have that the fixtures are getting affected by any of our code? I did look at this and could not find anywhere that rounding was actually happening during a test run. This was using the test that uses the location fixture that Joe mentioned. I then found articles on the web about MySQL rounding FLOAT type values automatically and confirmed that with the test I gave above (that uses no Ruby at all).
Oh - it sounds like i misperceived the situation. I have no evidence of fixtures being affected.
My concern is more structural, and I wanted to raise a flag before I forgot about it again. I've just been working on location code for a few months and realizing that:
I recently changed all the float rounds to 4 places (some of mine from a couple years ago had been 6 places), and it occurred to me to use a constant, but i didn't go that far.
I agree that being consistent about the rounding would make sense. I did some more playing around with the lat/lng and on the production server we round to 6 decimal digits. This mean that somewhere around Kansas the longitude resolution switches from about 300' to about 30'. This can cause weirdnesses like this: https://mushroomobserver.org/locations/29140 (this is the house where I grew up). This seems wrong, but I'm not sure exactly what "right" is. That observation was created by searching Google for that address and then saving. While I was defining it, it looked like this:
It would at least be better if it saved as:
rather than
It seems like it would be be reasonable to support 7 digits (rather than 6) and then round off to the nearest 1/10,000s of degree (~30ft). This would not lose any existing data, but would start to treat the west coast the same as the east coast.
Note that my suggested solution involves a data migration to store higher precision floats in general. We might consider switching to the DECIMAL type. Not sure whether Rails likes that, but it seems like the right number format based on the MySQL documentation.
Re rounding:
(ruby) Location.where(Location[:east].eq(albion.east)).include?(albion)
false
@nimmo: Above implies that until/unless we switch to decimal, scope_contains_point
should have have something to compensate for db rounding, and should have some model unit tests. I'll have a look at your PR later today.
PS: @mo-nathan: Rails is okay with :decimal column type. https://api.rubyonrails.org/classes/ActiveRecord/ConnectionAdapters/SchemaStatements.html#method-i-add_column Question: Would we need to recast the existing db entries for affected columns?
Probably the best option would be to have a migration that creates a set of new decimal columns, copies the date to the new columns, drops the old columns, then renames the new columns to the old names. Obviously we’d have to test it, but at least in theory that should do it.
On Sat, Jul 6, 2024 at 5:44 PM Joseph D. Cohen @.***> wrote:
PS: @mo-nathan https://github.com/mo-nathan: Rails is okay with :decimal column type. https://api.rubyonrails.org/classes/ActiveRecord/ConnectionAdapters/SchemaStatements.html#method-i-add_column Question: Would we need to recast the existing db entries for affected columns?
— Reply to this email directly, view it on GitHub https://github.com/MushroomObserver/mushroom-observer/pull/2212#issuecomment-2211978002, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAYQDUQL3MCYTDTIOWTNLOTZLBQJ7AVCNFSM6AAAAABKML7OXOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDEMJRHE3TQMBQGI . You are receiving this because you were mentioned.Message ID: @.***>
Hold on... let's not forget about Geometry columns!
I feel like that's the migration we want here. Almost totally forgot about that!
I researched all this awhile back. Having these bounding boxes as polygons, and observations as points, allows us to let the db find overlaps muuuuch more efficiently.
Potentially interesting: Rails - How to migrate code from float to decimal
Adds a new Location scope.
Location.contains_box(n: a, s: b, e: c, w: d)
returns all Locations which bound the defined rectangle.This is a small, but necessary step, in dealing with iNat places as part of importing iNat observations.