golang / geo

S2 geometry library in Go
Apache License 2.0
1.69k stars 182 forks source link

Question: Overflow converting `CellID.Pos()` to float64 is possible? #66

Closed amarjeetanandsingh closed 4 years ago

amarjeetanandsingh commented 4 years ago

Hi

I have to convert the latlongs of a polyline into list of s2CellId Pos(uint64) and store the POS() value in redis as sorted set(which needs value as float64) Now I am a bit afraid of overflow while converting POS from uint64 to float64 because below code overflows -

    numUint64 := math.MaxUint64
    numFloat64 := float64(numUint64)

However, the documentation of CellID.Pos() confirms the range as [0,2^posBits-1]. If we calculate, posBits = 2<<maxLevel + 1, given maxLevel = 30 below code works fine without overflow.

        posBits := (2<<30+1) - 1
    floatEq := float64(posBits)

Now, is it safe to say CellID.Pos() conversion to float64 will never overflow?

dsymonds commented 4 years ago

As the doc comment suggests, it'll be in the range [0,2^posBits-1]. However, posBits is 2*maxLevel + 1 = 61, so the max value is 2^61 - 1.

You say "overflow", but the range of float64 is bigger than uint64. However, the risk of uint64->float64 is not overflow, but loss of precision, which happens around 2^53 (https://en.wikipedia.org/wiki/Double-precision_floating-point_format), so storing a cell ID's position in a float64 will indeed lose precision.

amarjeetanandsingh commented 4 years ago

Hi @dsymonds Thanks for the response.

Ok. I understood there will be loss of precision after 2^53. So, that means it will give correct result till level 26, isn't it?

As https://s2geometry.io/resources/s2cell_statistics.html page suggests, level 26 will have precesion in cm^2. My application doesn't handle data at that precision. Even error of couple of meters is fine.

So am i safe using the POS() as float?

dsymonds commented 4 years ago

I think you need to evaluate it yourself for your specific use case. What "safe" means is application dependent. If you can tolerate loss of precision then it might be fine, but note that Pos is the position along the Hilbert curve, and while nearby Pos values are usually nearby locations, they aren't necessarily so. If you're wedded to Redis, it might be better to use their native Geohash (https://www.compose.com/articles/a-quick-guide-to-redis-3-2s-geo-support/).

amarjeetanandsingh commented 4 years ago

Thanks @dsymonds

amarjeetanandsingh commented 4 years ago

Hi @dsymonds Just one last thing.

and while nearby Pos values are usually nearby locations, they aren't necessarily so.

So suppose, i have 3 points, a,b,c with pos = 1, 10, 100.

Doesn't it guarantee that distance between point b and a will always be less than distance between point b and c?

If it doesn't guarantee, could you please refer me to appropriate document to know more about POS and the distance between points in a bit more detail?

dsymonds commented 4 years ago

The s2geometry.io site explains the Hilbert curve in some detail, but in short, no, the triangle inequality is not preserved on it.