becheran / fast-hilbert

Fast Hilbert space-filling curve transformation using a LUT
MIT License
38 stars 3 forks source link

90 re-orientation on iteration breaks locality property. #7

Closed hallahan closed 1 year ago

hallahan commented 1 year ago

I've been using this (awesome) library for generating Hilbert locations in building a map tiling engine. As the README states, for every zoom, the Hilbert curve rotates. A core property of the Hilbert curve is that relative position of items in the curve is maintained as the zoom changes. For example, this property should hold:

lower_zoom_hilbert = hilbert / (4^n) * (4^n-1)

Or more nicely:

lower_zoom_hilbert = hilbert >> 2

Because the rotation happens, these nice properties do not apply. I instead have to calculate a Hilbert location at a higher zoom and shift down for a lower zoom tile.

Is it possible to tweak the LUT so that this rotation does not happen?

z3 x4 y3 h31 z2 x2 y1 h13 z1 x1 y0 h1

Here we are with a tile at zoom 1. The hilbert location resolves to 1.

Screen Shot 2022-10-28 at 9 12 46 PM

At zoom 2, the hilbert location resolves to 13 due to the rotation.

Screen Shot 2022-10-28 at 9 15 07 PM

If I were to sort geographic data by their Hilbert location at one zoom, the locality will change for another, negating the static quadtree.

And then one more zoom in, and the direction of the preceding tile comes from the north.

Screen Shot 2022-10-28 at 9 18 38 PM
becheran commented 1 year ago

Hey, Great to see that people are using this library. You could give version 0.1.0 a chance. I started this crate using the "real" hilbert curve. The 90 degree rotation was later added via a PR.

I see how the property is broken. I will think about adding both options.

hallahan commented 1 year ago

I tried out the older version, and yes, we are now getting an actual Hilbert curve. For me to use this library, I need to be able to get a real Hilbert location, as you say. The h value is essentially the offset from the origin of the curve, and the xy at a different curve path has a very different meaning.

I see the clever use of looking at leading zeros to determine the order. What does that mean though? Perhaps right now the implied order is the bit size of the xy type you are using? For example, I am using u32, so I am actually getting an h at order 32?

It's interesting that with how it is now (latest version), if we feed the function a small xy, with an intended order of 32, we still loop many less times? Interesting potential optimization?

Using the latest version, if we just shift the result by the delta of the order we want and 32, I think we get the right h.

let delta = 32 - z;
let h32 = xy2h(x, y);
let h = h32 >> delta * 2;
hallahan commented 1 year ago

On another note, I think there is a bug in the h2xy assertion in the old version.

If I call xy2h where the order is 9, and x is 82 and y is 199, the result is 52017.

When inputting h2xy(52017, 9), I get a panic, because the assert is:

let max: u64 = (1 as u64) << (order + 1);
 assert!(h < max, "Input must be < (2^order) * 2");

I don't think that is correct for the h value. This would apply to an x or y value.

It should probably be:

(1 as u64) << (order * 2)
becheran commented 1 year ago

The leading zero count magic is only possible because the order is not passed as snd argument. Without the curve rotation of 90 deg the h output for a given xy will always be the same.

Performance wise there is no measureable drawback. The zero count is almost free and the conversion will only look at the meaningful bits.

I see two options to fix the curve. Either pass the order as snd argument again and do it as before. Or only pass an additional bool which just indicates if the order is even or odd. Because this is the only information missing in order to draw the curve correctly.

Using the oder would have the disatwantage of panics if order > size. But maybe it is just easier to understand?

hallahan commented 1 year ago

How about just pass in the order and then order % 2 ?

becheran commented 1 year ago

@hallahan yea... right. Why complicated if the solution can be so easy. Was a bit blind.

But the zero counting magic is actually also a performance boost for lower numbers. I now changed the impl with a second order value, but I will always only look at the last bit of that value and therefore not panic for any input. Re-ran the benchmark and the performance did not change.

Will release a 2.x version with this breaking change (PR #9)

hallahan commented 1 year ago

Yes, it's a clever performance boost for sure. I bet you could brag about it with some benchmarks.

becheran commented 1 year ago

Fixed with version 2.0.0