Follow the Planetiler architecture (which has the best performance today for huge inputs, right?): loop over each feature, loop over each zoom level, and write encoded per-tile intermediate output somewhere. The sorting, feature dropping, compression, and final writing then happen in a later pass.
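A minimal sketch of that feature-first, two-pass flow (all names hypothetical; the tile math is simplified to a point-in-tile lookup on an equirectangular grid, not real Web Mercator, and the "encoding" is just the feature id):

```python
from collections import defaultdict

def tile_for(lon, lat, z):
    """Map a lon/lat point to a (z, x, y) tile id (naive grid, not Web Mercator)."""
    n = 2 ** z
    x = int((lon + 180.0) / 360.0 * n)
    y = int((90.0 - lat) / 180.0 * n)
    return (z, min(x, n - 1), min(y, n - 1))

def first_pass(features, max_zoom):
    """Pass 1: loop over features, then zooms; append encoded features to per-tile buckets.
    The dict stands in for on-disk intermediate output."""
    buckets = defaultdict(list)
    for feat in features:
        for z in range(max_zoom + 1):
            buckets[tile_for(feat["lon"], feat["lat"], z)].append(feat["id"])
    return buckets

def second_pass(buckets):
    """Pass 2: visit tiles in archive order and finalize each one.
    Sorting stands in for the dropping/compression/writing step."""
    return {tid: sorted(buckets[tid]) for tid in sorted(buckets)}
```

The point of the split is that pass 1 touches each input feature once, and pass 2 is a sequential sweep over already-bucketed data.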
Currently this iterates over tiles (in parallel), uses an rtree to find all features intersecting each tile, then builds the tile from those hits.
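For contrast, the current tile-first loop looks roughly like this (a brute-force bbox scan stands in for the rtree query; names are illustrative, and the real tile loop runs in parallel):

```python
def features_in_tile(index, tile_bbox):
    """Stand-in for an rtree query: features whose bbox intersects the tile's bbox."""
    tx0, ty0, tx1, ty1 = tile_bbox
    return [f for bbox, f in index
            if not (bbox[2] < tx0 or bbox[0] > tx1 or
                    bbox[3] < ty0 or bbox[1] > ty1)]

def build_tiles(index, tile_bboxes):
    """Outer loop over tiles (serial here), spatial query per tile, build each tile."""
    return {tid: sorted(f["id"] for f in features_in_tile(index, bbox))
            for tid, bbox in tile_bboxes.items()}
```

The cost profile is the inverse of the feature-first flow: one spatial query per tile, so shared features are looked up once per tile that contains them rather than written once per zoom.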
What about: