tidwall / geojson

GeoJSON for Go. Used by Tile38
MIT License
130 stars 28 forks source link

Questions on `DefaultIndexOptions` #23

Closed ringsaturn closed 1 year ago

ringsaturn commented 1 year ago

Hi, thanks a lot for this fantastic package!

I have a question about the DefaultIndexOptions which defined use QuadTree:

https://github.com/tidwall/geojson/blob/ad814f6c693f7f60e025596499c5703d1480d2b7/geometry/series.go#L39-L42

I try to setup a benchmark in CI, here is a sample which shows RTree based index seems faster:

=== RUN   TestBigArizona
az/none     1582 points created in 106.201µs using 27344 bytes
az/none/in  100,000 ops over 4 threads in 399ms, 250,372/sec, 3994 ns/op
az/none/on  100,000 ops over 4 threads in 197ms, 508,399/sec, 1966 ns/op
az/none/out 100,000 ops over 4 threads in 403ms, 247,900/sec, 4033 ns/op
az/none/rnd 100,000 ops over 4 threads in 434ms, 230,199/sec, 4344 ns/op
az/quad     1582 points created in 336.003µs using 32232 bytes
az/quad/in  100,000 ops over 4 threads in 38ms, 2,615,503/sec, 382 ns/op
az/quad/on  100,000 ops over 4 threads in 17ms, 5,775,986/sec, 173 ns/op
az/quad/out 100,000 ops over 4 threads in 30ms, 3,289,053/sec, 304 ns/op
az/quad/rnd 100,000 ops over 4 threads in 46ms, 2,196,806/sec, 455 ns/op
az/rtre     1582 points created in 874.808µs using 39656 bytes
az/rtre/in  100,000 ops over 4 threads in 25ms, 4,021,162/sec, 248 ns/op
az/rtre/on  100,000 ops over 4 threads in 8ms, 12,413,302/sec, 80 ns/op
az/rtre/out 100,000 ops over 4 threads in 24ms, 4,147,342/sec, 241 ns/op
az/rtre/rnd 100,000 ops over 4 threads in 32ms, 3,130,243/sec, 319 ns/op
--- PASS: TestBigArizona (1.69s)
=== RUN   TestBigTexas
tx/none     12478 points created in 276.703µs using 204880 bytes
tx/none/in  100,000 ops over 4 threads in 4655ms, 21,484/sec, 46545 ns/op
tx/none/on  100,000 ops over 4 threads in 2098ms, 47,672/sec, 20976 ns/op
tx/none/out 100,000 ops over 4 threads in 4550ms, 21,977/sec, 45501 ns/op
tx/none/rnd 100,000 ops over 4 threads in 4758ms, 21,019/sec, 47575 ns/op
tx/quad     12478 points created in 3.578431ms using 245864 bytes
tx/quad/in  100,000 ops over 4 threads in 116ms, 863,551/sec, 1158 ns/op
tx/quad/on  100,000 ops over 4 threads in 131ms, 760,742/sec, 1314 ns/op
tx/quad/out 100,000 ops over 4 threads in 113ms, 886,719/sec, 1127 ns/op
tx/quad/rnd 100,000 ops over 4 threads in 115ms, 869,634/sec, 1149 ns/op
tx/rtre     12478 points created in 8.743577ms using 295016 bytes
tx/rtre/in  100,000 ops over 4 threads in 60ms, 1,662,062/sec, 601 ns/op
tx/rtre/on  100,000 ops over 4 threads in 71ms, 1,405,447/sec, 711 ns/op
tx/rtre/out 100,000 ops over 4 threads in 58ms, 1,720,619/sec, 581 ns/op
tx/rtre/rnd 100,000 ops over 4 threads in 61ms, 1,631,339/sec, 612 ns/op
--- PASS: TestBigTexas (16.82s)

So default option use QuadTree because it used less memory than RTree and not too slower compared with RTree and use QuadTree could reduce index time when series created?

tidwall commented 1 year ago

Yeah. It's a tradeoff. I defaulted to the index with faster construction and lower memory usage.

A quadtree index is about 2x faster construction, but 2x slower searches, than an rtree index.

ringsaturn commented 1 year ago

Yeah. It's a tradeoff. I defaulted to the index with faster construction and lower memory usage.

A quadtree index is about 2x faster construction, but 2x slower searches, than an rtree index.

Thanks, I will try this option in my projects.