bluefeet / Geo-Distance

Calculate distances and closest locations. (DEPRECATED)
https://metacpan.org/pod/Geo::Distance

Geo::Distance was still faster than GIS::Distance::Fast #14

Closed gray closed 5 years ago

gray commented 5 years ago

I think gutting Geo::Distance was premature. Until you can get GIS::Distance to achieve similar performance, you're forcing users to accept a massive performance hit.

See my benchmarking script here: https://metacpan.org/source/GRAY/Geo-Distance-XS-0.13/ex/benchmark.pl

The numbers published at https://metacpan.org/pod/Geo::Distance::XS were for Perl 5.14.2. Running against Perl 5.28.1, the numbers are slightly better overall, but Geo::Distance::XS is still at least an order of magnitude faster. Also note that your pure-Perl Geo::Distance is faster than your GIS::Distance::Fast in all but one case.

bluefeet commented 5 years ago

I'm aware, but thanks. Working on this all today.

bluefeet commented 5 years ago

Direct calls to GIS::Distance::Fast::haversine_distance() were already faster than Geo::Distance::XS::distance() (both are direct calls into the C functions) when I ran my own benchmarks today.

But when I stepped back and benchmarked the whole call stack of GIS::Distance->distance(), it was slower.

I've shored this up with bluefeet/GIS-Distance-Fast@9361d29634f6f98422bb40f618ca53081359bc03, which reduces the call stack to about two levels and moves one tiny calculation into C.
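
To illustrate why the full call stack can dominate the timing even when the underlying formula is fast, here is a minimal sketch using only core Perl. The `haversine_km` function, the `Layered` class, and the Earth radius constant are hypothetical stand-ins invented for this example; they are not code from GIS::Distance or Geo::Distance::XS.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Assumed constants for this sketch only.
my $EARTH_RADIUS_KM = 6371.0;
my $DEG2RAD         = atan2(1, 1) / 45;   # pi / 180

# Plain function: the formula with no wrapper around it.
sub haversine_km {
    my ($lat1, $lon1, $lat2, $lon2) = map { $_ * $DEG2RAD } @_;
    my $dlat = $lat2 - $lat1;
    my $dlon = $lon2 - $lon1;
    my $h = sin($dlat / 2) ** 2
          + cos($lat1) * cos($lat2) * sin($dlon / 2) ** 2;
    return $EARTH_RADIUS_KM * 2 * atan2(sqrt($h), sqrt(1 - $h));
}

{
    # Hypothetical object wrapper: an argument check plus an extra
    # dispatch layer before the same formula is reached.
    package Layered;

    sub new {
        my ($class) = @_;
        return bless { formula => \&main::haversine_km }, $class;
    }

    sub distance {
        my ($self, @coords) = @_;
        die "expected four coordinates\n" unless @coords == 4;
        return $self->_dispatch(@coords);
    }

    sub _dispatch {
        my ($self, @coords) = @_;
        return $self->{formula}->(@coords);
    }
}

# London to Paris as lat1, lon1, lat2, lon2 in decimal degrees.
my @coords  = (51.5074, -0.1278, 48.8566, 2.3522);
my $layered = Layered->new;

# Run each variant for about one CPU second and print a comparison table.
cmpthese(-1, {
    direct  => sub { haversine_km(@coords) },
    layered => sub { $layered->distance(@coords) },
});
```

Both variants compute the same value; any difference in the reported rates comes from the extra method calls and argument copying, which is the kind of overhead a shorter call stack removes.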

I've also added some documentation for users that really care about speed in bluefeet/GIS-Distance@35fe0b6d0d509a98970671a988fe3ec0c8c08b92.

Will be looking at more of this soon.

bluefeet commented 5 years ago

Alright, got some really thorough benchmarks going now at: https://github.com/bluefeet/GIS-Distance/blob/master/author/bench

Along the way I made some obvious speedups in GIS::Distance, GIS::Distance::Fast, and Geo::Distance.

And the result for me: https://gist.github.com/bluefeet/c0b8c8fc6fe0c5fde4e1e66492a15e0c

They cover every entry point I could fathom. I'm happy with this overall, but I would like to dig deeper into these differences:

GIS::Distance::distance-pp                   156299/s
Geo::Distance::old_distance-geo_pp           285225/s

And:

GIS::Distance::distance-xs                   329815/s
GIS::Distance::distance_km-xs               1984127/s
Geo::Distance::distance-geo_xs              3164557/s

In both cases I'd like GIS::Distance to be as fast or faster.

It's hard to beat Geo::Distance::distance(), which goes direct-to-XS without any sort of argument checking or support for PP-only formulas. But I think I have an idea. ;)

I'm not terribly concerned that Geo::Distance::new_distance() is slow. If people are depending on ultra speed they can modify their code to use GIS::Distance directly where any further improvements will likely be found.

bluefeet commented 5 years ago

Alright! I added distance_metal() to GIS::Distance and, with other improvements, it is just about the same speed as Geo::Distance::XS::distance(). So, yes, GIS::Distance::distance() is still a slow hog, but the user now has options if they want to bypass things like automatic unit conversion, formula arguments, argument checking, and PP support when XS is available. I like that distance() is the full-featured version and that there are now variants, distance_km() and distance_metal(), for specialized cases.

I also imported some small bits from Geo::Distance::XS which helped GIS::Distance::Fast some.

Calling the distance() function in the GIS::Distance formula module directly is the fastest option of all, almost 3x faster than the fastest Geo::Distance/Geo::Distance::XS entry points.
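
For readers comparing the entry points discussed above, here is a rough sketch of the three call styles. It assumes the interface documented for recent GIS::Distance releases: `distance()` returning a Class::Measure::Length object, `distance_metal()` returning plain kilometers, and a `distance()` function exported by name in each formula module. Check the version you have installed, since these details are assumptions here rather than something this thread spells out. The coordinates are arbitrary sample values.

```perl
use strict;
use warnings;
use GIS::Distance;
use GIS::Distance::Haversine;

# London to Paris as lat1, lon1, lat2, lon2 in decimal degrees.
my @coords = (51.5074, -0.1278, 48.8566, 2.3522);

my $gis = GIS::Distance->new('Haversine');

# Full-featured call: argument checking and a unit-aware return object.
my $length  = $gis->distance(@coords);
my $km_full = $length->meters / 1000;

# Reduced call: no argument checking, returns kilometers as a plain number.
my $km_metal = $gis->distance_metal(@coords);

# Direct call into the formula module, bypassing the GIS::Distance object.
my $km_direct = GIS::Distance::Haversine::distance(@coords);

printf "distance():       %.3f km\n", $km_full;
printf "distance_metal(): %.3f km\n", $km_metal;
printf "direct formula:   %.3f km\n", $km_direct;
```

The trade-off is the one described in the comment above: each step down gives up a convenience (unit conversion, argument checking, formula selection) in exchange for fewer calls per distance computed.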

There is definitely some work I can do on distance() itself to make it faster, namely digging into Class::Measure, which is in progress.

bluefeet commented 5 years ago

Welp, I'm closing this one. I think that GIS::Distance is up to snuff at this point.