rgeo / rgeo-proj4

Proj4 extension for rgeo.
MIT License
13 stars 14 forks source link

`RGeo::CoordSys::Proj4.transform_coords` slower in version 3.0.1 than version 2.0.1 #19

Closed x4d3 closed 2 years ago

x4d3 commented 2 years ago

Issue encountered

I've tried to update to the 3.0.1 version but I've observed that using RGeo::CoordSys::Proj4.transform_coords to project the coordinates of my geometry was slower than the 2.0.1 version.

Step to reproduce

Here is a minimal script to reproduce, you can uncomment to alternate between the 2 versions.

require "bundler/inline"

gemfile do
  source "https://rubygems.org"
  # gem "rgeo-proj4", "~> 3.0", ">= 3.0.1"
  gem "rgeo-proj4", "~> 2.0", ">= 2.0.1"
end

raise "RGeo::Geos is not supported" unless RGeo::Geos.supported?
puts "preferred_native_interface #{RGeo::Geos.preferred_native_interface}"
puts "rgeo-proj4 version #{RGeo::Proj4::VERSION}"
raise "proj4 is not supported" unless RGeo::CoordSys::Proj4.supported?
puts "proj4 version: #{RGeo::CoordSys::Proj4.version}"
puts "RUBY_VERSION: #{RUBY_VERSION}"
puts "RUBY_PLATFORM #{RUBY_PLATFORM}"
ws84_definition = "proj=longlat +datum=WGS84 +no_defs +type=crs"
# https://epsg.io/2154
paris_lambert_definition = "+proj=lcc +lat_1=49 +lat_2=44 +lat_0=46.5 +lon_0=3 +x_0=700000 +y_0=6600000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs +type=crs"

ws84_proj4 = RGeo::CoordSys::Proj4.create(ws84_definition)
raise "invalid ws84_definition" unless ws84_proj4

paris_lambert_proj4 = RGeo::CoordSys::Proj4.create(paris_lambert_definition)
raise "invalid paris_lambert_definition" unless paris_lambert_proj4

starting = Process.clock_gettime(Process::CLOCK_MONOTONIC)
1000.times do
  RGeo::CoordSys::Proj4.transform_coords(ws84_proj4, paris_lambert_proj4, 784020.1897824854, 6722964.293806808, nil)
end
puts "Processed in #{Process.clock_gettime(Process::CLOCK_MONOTONIC) - starting}"

With 2.0.1

preferred_native_interface capi
rgeo-proj4 version 2.0.1
proj4 version: 721
RUBY_VERSION: 2.6.6
RUBY_PLATFORM x86_64-darwin19
Processed in 0.0032739999987825286

With 3.0.1

preferred_native_interface capi
rgeo-proj4 version 3.0.1
proj4 version: 7.2.1
RUBY_VERSION: 2.6.6
RUBY_PLATFORM x86_64-darwin19
Processed in 1.5407219999979134

Versions

Note

I've tried to use WKT definition as well, it was faster but it was still slower than the 2.0.1 version

 paris_lambert_definition = <<~EOS
      PROJCS["RGF93 / Lambert-93",
          GEOGCS["RGF93",
              DATUM["Reseau_Geodesique_Francais_1993",
                  SPHEROID["GRS 1980",6378137,298.257222101,
                      AUTHORITY["EPSG","7019"]],
                  TOWGS84[0,0,0,0,0,0,0],
                  AUTHORITY["EPSG","6171"]],
              PRIMEM["Greenwich",0,
                  AUTHORITY["EPSG","8901"]],
              UNIT["degree",0.0174532925199433,
                  AUTHORITY["EPSG","9122"]],
              AUTHORITY["EPSG","4171"]],
          PROJECTION["Lambert_Conformal_Conic_2SP"],
          PARAMETER["standard_parallel_1",49],
          PARAMETER["standard_parallel_2",44],
          PARAMETER["latitude_of_origin",46.5],
          PARAMETER["central_meridian",3],
          PARAMETER["false_easting",700000],
          PARAMETER["false_northing",6600000],
          UNIT["metre",1,
              AUTHORITY["EPSG","9001"]],
          AXIS["X",EAST],
          AXIS["Y",NORTH],
          AUTHORITY["EPSG","2154"]]
    EOS

    ws84_definition = <<~EOS
      GEOGCS["WGS 84",
           DATUM["WGS_1984",
                 SPHEROID["WGS 84",6378137,298.257223563,
                          AUTHORITY["EPSG","7030"]],
                 AUTHORITY["EPSG","6326"]],
           PRIMEM["Greenwich",0,
                  AUTHORITY["EPSG","8901"]],
           UNIT["degree",0.0174532925199433,
                AUTHORITY["EPSG","9122"]],
           AUTHORITY["EPSG","4326"]]
    EOS

Result

preferred_native_interface capi
rgeo-proj4 version 3.0.1
proj4 version: 7.2.1
RUBY_VERSION: 2.6.6
RUBY_PLATFORM x86_64-darwin19
Processed in 0.4099999999998545

Docker image to reproduce

Here is a gist with a docker image where I've reproduced the issue with proj version:8.1.0

https://gist.github.com/x4d3/1dff5f2348a0a6ec1a6fd440d0d64ac7

keithdoggett commented 2 years ago

Thank you for the detailed/reproducible issue. I'm seeing a similar performance regression when I tested your script. I think I have an idea what it might be, but I'll do some digging and report progress as it comes up.

keithdoggett commented 2 years ago

So I looked into it and the cause is what I suspected. In newer versions of proj, you need to create a crs_to_crs projection and then transform coordinates with that, whereas in older versions you would pass 2 separate CRS's into the transform function. I'm not sure exactly how the internals worked, but I assume they cached the mapping somehow, but now they don't since they give us the crs_to_crs object back.

It takes about 0.002s for me to create the new crs_to_crs, so doing this 1000 times is ~2s.

I think the best solution to this is to do the following: 1) Create a new class RGeo::CoordSys::CRSToCRS that is instantiated from 2 RGeo::CoordSys::Proj4 objects and stores the crs_to_crs. 2) Create a singleton class RGeo::CoordSys::CRSStore that maps some combination of Proj4 objects to a CRSToCRS object. 3) When running transform_coords, check the CRSStore to see if the mapping exists, if not create it and store it, else get it. 4) Change the underlying C function to accept the crs_to_crs instead of 2 projections and just transform the coordinates.

Reference to see differences in C library between versions: https://proj.org/development/migration.html

x4d3 commented 2 years ago

Thanks @keithdoggett, I had a stab at fixing it in #20