Closed Kirill888 closed 2 years ago
🚀 Deployed on https://63567cf43b1cf12452d8e8c1--odc-geo-docs.netlify.app
@snowman2 how about this change? Do you think that will fix performance issues you have observed with non-epsg SRSs?
🚀 Deployed on https://6356801a375cb12ba573386b--odc-geo-docs.netlify.app
Base: 98.19% // Head: 98.14% // Decreases project coverage by -0.05%
:warning:
Coverage data is based on head (
01de66e
) compared to base (85c90a2
). Patch coverage: 94.44% of modified lines in pull request are covered.
:umbrella: View full report at Codecov.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.
Part of the reason is False
was used in #70 was because it skips the equality check as it is not necessary. Using ==
instead of is
invokes the __eq__
method of the CRS class which attempts to create a CRS. This can cause a performance hit as PROJ has to go through a lot of logic before determining that it isn't a valid CRS.
In [1]: from pyproj import CRS
In [2]: cc = CRS(4326)
In [4]: %%timeit
...:
...: cc == -1
...:
...:
7.54 ms ± 252 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [5]: %%timeit
...:
...: cc is False
...:
...:
59.1 ns ± 5.09 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
Do you think that will fix performance issues you have observed with non-epsg SRSs?
Yes, I believe it will help. Thanks :+1:
Part of the reason
is False
was used in #70 was because it skips the equality check as it is not necessary. Using==
instead ofis
invokes the__eq__
method of the CRS class which attempts to create a CRS. This can cause a performance hit as PROJ has to go through a lot of logic before determining that it isn't a valid CRS.
My main issue with using False|None|<int>
is that it complicates type a bit, and it's not really obvious which is which between None
and False
. The CRS.__eq__
should not trigger with the current implementation as comparison is done with None|int
type
I gave it a try:
In [1]: from odc.geo import CRS
In [2]: cc = CRS(4326)
In [3]: %%timeit
...:
...: cc == -1
...:
...:
5.07 ms ± 202 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [4]: %%timeit
...:
...: cc is False
...:
...:
31.4 ns ± 3.22 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
Looks like it attempts to construct a CRS if it is not a CRS each time: https://github.com/opendatacube/odc-geo/blob/85c90a232d0888408dd1e53e1a913239d6480cf9/odc/geo/crs.py#L238-L243
Actually, need to update to use this PR ...
You are correct, this seems okay:
In [5]: %%timeit
...:
...: cc.to_epsg()
...:
...:
141 ns ± 1.87 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
And this looks good too:
In [7]: %%timeit
...:
...: CRS('PROJCS["unnamed",GEOGCS["Unknown datum based upon the custom spher
...: oid",DATUM["Not specified (based on custom spheroid)",SPHEROID["Custom
...: spheroid",6371007.181,0]],PRIMEM["Greenwich",0],UNIT["degree",0.0174532
...: 925199433]],PROJECTION["Sinusoidal"],PARAMETER["longitude_of_center",0]
...: ,PARAMETER["false_easting",0],PARAMETER["false_northing",0],UNIT["Meter
...: ",1],AXIS["Easting",EAST],AXIS["Northing",NORTH]]')
...:
...:
1.06 µs ± 19.9 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
🚀 Deployed on https://635709f53bbf7434d6c7ffdc--odc-geo-docs.netlify.app
Looks like it attempts to construct a CRS if it is not a CRS each time:
I guess we could avoid that when rhs
is int
, and maybe avoid it when rhs
is string.
Although, with every input into CRS constructor going through the cache path, it's less of an issue.
🚀 Deployed on https://635710c58aebe93469dc5426--odc-geo-docs.netlify.app
Introduce special state _epsg=0, that implies "epsg code is not known yet",
_epsg=None
means "epsg code is known to be absent".Force all CRS construction code paths through cache.