Closed djhoese closed 1 year ago
@sebastic FYI
@djhoese, I think you should bring this up on the PROJ list:
https://lists.osgeo.org/mailman/listinfo/proj
Especially the thread about PROJ 9.3:
https://lists.osgeo.org/pipermail/proj/2023-August/011116.html
$ docker run --rm -it osgeo/proj:latest bash
root@7aa0aa098f43:/# apt update && apt install python3-pip
root@7aa0aa098f43:/# python3 -m pip install pyproj
root@7aa0aa098f43:/# python3 -c "from pyproj import CRS; crs1 = CRS.from_proj4('+proj=laea +lat_0=90 +lon_0=0 +a=6371228.0 +units=m'); crs2 = CRS.from_epsg(3408); print(crs1.to_epsg() or 'None'); print(crs2.to_epsg()); print(crs1 == crs2)"
None
3408
False
root@7aa0aa098f43:/# pyproj -v
pyproj info:
pyproj: 3.6.0
PROJ: 9.2.1
data dir: /usr/local/lib/python3.10/dist-packages/pyproj/proj_dir/share/proj
user_data_dir: /root/.local/share/proj
PROJ DATA (recommended version): 1.14
PROJ Database: 1.2
EPSG Database: v10.088 [2023-05-13]
ESRI Database: ArcGIS Pro 3.1 [2023-19-01]
IGNF Database: 3.1.0 [2019-05-24]
System:
python: 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0]
executable: /usr/bin/python3
machine: Linux-6.4.6-76060406-generic-x86_64-with-glibc2.35
Python deps:
certifi: 2023.7.22
Cython: None
setuptools: 59.6.0
pip: 22.0.2
@avalentino This example above doesn't seem to produce an EPSG code. And just like in the pyresample issue it lists PROJ 9.2.1 even though I think it should be using PROJ from the system. Maybe?
@djhoese to use libproj v9.3rc1 in debian you should enable "experimental" and then run:
# apt -t experimental install libproj-dev libproj25 proj-bin proj-data
EDIT: moreover, it is probably better to use "debian/sid" or "debian/sid-slim" as the Docker image.
Just install `libproj-dev` & `proj-bin` from experimental, the rest get pulled in via the dependencies.
I was able to build a version of pyproj from source (GitHub, from `main`) and run the same command I showed above. Even the `pyproj` command showed it was using PROJ 9.3.0, but still that `crs1` did not produce an EPSG code.
Then I used `debian:sid`, added the experimental repository (along with some certificate packages to get GPG to work), then installed `libproj-dev` and `proj-bin`. I then installed `python3-pip` and `python3-pyproj`. Running my example command from that showed that `crs1` produced an EPSG code.
Then I uninstalled `python3-pyproj`, installed `git`, and did `python3 -m pip install --break-system-packages --force-reinstall git+https://github.com/pyproj4/pyproj`. And this too produced an EPSG code for `crs1`.
So...to me this means either:
- There is a major difference between PROJ 9.3.0 in the PROJ images versus the experimental Debian repos.
- The PROJ images may have grids from PROJ-data installed which are not available in Debian.
Hhhmmm any idea if/why that would cause a CRS to not resolve to an EPSG code? If they were missing then I'd guess that an EPSG couldn't be found, but this is the opposite.
While digging into this, consider:
UserWarning: You will likely lose important projection information when converting to a PROJ string from another format.
PROJ strings are not good for comparing CRS.
There is a `min_confidence` parameter in `CRS.to_epsg`. Set it to 100 to see if it is an exact match.
```
>>> crs1
<Projected CRS: +proj=laea +lat_0=90 +lon_0=0 +a=6371228.0 +units= ...>
Name: unknown
Axis Info [cartesian]:
>>> crs2
Name: NSIDC EASE-Grid North
Axis Info [cartesian]:
- X[south]: Easting (metre)
- Y[south]: Northing (metre)
Area of Use:
- name: Northern hemisphere.
- bounds: (-180.0, 0.0, 180.0, 90.0)
Coordinate Operation:
- name: US NSIDC Equal Area north projection
- method: Lambert Azimuthal Equal Area (Spherical)
Datum: Not specified (based on International 1924 Authalic Sphere)
- Ellipsoid: International 1924 Authalic Sphere
- Prime Meridian: Greenwich
```
Thanks @snowman2! I know about the PROJ.4 string warning, but that's why I'm confused by this behavior. It is close enough to be considered the same EPSG with `to_epsg`, but not when doing equality. I did not know about `min_confidence`, but I would have assumed that in cases like this it would default to a value that would handle the round trip decently well.
The original issue started in Pyresample and how it bounces between EPSG and PROJ.4 strings when exporting to on-disk formats. I think bumping up the `min_confidence` would be a "good enough" workaround.
I'm currently trying to build PROJ from source in the PROJ docker container to see if that gives me different results. I tried syncing the proj-data in the `debian:sid` image and that doesn't change the final result (`crs1` still produces an EPSG code).
The `min_confidence` defaults to 70, which I would have assumed to be equivalent. If you need to be able to roundtrip, I would change it to 100. Alternatively, WKT2 is also a good format for storing a CRS if you need to roundtrip.
If you are confident that the CRS are equivalent, I can help you create an issue upstream with PROJ to ask what their thoughts are on it.
> I tried syncing the proj-data in the debian:sid image and that doesn't change the final result

That is only for transformations. `proj.db` is all you need for CRS information.
> The min_confidence defaults to 70, which I would have assumed to be equivalent. If you need to be able to roundtrip, I would change it to 100. Alternatively, WKT2 is also a good format for storing a CRS if you need to roundtrip.

Sounds good. I wouldn't say I need roundtripping in normal user use cases, but the pyresample tests assume it can be done sometimes. I understand WKT2 would be good for a roundtrip, but human readability/parse-ability is also something we'd like to have.
The other question here is what is different between the two systems (debian:sid with experimental libproj-dev versus osgeo/proj:latest) that is causing an EPSG code to be used or not.
I don't recommend using `osgeo/proj:latest` as it doesn't work how you would expect: https://github.com/pyproj4/pyproj/pull/1085
Hhhmmm in the 9.3.0 case it was showing the proper version.
I haven't been able to figure out what's going on here. I installed PROJ from GitHub on both systems and pyproj from GitHub on both, pointing `PROJ_DIR` at the custom prefix where PROJ was installed. I get an EPSG code for `crs1` in the `debian:sid` container but not in `osgeo/proj:latest`, which I think is Ubuntu-based.
Bumping up `min_confidence` from 70 to 71 causes there to be no EPSG code generated on the Debian system.
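To find where a match drops out, as in the 70 versus 71 observation above, one could scan the threshold from the strict end downward. This is only a sketch: `highest_confidence_with_epsg` and `demo` are hypothetical helper names, and the only pyproj calls assumed are `CRS.from_proj4` and `CRS.to_epsg(min_confidence=...)`, both of which appear in this thread.

```python
def highest_confidence_with_epsg(crs, low=0, high=100):
    """Return (min_confidence, epsg_code) for the strictest threshold that
    still yields an EPSG code, or None if nothing in [low, high] matches.

    `crs` is any object with a pyproj-style to_epsg(min_confidence=...) method.
    """
    # Walk from the strictest threshold down and stop at the first hit.
    for confidence in range(high, low - 1, -1):
        code = crs.to_epsg(min_confidence=confidence)
        if code is not None:
            return confidence, code
    return None


def demo():
    """Run the scan on the PROJ.4 string from this thread (requires pyproj)."""
    from pyproj import CRS  # imported lazily so the helper works without pyproj

    crs1 = CRS.from_proj4("+proj=laea +lat_0=90 +lon_0=0 +a=6371228.0 +units=m")
    print(highest_confidence_with_epsg(crs1))
```

Running `demo()` in both containers would show whether they disagree only on the threshold or on whether any candidate is identified at all.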
@snowman2 Do you have a suggested way of investigating the `proj.db` database? Or verifying that it is the same version between PROJ installations? I was about to post on the PROJ mailing list about this issue, specifically that one platform is getting an EPSG code and one is not. But then I realized I should know more about the `proj_identify` function used by pyproj and about the `proj.db` used in each installation.
https://proj.org/en/9.2/development/reference/functions.html#c.proj_identify
Is there a way to update the `proj.db` for an installation?
> Or verifying that it is the same version between PROJ installations?

import pyproj
pyproj.show_versions()  # prints the report itself and returns None
or
python -m pyproj -v
> Is there a way to update the proj.db for an installation?

That is not recommended as it may not be compatible with a different PROJ version.
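Besides the version report, the database can be inspected directly, since `proj.db` is an SQLite file. The sketch below assumes it has a `metadata` table with `key` and `value` columns (which is where entries such as `EPSG.VERSION` are expected to live); `read_proj_db_metadata` and `diff_metadata` are hypothetical helper names, not pyproj or PROJ API.

```python
import sqlite3
from pathlib import Path


def read_proj_db_metadata(db_path):
    """Return the key/value pairs of the `metadata` table as a dict.

    Assumes a `metadata(key, value)` table exists in the database.
    """
    # Open read-only so the installed database cannot be modified by accident.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute("SELECT key, value FROM metadata").fetchall()
    finally:
        conn.close()
    return dict(rows)


def diff_metadata(first, second):
    """Return {key: (value_in_first, value_in_second)} for differing keys."""
    keys = sorted(set(first) | set(second))
    return {
        key: (first.get(key), second.get(key))
        for key in keys
        if first.get(key) != second.get(key)
    }
```

Calling `diff_metadata(read_proj_db_metadata(a), read_proj_db_metadata(b))` on the two `proj.db` paths returns an empty dict when both installations report the same metadata.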
Well when I do `md5sum /path/to/proj.db` with installations from source (both from GitHub and from the RC1 tarball posted on the mailing list) the md5sums are the same on a particular platform but different between them (Debian versus the Ubuntu-based osgeo/proj container). Any ideas what would lead to that? It would have to be some dependency or something available on one platform and not the other.
> It would have to be some dependency or something available on one platform and not the other.

sqlite3 is the only one I can think of that would impact that.
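If the sqlite3 build is the difference, the two files could differ byte for byte while holding the same content, because the file header and page layout depend on the library that wrote them. A rough way to separate the two cases is to hash the SQL dump as well as the raw bytes. This is a sketch under that assumption: `file_digest` and `logical_digest` are hypothetical helper names, and SHA-256 stands in for `md5sum`.

```python
import hashlib
import sqlite3
from pathlib import Path


def file_digest(path):
    """SHA-256 of the raw file bytes (what a checksum tool would compare)."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def logical_digest(path):
    """SHA-256 of the SQL dump: schema and rows, independent of page layout."""
    uri = Path(path).resolve().as_uri() + "?mode=ro"  # read-only connection
    conn = sqlite3.connect(uri, uri=True)
    digest = hashlib.sha256()
    try:
        # iterdump() yields the SQL statements that would recreate the database.
        for statement in conn.iterdump():
            digest.update(statement.encode("utf-8"))
            digest.update(b"\n")
    finally:
        conn.close()
    return digest.hexdigest()
```

Different `file_digest` values with equal `logical_digest` values would point at the SQLite build rather than at the data. The dump follows storage order, so differing logical digests are a reason to look closer, not proof that the contents differ.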
@snowman2 I just did:
rm -r /usr/local/lib/python3.10/dist-packages/pyproj/proj_dir
And now I'm getting the EPSG code in the `crs1` case (from PROJ.4). When does `proj_dir` get created in the pyproj package and when does it use the system version?
Ok so I'm going with a corrupt installation. We've got a workaround for the EPSG roundtrip inequality from Even here: https://github.com/OSGeo/PROJ/pull/3879
If/when that makes it through then the pyresample tests should pass with it. @snowman2 @avalentino I don't think I want to change the `to_epsg` confidence used in pyresample since the general purpose is to have it "magically" replace old PROJ.4 versions people were using that lacked datum information with some level of standard EPSG versions of these projections/CRSes. Thoughts?
> When does proj_dir get created in the pyproj package and when does it use the system version?

This is what is included in the wheel. If you install pyproj with `--no-binary pyproj` (ref), it will compile using the system PROJ and won't include `proj_dir`.
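A quick way to tell which case applies is to check whether the data directory pyproj reports sits inside the package's own `proj_dir`, as it did in the `pyproj -v` output earlier in this thread. A minimal sketch: `is_bundled_data_dir` and `report_pyproj_data_dir` are hypothetical helpers, and the only pyproj call assumed is `pyproj.datadir.get_data_dir()`.

```python
from pathlib import Path


def is_bundled_data_dir(data_dir, package_dir):
    """True if data_dir lies inside <package_dir>/proj_dir (the wheel layout)."""
    bundled_root = Path(package_dir) / "proj_dir"
    data_path = Path(data_dir)
    return data_path == bundled_root or bundled_root in data_path.parents


def report_pyproj_data_dir():
    """Print where pyproj reads PROJ data from (requires pyproj)."""
    import pyproj  # imported lazily so the path check works without pyproj
    import pyproj.datadir

    package_dir = Path(pyproj.__file__).parent
    data_dir = pyproj.datadir.get_data_dir()
    source = "bundled wheel copy" if is_bundled_data_dir(data_dir, package_dir) else "external PROJ"
    print(f"{data_dir} ({source})")
```

A leftover `proj_dir` from an earlier wheel install would show up here as the bundled copy even after pyproj was rebuilt against the system PROJ.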
> I don't think I want to change the to_epsg confidence used in pyresample since the general purpose is to have it "magically" replace old PROJ.4 versions people were using that lacked datum information with some level of standard EPSG versions of these projections/CRSes. Thoughts?

I would be careful when "magically" replacing PROJ strings with EPSG versions. Unintended changes may occur that could result in differences in transformations. This is likely why Even is nervous about the change ref.
Point taken. The main purpose of the code in pyresample is to gently update users' understanding of the PROJ.4 strings they've used for the last 10 to 20 years, and to show them that there is something more accurate and explicit available in at least some kind of standard. By that I mean, many users probably copied PROJ.4 strings from the EPSG site many years ago only because that's what we supported back then. Now that EPSG strings/codes are actually supported we're "helping" the user realize that.
I think I can close this and when things break I'll come back and ask you to save us.
Code Sample, a copy-pastable example if possible
See full context here: https://github.com/pytroll/pyresample/issues/539
Problem description
The above code with PROJ 9.2.x does not produce an EPSG code for `crs1`. In PROJ 9.3, it does. The equality is `False` in both cases...but I'm not sure it should be in the 9.3 case if PROJ 9.3 is able to think of them as equivalent EPSG codes.
Expected Output
Either no EPSG code for `crs1` (the one from the PROJ.4 string) or equality between the two.
NOTE: This has only been tested on PROJ 9.3 by @avalentino on their test system. I do not have 9.3 running locally (yet).
@snowman2 Can you help me file a bug with PROJ with a C example? Or do you not consider this a bug?
Environment Information
pyproj -v
python -m pyproj -v
python -c "import pyproj; pyproj.show_versions()"
python -c "import pyproj; print(pyproj.__version__)"
python -c "import pyproj; print(pyproj.proj_version_str)"
python -c "import pyproj; print(pyproj.datadir.get_data_dir())"
python -c "import sys; print(sys.version.replace('\n', ' '))"
python -c "import platform; print(platform.platform())"
Installation method
Conda environment information (if you installed with conda):
Environment (`conda list`):
Details about `conda` and system (`conda info`):