pyproj4 / pyproj

Python interface to PROJ (cartographic projections and coordinate transformations library)
https://pyproj4.github.io/pyproj
MIT License

CA-bundle environment variable has no effect #1233

Closed: trexfeathers closed this issue 1 year ago

trexfeathers commented 1 year ago

Code Sample, a copy-pastable example if possible

# MAKE SURE YOU CLEAR YOUR LOCAL PROJ CACHE BETWEEN RUNS - OTHERWISE IT WILL
#  ALWAYS WORK AFTER THERE HAS BEEN ONE SUCCESS.

from os import environ

# Still get SSL certificate problem:
# environ["PROJ_CURL_CA_BUNDLE"] = environ["SSL_CERT_PATH"]
#  Same problem when set in the terminal before running Python.
#  Same problem with CURL_CA_BUNDLE / SSL_CERT_FILE .

from pyproj.crs import CRS
from pyproj.transformer import Transformer
from pyproj.network import set_ca_bundle_path

# Allows successful transformation:
# set_ca_bundle_path(environ["SSL_CERT_PATH"])

# Transform between sufficiently different systems that PROJ wants to use
#  online resource.
src_crs = CRS("+proj=tmerc +datum=OSGB36 +lon_0=-2 +lat_0=49 +x_0=400000 +y_0=-100000")
tgt_crs = CRS("+proj=lonlat")
t = Transformer.from_crs(src_crs, tgt_crs)
print(t.transform(0, 0, None, errcheck=True))

Traceback when not using set_ca_bundle_path():

Traceback (most recent call last):
  File "/home/h01/REDACTED/sandbox/2023-01-16_proj_cert.py", line 17, in <module>
    print(t.transform(0, 0, None, errcheck=True))
  File "/tmp/persistent/conda/envs/pyproj_report/lib/python3.10/site-packages/pyproj/transformer.py", line 748, in transform
    self._transformer._transform(
  File "pyproj/_transformer.pyx", line 1146, in pyproj._transformer._Transformer._transform
pyproj.exceptions.ProjError: transform error: Network error when accessing a remote resource: (Internal Proj Error: Cannot open https://cdn.proj.org/uk_os_OSTN15_NTv2_OSGBtoETRS.tif: SSL certificate problem: self signed certificate in certificate chain)

Problem description

Setting PROJ_CURL_CA_BUNDLE does not have the same effect as set_ca_bundle_path(); indeed, it does not seem to have any effect at all.

Our site's security setup makes it necessary to use a local certificate file (located at $SSL_CERT_PATH in the example above) to allow scripts to access online resources. We can successfully point pyproj/PROJ at this certificate file using pyproj.network.set_ca_bundle_path(), but for global user configuration we would rather use the PROJ_CURL_CA_BUNDLE environment variable described in the documentation (both the pyproj and PROJ docs).

Forcing PROJ_NETWORK=OFF does not seem like the correct solution, since the tools are ostensibly there to allow using a custom certificate.

Expected Output

Setting the PROJ_CURL_CA_BUNDLE environment variable should enable Internet access via the referenced certificate file, in the same way that set_ca_bundle_path() does.

Environment Information

pyproj info:
    pyproj: 3.3.0
      PROJ: 8.2.1
  data dir: /tmp/persistent/conda/envs/pyproj_report/share/proj
user_data_dir: /home/h01/REDACTED/.local/share/proj
PROJ DATA (recommended version): 1.8
PROJ Database: 1.2
EPSG Database: v10.041 [2021-12-03]
ESRI Database: ArcMap 12.8 [2021-05-06]
IGNF Database: 3.1.0 [2019-05-24]

System:
    python: 3.10.8 (main, Nov 24 2022, 14:13:03) [GCC 11.2.0]
executable: /tmp/persistent/conda/envs/pyproj_report/bin/python
   machine: Linux-3.10.0-1160.76.1.el7.x86_64-x86_64-with-glibc2.17

Installation method

mamba create -n pyproj_report pyproj

(Also seen in several other environments that contain pyproj).

Conda environment information (if you installed with conda):


Environment (conda list):

```
$ conda list proj
# packages in environment at /tmp/persistent/conda/envs/pyproj_report:
#
# Name                    Version                   Build  Channel
proj                      8.2.1                ha227179_0    conda-main
pyproj                    3.3.0           py310h162314d_0    conda-main
```


Details about conda and system ( conda info ):

```
$ conda info

     active environment : pyproj_report
    active env location : /tmp/persistent/conda/envs/pyproj_report
            shell level : 2
       user config file : /home/h01/REDACTED/.condarc
 populated config files : /etc/conda/condarc
                          /home/h01/REDACTED/.condarc
          conda version : 4.12.0
    conda-build version : not installed
         python version : 3.10.4.final.0
       virtual packages : __linux=3.10.0=0
                          __glibc=2.17=0
                          __unix=0=0
                          __archspec=1=x86_64
       base environment : /opt/conda (read only)
      conda av data dir : /opt/conda/etc/conda
  conda av metadata url : None
           channel URLs : REDACTED
          package cache : /tmp/persistent/conda/pkgs
                          /home/h01/REDACTED/.conda/pkgs
       envs directories : /tmp/persistent/conda/envs
                          /home/h01/REDACTED/.conda/envs
                          REDACTED
                          /opt/conda/envs
               platform : linux-64
             user-agent : conda/4.12.0 requests/2.27.1 CPython/3.10.4 Linux/3.10.0-1160.76.1.el7.x86_64 rhel/7.9 glibc/2.17
                UID:GID : 11934:1000
             netrc file : /home/h01/REDACTED/.netrc
           offline mode : False
```

trexfeathers commented 1 year ago

Thanks for the pyproj package, and thanks for any help you can provide 🙂

snowman2 commented 1 year ago

Side note: You need to set the environment variables before importing pyproj.

The logic is here if you would like to try to debug the issue on your environment.

snowman2 commented 1 year ago

The CA Bundle path is set when importing pyproj here.
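
To make that ordering explicit in a script, one option is a small guard that sets the variable and fails loudly if pyproj has already been imported. This is only a sketch: the helper name `set_ca_bundle_env` and the example path are hypothetical and not part of pyproj.

```python
import os
import sys


def set_ca_bundle_env(bundle_path, module_name="pyproj"):
    """Set PROJ_CURL_CA_BUNDLE, refusing if the module is already imported.

    Hypothetical helper for illustration; it is not part of pyproj.
    """
    if module_name in sys.modules:
        raise RuntimeError(
            f"{module_name} is already imported; "
            "set PROJ_CURL_CA_BUNDLE before the first import"
        )
    os.environ["PROJ_CURL_CA_BUNDLE"] = str(bundle_path)


# Usage (placeholder path): call the helper first, then import pyproj.
# set_ca_bundle_env("/path/to/ca-bundle.crt")
# import pyproj
```

The guard only checks import order; it does not verify that the file exists or that PROJ ends up using it.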

trexfeathers commented 1 year ago

Thanks @snowman2, I've updated the example to remove this ambiguity. As mentioned, the problem also exists when the environment variable is set at the command line.

trexfeathers commented 1 year ago

CORRECTION: after some confusion with IDEs and the PROJ_LIB environment variable, this appears to be limited to specific types of environment we have on-premises. I will close this issue and re-open it if I can give better replication instructions in future.
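
Since an IDE and a terminal can hand the Python process different environments, it may help to print what the interpreter actually sees before debugging further. This is a rough diagnostic sketch using only the standard library. The function name is hypothetical, and the variable list is taken from this thread, plus `PROJ_DATA`, which newer PROJ releases use in place of `PROJ_LIB`.

```python
import os

# Variables mentioned in this thread, plus PROJ_DATA (newer name for PROJ_LIB).
PROJ_ENV_VARS = (
    "PROJ_CURL_CA_BUNDLE",
    "CURL_CA_BUNDLE",
    "SSL_CERT_FILE",
    "PROJ_NETWORK",
    "PROJ_LIB",
    "PROJ_DATA",
)


def report_proj_env(environ=None):
    """Return a dict mapping each variable name to its value, or None if unset."""
    environ = os.environ if environ is None else environ
    return {name: environ.get(name) for name in PROJ_ENV_VARS}


if __name__ == "__main__":
    for name, value in report_proj_env().items():
        print(f"{name} = {value!r}")
```

Running the same script from the IDE and from the shell and comparing the output should show whether the two environments differ.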

brynpickering commented 11 months ago

I am seeing this same problem on a machine that needs a specific CA bundle available to access remote resources.

I have set PROJ_CURL_CA_BUNDLE and CURL_CA_BUNDLE on my machine to point at the bundle, but I get the bug raised in #705 unless I specifically set the bundle using pyproj.network.set_ca_bundle_path("/path/to/ca-bundle.crt").

Failing MWE

from pyproj.network import set_ca_bundle_path, set_network_enabled
from pyproj.transformer import Transformer

set_network_enabled(True)
set_ca_bundle_path()
t = Transformer.from_crs("epsg:27700", "epsg:4326")
print(t.transform(0, 0, None, errcheck=True))

Fails with:

Traceback (most recent call last):
  File "/path/to/script.py", line 7, in <module>
    print(t.transform(0, 0, None, errcheck=True))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/conda/env/lib/python3.11/site-packages/pyproj/transformer.py", line 820, in transform
    return self._transformer._transform_point(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pyproj/_transformer.pyx", line 813, in pyproj._transformer._Transformer._transform_point
pyproj.exceptions.ProjError: transform error: Network error when accessing a remote resource: (Internal Proj Error: Cannot open https://cdn.proj.org/uk_os_OSTN15_NTv2_OSGBtoETRS.tif: SSL certificate problem: unable to get local issuer certificate)

Passing MWE:

from pyproj.network import set_ca_bundle_path, set_network_enabled
from pyproj.transformer import Transformer

set_network_enabled(True)
set_ca_bundle_path("/path/to/ca-bundle.crt")
t = Transformer.from_crs("epsg:27700", "epsg:4326")
print(t.transform(0, 0, None, errcheck=True))

output:

(49.76680723514262, -7.55715980690519)

proj: 9.3.0, pyproj: 3.6.1, python: 3.11.6

brynpickering commented 11 months ago

It looks like the way in which the CA bundle path is initialised in pyproj bulldozes any environment variables pointing to that bundle (see https://github.com/OSGeo/PROJ/issues/3977). The easiest fix would be to remove this line you pointed to previously (@snowman2) as all it is doing is removing all possibility of finding a CA bundle.
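
Until that is resolved, a possible workaround is to read the environment variables yourself and pass the result to `set_ca_bundle_path()` explicitly, as in the passing MWE above. This is a minimal sketch: the helper names are hypothetical, and the lookup order (`PROJ_CURL_CA_BUNDLE`, then `CURL_CA_BUNDLE`, then `SSL_CERT_FILE`) is an assumption for illustration, not confirmed pyproj or PROJ behaviour.

```python
import os

# Assumed precedence for illustration; not confirmed against PROJ or curl.
CA_BUNDLE_ENV_VARS = ("PROJ_CURL_CA_BUNDLE", "CURL_CA_BUNDLE", "SSL_CERT_FILE")


def find_ca_bundle(environ=None):
    """Return the first non-empty CA bundle path found in the environment."""
    environ = os.environ if environ is None else environ
    for name in CA_BUNDLE_ENV_VARS:
        value = environ.get(name)
        if value:
            return value
    return None


def apply_ca_bundle():
    """Pass the environment's CA bundle to pyproj explicitly, if one is set."""
    # Imported lazily so find_ca_bundle() can be used without pyproj installed.
    from pyproj.network import set_ca_bundle_path

    path = find_ca_bundle()
    if path is not None:
        set_ca_bundle_path(path)
    return path
```

Calling `apply_ca_bundle()` once, before creating any `Transformer`, would mirror the passing MWE without hard-coding the path.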