[X] I have checked that this issue has not already been reported.
[X] I have checked that this bug exists on the latest version of HydroMT.
Reproducible Example
Even with a local data catalog and local data, HydroMT still tries to access the predefined catalogs, which are stored online. This may be two errors in one.
This is not a fully reproducible example, but the error should appear when building a wflow model from a shapefile without internet access, using a configuration such as:
setup_basemaps:
  hydrography_fn: merit_hydro # source hydrography data {merit_hydro, merit_hydro_1k}
  basin_index_fn: merit_hydro_index # source of basin index corresponding to hydrography_fn
  upscale_method: ihu # upscaling method for flow direction data, by default 'ihu'
  res: 0.00833 # build the model at a 30 arc sec (~1km) resolution
A local copy of artifact_data or deltares_data should be able to replace the rest.
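Since the example above depends on the machine having no internet access, one way to make it more reproducible is to simulate that state from Python. The sketch below is not part of HydroMT; `no_network` is a hypothetical helper that patches `socket.getaddrinfo` so that every name lookup fails with the same `socket.gaierror` seen in the traceback. It assumes the build is started from the same Python process (for example through the Python API rather than a separate CLI process).

```python
import socket
from contextlib import contextmanager
from unittest import mock


@contextmanager
def no_network():
    """Make every DNS lookup fail, mimicking a machine without internet access.

    Hypothetical helper for reproducing the issue; not a HydroMT function.
    """

    def _fail(*args, **kwargs):
        # Same exception type and errno as reported in the traceback.
        raise socket.gaierror(11004, "getaddrinfo failed (simulated offline)")

    # The patch is undone automatically when the with-block exits.
    with mock.patch("socket.getaddrinfo", _fail):
        yield


if __name__ == "__main__":
    with no_network():
        try:
            socket.getaddrinfo("raw.githubusercontent.com", 443)
        except socket.gaierror as err:
            print(f"lookup failed as intended: {err}")
        # A model build started inside this block should attempt the same
        # registry download and fail in the same way as in the log.
```

Running the model build inside `with no_network():` should then show whether the predefined catalog registry is requested even though only local data sources are configured.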
Current behaviour
2024-09-26 09:28:55,286 - build - log - DEBUG - Writing log messages to new file D:\wflow\Training_11thSep2024_HydroMT\hydromt\wflow_Hikurangi_byXiao\hydromt.log.
2024-09-26 09:28:55,286 - build - log - INFO - HydroMT version: 0.10.0
2024-09-26 09:28:55,287 - build - main - INFO - Building instance of wflow model at D:\wflow\Training_11thSep2024_HydroMT\hydromt\wflow_Hikurangi_byXiao.
2024-09-26 09:28:55,287 - build - main - INFO - User settings:
2024-09-26 09:28:55,333 - build - data_catalog - INFO - Parsing data catalog from ../data/northland_data_extract/data_catalog.yml
2024-09-26 09:28:55,347 - build - model_api - WARNING - Model dir already exists and files might be overwritten: D:\wflow\Training_11thSep2024_HydroMT\hydromt\wflow_Hikurangi_byXiao\staticgeoms.
2024-09-26 09:28:55,356 - build - model_api - WARNING - Model dir already exists and files might be overwritten: D:\wflow\Training_11thSep2024_HydroMT\hydromt\wflow_Hikurangi_byXiao\run_default.
2024-09-26 09:28:55,358 - build - model_api - INFO - Initializing wflow model from hydromt_wflow (v0.6.0).
2024-09-26 09:28:55,358 - build - data_catalog - INFO - Parsing data catalog from C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\hydromt_wflow\data\parameters_data.yml
2024-09-26 09:28:55,369 - build - model_api - DEBUG - Setting model config options.
2024-09-26 09:28:55,372 - build - model_api - DEBUG - Default config read from C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\hydromt_wflow\data\wflow\wflow_sbm.toml
2024-09-26 09:28:55,372 - build - model_api - INFO - setup_basemaps.region: {'basin': 'HikurangiScope_reproject.shp'}
2024-09-26 09:28:55,372 - build - model_api - INFO - setup_basemaps.hydrography_fn: merit_hydro
2024-09-26 09:28:55,372 - build - model_api - INFO - setup_basemaps.basin_index_fn: merit_hydro_index
2024-09-26 09:28:55,372 - build - model_api - INFO - setup_basemaps.res: 0.0041666
2024-09-26 09:28:55,372 - build - model_api - INFO - setup_basemaps.upscale_method: ihu
2024-09-26 09:28:55,372 - build - wflow - INFO - Preparing base hydrography basemaps.
2024-09-26 09:28:55,375 - build - rasterdataset - INFO - Reading merit_hydro raster data from D:\wflow\Training_11thSep2024_HydroMT\data\northland_data_extract\merit_hydro\{variable}.tif
2024-09-26 09:28:55,531 - build - main - ERROR - HTTPSConnectionPool(host='raw.githubusercontent.com', port=443): Max retries exceeded with url: /Deltares/hydromt/main/data/catalogs/artifact_data/registry.txt (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x000002CDCB2A4650>: Failed to resolve 'raw.githubusercontent.com' ([Errno 11004] getaddrinfo failed)"))
Traceback (most recent call last):
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\urllib3\connection.py", line 203, in _new_conn
sock = connection.create_connection(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\urllib3\util\connection.py", line 60, in create_connection
for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\socket.py", line 962, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
socket.gaierror: [Errno 11004] getaddrinfo failed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\urllib3\connectionpool.py", line 790, in urlopen
response = self._make_request(
^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\urllib3\connectionpool.py", line 491, in _make_request
raise new_e
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\urllib3\connectionpool.py", line 467, in _make_request
self._validate_conn(conn)
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\urllib3\connectionpool.py", line 1096, in _validate_conn
conn.connect()
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\urllib3\connection.py", line 611, in connect
self.sock = sock = self._new_conn()
^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\urllib3\connection.py", line 210, in _new_conn
raise NameResolutionError(self.host, self, e) from e
urllib3.exceptions.NameResolutionError: <urllib3.connection.HTTPSConnection object at 0x000002CDCB2A4650>: Failed to resolve 'raw.githubusercontent.com' ([Errno 11004] getaddrinfo failed)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\requests\adapters.py", line 486, in send
resp = conn.urlopen(
^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\urllib3\connectionpool.py", line 844, in urlopen
retries = retries.increment(
^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\urllib3\util\retry.py", line 515, in increment
raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='raw.githubusercontent.com', port=443): Max retries exceeded with url: /Deltares/hydromt/main/data/catalogs/artifact_data/registry.txt (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x000002CDCB2A4650>: Failed to resolve 'raw.githubusercontent.com' ([Errno 11004] getaddrinfo failed)"))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\hydromt\cli\main.py", line 224, in build
mod.build(region, opt=opt)
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\hydromt\models\model_api.py", line 246, in build
self._run_log_method(method, **kwargs)
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\hydromt\models\model_api.py", line 188, in _run_log_method
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\hydromt_wflow\wflow.py", line 232, in setup_basemaps
kind, region = hydromt.workflows.parse_region(region, logger=self.logger)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\hydromt\workflows\basin_mask.py", line 168, in parse_region
kwarg = _parse_region_value(value0, data_catalog=data_catalog)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\hydromt\workflows\basin_mask.py", line 206, in _parse_region_value
geom = data_catalog.get_geodataframe(value)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\hydromt\data_catalog.py", line 1375, in get_geodataframe
if str(data_like) in self.sources:
^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\hydromt\data_catalog.py", line 149, in sources
self.from_predefined_catalogs(self._fallback_lib)
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\hydromt\data_catalog.py", line 639, in from_predefined_catalogs
catalog_path = self.predefined_catalogs[name].get_catalog_file(version)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\hydromt\predefined_catalog.py", line 170, in get_catalog_file
version = self.versions[-1]
^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\hydromt\predefined_catalog.py", line 108, in versions
self._versions = self._get_versions()
^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\hydromt\predefined_catalog.py", line 122, in _get_versions
keys = self.registry.keys()
^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\hydromt\predefined_catalog.py", line 94, in registry
return self.pooch.registry
^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\hydromt\predefined_catalog.py", line 101, in pooch
self._load_registry_file()
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\hydromt\predefined_catalog.py", line 143, in _load_registry_file
_copyfile(f"{self.base_url}/registry.txt", registry_path)
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\hydromt\data_adapter\caching.py", line 37, in _copyfile
with requests.get(src, stream=True) as r:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\requests\api.py", line 73, in get
return request("get", url, params=params, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\requests\api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\requests\sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\requests\sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\hydromt-wflow\Lib\site-packages\requests\adapters.py", line 519, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='raw.githubusercontent.com', port=443): Max retries exceeded with url: /Deltares/hydromt/main/data/catalogs/artifact_data/registry.txt (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x000002CDCB2A4650>: Failed to resolve 'raw.githubusercontent.com' ([Errno 11004] getaddrinfo failed)"))
Desired behaviour
If local files are used there should be no need to access the internet.
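One possible shape for this behaviour is a registry loader that tries the download but falls back to a previously cached copy when the network is unavailable. The following is a minimal sketch under stated assumptions, not HydroMT's actual implementation: `load_registry`, `_download`, and the `fetch` parameter are hypothetical names, and the cache location is supplied by the caller.

```python
from pathlib import Path
from typing import Callable
from urllib.request import urlopen


def _download(url: str) -> bytes:
    """Fetch the raw bytes behind a URL (default fetcher for this sketch)."""
    with urlopen(url, timeout=10) as resp:
        return resp.read()


def load_registry(
    url: str,
    cache_path: Path,
    fetch: Callable[[str], bytes] = _download,
) -> str:
    """Return the registry text, preferring a fresh download.

    Hypothetical helper: if the download fails, a cached copy is used
    instead, and an error is raised only when no cached copy exists.
    """
    try:
        data = fetch(url)
    except OSError as err:
        # urllib.error.URLError, socket.gaierror and socket timeouts
        # are all subclasses of OSError.
        if cache_path.exists():
            return cache_path.read_text(encoding="utf-8")
        raise RuntimeError(
            f"Cannot download {url} and no cached copy at {cache_path}"
        ) from err

    # Download succeeded: refresh the cache for later offline use.
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_bytes(data)
    return data.decode("utf-8")
```

With a fallback like this, a user who has fetched the registry once could keep working offline. A stricter variant would skip the download entirely when every requested source resolves from the local catalog.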
Additional context
The user was running HydroMT from China, where access to GitHub is not guaranteed. It could be that having already downloaded and cached the predefined catalogs avoids the issue, but it is worth checking that HydroMT works when fully offline.