OCHA-DAP / ocha-anticipy

Python package to support the development of anticipatory action frameworks
https://github.com/OCHA-DAP/ocha-anticipy
GNU General Public License v3.0

IRI `.download()` error #209

Open zackarno opened 1 year ago

zackarno commented 1 year ago

I'm probably doing something wrong here, but I'm not sure why I can't get the example from the documentation to work when I change the country_config iso3 to "som".

Here's the code:

>>> from ochanticipy import create_country_config, CodAB, GeoBoundingBox, \
...                       IriForecastDominant, IriForecastProb
>>> 
>>> country_config = create_country_config(iso3="som")
>>> codab = CodAB(country_config=country_config)
>>> codab.download()
PosixPath('/Users/zackarno/Library/CloudStorage/GoogleDrive-Zachary.arno@humdata.org/Shared drives/Predictive Analytics/CERF Anticipatory Action/General - All AA projects/Data/public/raw/som/cod_ab')
>>> admin0 = codab.load(admin_level=0)
>>> geo_bounding_box = GeoBoundingBox.from_shape(admin0)
>>> iri_prob = IriForecastProb(country_config=country_config, geo_bounding_box=geo_bounding_box)

The error comes from here:

>>> iri_prob.download()
Traceback (most recent call last):
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 467, in _make_request
    self._validate_conn(conn)
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 1092, in _validate_conn
    conn.connect()
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/urllib3/connection.py", line 642, in connect
    sock_and_verified = _ssl_wrap_socket_and_match_hostname(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/urllib3/connection.py", line 783, in _ssl_wrap_socket_and_match_hostname
    ssl_sock = ssl_wrap_socket(
               ^^^^^^^^^^^^^^^^
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/urllib3/util/ssl_.py", line 469, in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls, server_hostname)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/urllib3/util/ssl_.py", line 513, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zackarno/.pyenv/versions/3.11.2/lib/python3.11/ssl.py", line 517, in wrap_socket
    return self.sslsocket_class._create(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zackarno/.pyenv/versions/3.11.2/lib/python3.11/ssl.py", line 1075, in _create
    self.do_handshake()
  File "/Users/zackarno/.pyenv/versions/3.11.2/lib/python3.11/ssl.py", line 1346, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: SSLV3_ALERT_HANDSHAKE_FAILURE] sslv3 alert handshake failure (_ssl.c:992)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 790, in urlopen
    response = self._make_request(
               ^^^^^^^^^^^^^^^^^^^
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 491, in _make_request
    raise new_e
urllib3.exceptions.SSLError: [SSL: SSLV3_ALERT_HANDSHAKE_FAILURE] sslv3 alert handshake failure (_ssl.c:992)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 844, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/urllib3/util/retry.py", line 515, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='iridl.ldeo.columbia.edu', port=443): Max retries exceeded with url: /SOURCES/.IRI/.FD/.NMME_Seasonal_Forecast/.Precipitation_ELR/.prob/X/%2840.0%29%2852.0%29RANGEEDGES/Y/%2812.0%29%28-2.0%29RANGEEDGES/data.nc (Caused by SSLError(SSLError(1, '[SSL: SSLV3_ALERT_HANDSHAKE_FAILURE] sslv3 alert handshake failure (_ssl.c:992)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/ochanticipy/datasources/iri/iri_seasonal_forecast.py", line 91, in download
    return self._download(
           ^^^^^^^^^^^^^^^
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/ochanticipy/utils/check_file_existence.py", line 80, in check_file_existence
    return wrapped(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/ochanticipy/datasources/iri/iri_seasonal_forecast.py", line 193, in _download
    response = requests.get(
               ^^^^^^^^^^^^^
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/requests/api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/requests/adapters.py", line 517, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='iridl.ldeo.columbia.edu', port=443): Max retries exceeded with url: /SOURCES/.IRI/.FD/.NMME_Seasonal_Forecast/.Precipitation_ELR/.prob/X/%2840.0%29%2852.0%29RANGEEDGES/Y/%2812.0%29%28-2.0%29RANGEEDGES/data.nc (Caused by SSLError(SSLError(1, '[SSL: SSLV3_ALERT_HANDSHAKE_FAILURE] sslv3 alert handshake failure (_ssl.c:992)')))

I get the same error when trying any other country config (HRP) where we don't already have the data downloaded. Oddly, I was able to download for Nicaragua, El Salvador, Honduras, and Guatemala all together with the mixed Python-R workflow below (but this was not reproducible for SOM).


# R code
library(sf)
library(rnaturalearth)
library(tidyverse)
library(reticulate)

aoi_countries <- ne_countries(country = c("Nicaragua",
                                          "Honduras",
                                          "Guatemala",
                                          "El Salvador")) %>% 
  st_as_sf() %>% 
  select(
    contains("admin"),
    iso_a3
  )

aoi_bbox <-  st_bbox(aoi_countries) %>% 
  st_as_sfc()

# make bbox available to python
aoi_bbox_py <- reticulate::r_to_py(aoi_bbox)

# python code
import pandas as pd
import geopandas as gpd
from shapely.geometry import Polygon
from ochanticipy import GeoBoundingBox
from ochanticipy import IriForecastDominant
from ochanticipy import create_custom_country_config
from src import utils, constants # these are the source functions from the bfa repo
from ochanticipy.utils import raster
from ochanticipy import IriForecastProb

# read in bbox made w/ R/reticulate
coords_array = r.aoi_bbox_py
coords_array_unnested = coords_array[0][0]
aoi_poly = Polygon(coords_array_unnested)

gdf_aoi_poly = gpd.GeoDataFrame({'geometry':[aoi_poly]})

# python

# this yaml file is basically just blank - but at the top says "iso3=lac"

fp = "lac.yaml"
country_config = create_custom_country_config(fp)

geo_bounding_box=GeoBoundingBox.from_shape(gdf_aoi_poly)

iri_dominant = IriForecastDominant(country_config=country_config,
                                   geo_bounding_box=geo_bounding_box)
iri_dominant.download()
iri_dominant.process()

iri_prob = IriForecastProb(country_config=country_config,
                           geo_bounding_box=geo_bounding_box)

iri_prob.download()
iri_prob.process()
iri_prob_data = iri_prob.load()

turnerm commented 1 year ago

Ah shoot, good find Zach. Interesting that it works with other coordinates. Can you share the URL that it generates for LAC? Or at least the geo bounding box coordinates. I tried to reproduce a successful download but I'm unable to.

My initial thought was that it's something with the authentication (like maybe our key is expired), because I get the same error when I use a bogus key. But if it works for other locations then maybe that points to something on their end?
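
For comparing requests, the bounding box that the failing URL in the traceback encodes can be read back with the standard library. This is only a sketch: the helper name `range_edges` is hypothetical and not part of ochanticipy, and the path is copied from the traceback above.

```python
import re
from urllib.parse import unquote

# Path copied from the traceback above (the request built for "som").
FAILING_PATH = (
    "/SOURCES/.IRI/.FD/.NMME_Seasonal_Forecast/.Precipitation_ELR/.prob"
    "/X/%2840.0%29%2852.0%29RANGEEDGES"
    "/Y/%2812.0%29%28-2.0%29RANGEEDGES/data.nc"
)


def range_edges(path: str) -> dict[str, tuple[float, float]]:
    """Return the two RANGEEDGES values given for each axis in the path."""
    decoded = unquote(path)  # "%28" -> "(", "%29" -> ")"
    pattern = r"/([XY])/\(([-\d.]+)\)\(([-\d.]+)\)RANGEEDGES"
    return {
        axis: (float(first), float(second))
        for axis, first, second in re.findall(pattern, decoded)
    }


print(range_edges(FAILING_PATH))
# {'X': (40.0, 52.0), 'Y': (12.0, -2.0)}
```

Running the same helper on the URL generated for LAC would show whether the two requests differ in anything other than the coordinates.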

zackarno commented 1 year ago

So you do get the same error when running the example from the documentation but with "som"? I also tried with "bdi" and had the same issue.

Not sure about the state of the authentication key, but I agree it would be odd that it seems to work w/ the modified workflow (supplying a bbox of Central America and a dummy yaml).

Here is what the bounding box looks like when printed in R:

     xmin      ymin      xmax      ymax 
-92.22925  10.72684 -83.14722  17.81933 

and here it is as an array for python:

array([[-92.22924862,  10.7268391 ],
       [-83.147219  ,  10.7268391 ],
       [-83.147219  ,  17.81932608],
       [-92.22924862,  17.81932608],
       [-92.22924862,  10.7268391 ]])

I'd think you can just plug it into the workflow above like this?

import numpy as np
aoi_poly = Polygon(np.array([[-92.22924862,  10.7268391 ],
       [-83.147219  ,  10.7268391 ],
       [-83.147219  ,  17.81932608],
       [-92.22924862,  17.81932608],
       [-92.22924862,  10.7268391 ]]))

gdf_aoi_poly = gpd.GeoDataFrame({'geometry':[aoi_poly]})

This does work for me. I think that unless you remove the LAC IRI forecast .nc file from the AA_DATA_DIR, it won't actually download again? But if I remove it, it does work and re-downloads.
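
As a rough sketch, the corner array above reduces to the four bounds that a bounding box needs. The helper `bounds_from_corners` is hypothetical, and rounding to one decimal place is an assumption made for readability; `GeoBoundingBox.from_shape` may round differently.

```python
# Corner coordinates as (lon, lat), copied from the array above.
corners = [
    (-92.22924862, 10.7268391),
    (-83.147219, 10.7268391),
    (-83.147219, 17.81932608),
    (-92.22924862, 17.81932608),
    (-92.22924862, 10.7268391),
]


def bounds_from_corners(points, ndigits=1):
    """Return the min/max latitude and longitude of the points, rounded."""
    lons = [lon for lon, _ in points]
    lats = [lat for _, lat in points]
    return {
        "lat_max": round(max(lats), ndigits),
        "lat_min": round(min(lats), ndigits),
        "lon_max": round(max(lons), ndigits),
        "lon_min": round(min(lons), ndigits),
    }


print(bounds_from_corners(corners))
# {'lat_max': 17.8, 'lat_min': 10.7, 'lon_max': -83.1, 'lon_min': -92.2}
```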

turnerm commented 1 year ago

@zackarno does this work for you? It does NOT for me:

import os

from ochanticipy import (
    create_country_config,
    GeoBoundingBox,
    IriForecastProb,
)

os.environ["OAP_DATA_DIR"] = "/tmp"

# The country used shouldn't matter because nothing in the 
# config file is used for IRI
country_config = create_country_config(iso3="som")
geo_bounding_box = GeoBoundingBox(
    lat_max=17.8, lat_min=10.7, lon_max=-83.1, lon_min=-92.2
)
iri_prob = IriForecastProb(
    country_config=country_config, geo_bounding_box=geo_bounding_box
)
iri_prob.download()

zackarno commented 1 year ago

that worked for me!

turnerm commented 1 year ago

Can you try with clobber=True in download()? Even better, try this:

import os
from pathlib import Path
import requests

from ochanticipy import (
    create_country_config,
    GeoBoundingBox,
    IriForecastProb,
)

os.environ["OAP_DATA_DIR"] = str(Path("/tmp"))

country_config = create_country_config(iso3="som")
geo_bounding_box = GeoBoundingBox(
    lat_max=17.8, lat_min=10.7, lon_max=-83.1, lon_min=-92.2
)
iri_prob = IriForecastProb(
    country_config=country_config, geo_bounding_box=geo_bounding_box
)
url = iri_prob._get_url()

response = requests.get(
    url,
    # have to authenticate by using a cookie
    cookies={"__dlauth_id": os.getenv("IRI_AUTH")},
)

turnerm commented 1 year ago

In the end it's a Python version issue. It works for versions <3.10:

https://stackoverflow.com/a/73230534

Will try to fix it when I can, but in the meantime a (less than ideal) workaround is to use Python 3.9 for downloading IRI.
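
A quick way to tell whether a given environment falls on the affected side of that version boundary is to check the interpreter and the OpenSSL build it links against. This is a minimal sketch: the function name is hypothetical, and the 3.10 cutoff is taken from this thread rather than from any independent test.

```python
import ssl
import sys


def iri_download_likely_affected(version_info=sys.version_info) -> bool:
    """True for Python 3.10+, where this thread reports the handshake failure."""
    return tuple(version_info[:2]) >= (3, 10)


print(f"Python {sys.version_info.major}.{sys.version_info.minor}")
print(ssl.OPENSSL_VERSION)
if iri_download_likely_affected():
    print("IRI downloads may fail here; consider a Python 3.9 environment.")
```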

caldwellst commented 1 year ago

Note that I was able to successfully run this on 3.11.2 recently, just when I was blindly running some code for a review: exploration/iri.md.

t-downing commented 1 year ago

> Note that I was able to successfully run this on 3.11.2 recently, just when I was blindly running some code for a review: exploration/iri.md.

Did it actually download though? Because all the data would've already been there from when I ran it. When actually downloading with clobber=True, it didn't work for me with 3.11.4 so I just made a separate 3.9 environment just for downloading.

caldwellst commented 1 year ago

Ah yeah, that's probably what happened, forgot clobber = False.
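
The masking effect described here can be illustrated with a small stand-alone sketch. It is not the actual `check_file_existence` implementation from ochanticipy: the decorator `skip_if_exists` and the failing `download` function below are hypothetical stand-ins that only mimic the behaviour discussed in this thread.

```python
import functools
import tempfile
from pathlib import Path


def skip_if_exists(download_func):
    """Hypothetical stand-in for a file-existence check around a download."""

    @functools.wraps(download_func)
    def wrapper(filepath: Path, clobber: bool = False):
        if filepath.exists() and not clobber:
            # A cached file is present, so the network is never touched.
            return filepath
        return download_func(filepath)

    return wrapper


@skip_if_exists
def download(filepath: Path) -> Path:
    # Stand-in for the real request, which fails during the SSL handshake.
    raise ConnectionError("simulated SSL handshake failure")


with tempfile.TemporaryDirectory() as tmp:
    cached = Path(tmp) / "forecast.nc"
    cached.touch()  # pretend someone already downloaded the data
    download(cached)  # returns the cached path without any error
    try:
        download(cached, clobber=True)
    except ConnectionError as err:
        print(f"clobber=True exposes the failure: {err}")
```

With a file already in place, the first call looks like a successful download on any Python version, which is why forcing a re-download is needed to reproduce the error.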

t-downing commented 1 year ago

Note that this also happens with CHIRPS, which makes sense because that data also comes from IRI.
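
Since both downloads hit the same server, one possible shared workaround on Python 3.10+ is to connect with a broader cipher list. Python 3.10 tightened the `ssl` defaults (ciphers without forward secrecy or with a SHA-1 MAC are disabled by default), so the sketch below assumes that the server only offers ciphers outside the new default list. That assumption is untested here, the function names are hypothetical, and the authentication cookie used elsewhere in this thread is omitted.

```python
import ssl
import urllib.request


def make_legacy_cipher_context() -> ssl.SSLContext:
    """Verified default context, but with OpenSSL's broader default cipher list."""
    context = ssl.create_default_context()  # certificate checks stay enabled
    # Assumption: the server needs a cipher that Python 3.10+ no longer
    # enables by default. "DEFAULT" is OpenSSL's own default cipher string.
    context.set_ciphers("DEFAULT")
    return context


def fetch(url: str, timeout: float = 30.0) -> bytes:
    """Download url with the relaxed context (not called here: needs network)."""
    context = make_legacy_cipher_context()
    with urllib.request.urlopen(url, context=context, timeout=timeout) as resp:
        return resp.read()
```

Widening the cipher list weakens the connection's security guarantees, so this would be a stopgap rather than a fix, and it keeps certificate and hostname verification switched on.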