OCHA-DAP / ocha-anticipy

Python package to support the development of anticipatory action frameworks
https://github.com/OCHA-DAP/ocha-anticipy
GNU General Public License v3.0

IRI `.download()` error #209

Open zackarno opened 12 months ago

zackarno commented 12 months ago

I'm probably doing something wrong here, but I'm not sure why I can't get the example from the documentation to work when I change the country_config iso3 to "som".

Here's the code:

>>> from ochanticipy import create_country_config, CodAB, GeoBoundingBox, \
...                       IriForecastDominant, IriForecastProb
>>> 
>>> country_config = create_country_config(iso3="som")
>>> codab = CodAB(country_config=country_config)
>>> codab.download()
PosixPath('/Users/zackarno/Library/CloudStorage/GoogleDrive-Zachary.arno@humdata.org/Shared drives/Predictive Analytics/CERF Anticipatory Action/General - All AA projects/Data/public/raw/som/cod_ab')
>>> admin0 = codab.load(admin_level=0)
>>> geo_bounding_box = GeoBoundingBox.from_shape(admin0)
>>> iri_prob = IriForecastProb(country_config=country_config, geo_bounding_box=geo_bounding_box)

The error comes from here:

>>> iri_prob.download()
Traceback (most recent call last):
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 467, in _make_request
    self._validate_conn(conn)
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 1092, in _validate_conn
    conn.connect()
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/urllib3/connection.py", line 642, in connect
    sock_and_verified = _ssl_wrap_socket_and_match_hostname(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/urllib3/connection.py", line 783, in _ssl_wrap_socket_and_match_hostname
    ssl_sock = ssl_wrap_socket(
               ^^^^^^^^^^^^^^^^
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/urllib3/util/ssl_.py", line 469, in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls, server_hostname)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/urllib3/util/ssl_.py", line 513, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zackarno/.pyenv/versions/3.11.2/lib/python3.11/ssl.py", line 517, in wrap_socket
    return self.sslsocket_class._create(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zackarno/.pyenv/versions/3.11.2/lib/python3.11/ssl.py", line 1075, in _create
    self.do_handshake()
  File "/Users/zackarno/.pyenv/versions/3.11.2/lib/python3.11/ssl.py", line 1346, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: SSLV3_ALERT_HANDSHAKE_FAILURE] sslv3 alert handshake failure (_ssl.c:992)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 790, in urlopen
    response = self._make_request(
               ^^^^^^^^^^^^^^^^^^^
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 491, in _make_request
    raise new_e
urllib3.exceptions.SSLError: [SSL: SSLV3_ALERT_HANDSHAKE_FAILURE] sslv3 alert handshake failure (_ssl.c:992)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 844, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/urllib3/util/retry.py", line 515, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='iridl.ldeo.columbia.edu', port=443): Max retries exceeded with url: /SOURCES/.IRI/.FD/.NMME_Seasonal_Forecast/.Precipitation_ELR/.prob/X/%2840.0%29%2852.0%29RANGEEDGES/Y/%2812.0%29%28-2.0%29RANGEEDGES/data.nc (Caused by SSLError(SSLError(1, '[SSL: SSLV3_ALERT_HANDSHAKE_FAILURE] sslv3 alert handshake failure (_ssl.c:992)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/ochanticipy/datasources/iri/iri_seasonal_forecast.py", line 91, in download
    return self._download(
           ^^^^^^^^^^^^^^^
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/ochanticipy/utils/check_file_existence.py", line 80, in check_file_existence
    return wrapped(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/ochanticipy/datasources/iri/iri_seasonal_forecast.py", line 193, in _download
    response = requests.get(
               ^^^^^^^^^^^^^
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/requests/api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zackarno/Documents/CHD/repos/ds-som-2023-risk-analysis-support/.venv/lib/python3.11/site-packages/requests/adapters.py", line 517, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='iridl.ldeo.columbia.edu', port=443): Max retries exceeded with url: /SOURCES/.IRI/.FD/.NMME_Seasonal_Forecast/.Precipitation_ELR/.prob/X/%2840.0%29%2852.0%29RANGEEDGES/Y/%2812.0%29%28-2.0%29RANGEEDGES/data.nc (Caused by SSLError(SSLError(1, '[SSL: SSLV3_ALERT_HANDSHAKE_FAILURE] sslv3 alert handshake failure (_ssl.c:992)')))

I get the same error when trying any other country config (HRP) where we don't already have the data downloaded. Oddly, I was able to download for Nicaragua, El Salvador, Honduras, and Guatemala all together with the mixed Python-R workflow below (but this was not reproducible for SOM).


# R code
library(sf)
library(rnaturalearth)
library(tidyverse)
library(reticulate)

aoi_countries <- ne_countries(country = c("Nicaragua",
                                          "Honduras",
                                          "Guatemala",
                                          "El Salvador")) %>% 
  st_as_sf() %>% 
  select(
    contains("admin"),
    iso_a3
  )

aoi_bbox <-  st_bbox(aoi_countries) %>% 
  st_as_sfc()

# make bbox available to python
aoi_bbox_py <- reticulate::r_to_py(aoi_bbox)
# python code
import pandas as pd
import geopandas as gpd
from shapely.geometry import Polygon
from ochanticipy import GeoBoundingBox
from ochanticipy import IriForecastDominant
from ochanticipy import create_custom_country_config
from src import utils, constants # these are the source functions from the bfa repo
from ochanticipy.utils import raster
from ochanticipy import IriForecastProb

# read in bbox made w/ R/reticulate
coords_array = r.aoi_bbox_py
coords_array_unnested = coords_array[0][0]
aoi_poly = Polygon(coords_array_unnested)

gdf_aoi_poly = gpd.GeoDataFrame({'geometry':[aoi_poly]})

# python

# this yaml file is basically just blank - but at the top says "iso3=lac"

fp = "lac.yaml"
country_config = create_custom_country_config(fp)

geo_bounding_box=GeoBoundingBox.from_shape(gdf_aoi_poly)

iri_dominant = IriForecastDominant(country_config=country_config,
                                   geo_bounding_box=geo_bounding_box)
iri_dominant.download()
iri_dominant.process()

iri_prob = IriForecastProb(country_config=country_config,
                           geo_bounding_box=geo_bounding_box)

iri_prob.download()
iri_prob.process()
iri_prob_data = iri_prob.load()

turnerm commented 12 months ago

Ah shoot, good find Zach. Interesting that it works with other coordinates. Can you share the URL that it generates for LAC? Or at least the geo bounding box coordinates. I tried to reproduce a successful download but I'm unable to.
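
For comparison, the bounding box of the failing SOM request can be read straight from the traceback above, because the path percent-encodes the X (longitude) and Y (latitude) ranges. A minimal sketch using only the standard library; the path string is copied from the error message and nothing here touches the network:

```python
from urllib.parse import unquote

# Path copied from the MaxRetryError message in the traceback.
path = (
    "/SOURCES/.IRI/.FD/.NMME_Seasonal_Forecast/.Precipitation_ELR/.prob"
    "/X/%2840.0%29%2852.0%29RANGEEDGES"
    "/Y/%2812.0%29%28-2.0%29RANGEEDGES/data.nc"
)

# %28 and %29 are the percent-encoded forms of "(" and ")".
decoded = unquote(path)
print(decoded)
```

Decoded, the X range is (40.0)(52.0) and the Y range is (12.0)(-2.0), so the same kind of comparison should be possible for the LAC URL once it is shared.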

My initial thought was that it's something with the authentication (like maybe our key is expired), because I get the same error when I use a bogus key. But if it works for other locations then maybe that points to something on their end?

zackarno commented 12 months ago

So you do get the same error when running the example from the documentation but with "som"? I also tried with "bdi" and had the same issue.

Not sure about the state of the authentication key, but I agree it would be odd that it seems to work w/ the modified workflow (supplying a bbox of Central America and a dummy yaml).

Here is what the bounding box looks like when printed in R:

     xmin      ymin      xmax      ymax 
-92.22925  10.72684 -83.14722  17.81933 

and here it is as an array for python:

array([[-92.22924862,  10.7268391 ],
       [-83.147219  ,  10.7268391 ],
       [-83.147219  ,  17.81932608],
       [-92.22924862,  17.81932608],
       [-92.22924862,  10.7268391 ]])
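
The array is a closed ring of (lon, lat) pairs, so the four bounds can be read off as the minimum and maximum of each column. A small sketch in plain Python, with no ochanticipy calls; the variable names are only illustrative:

```python
# Corner coordinates as (lon, lat) pairs, copied from the array above.
ring = [
    (-92.22924862, 10.7268391),
    (-83.147219, 10.7268391),
    (-83.147219, 17.81932608),
    (-92.22924862, 17.81932608),
    (-92.22924862, 10.7268391),
]

lons = [lon for lon, _ in ring]
lats = [lat for _, lat in ring]

# Same four numbers as xmin/xmax/ymin/ymax in the R printout.
bounds = {
    "lon_min": min(lons),
    "lon_max": max(lons),
    "lat_min": min(lats),
    "lat_max": max(lats),
}
print(bounds)
```

Both longitudes are negative, which is worth checking whenever the bounds are typed in by hand.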

I'd think you can just plug it into the workflow above like this?

import numpy as np
aoi_poly = Polygon(np.array([[-92.22924862,  10.7268391 ],
       [-83.147219  ,  10.7268391 ],
       [-83.147219  ,  17.81932608],
       [-92.22924862,  17.81932608],
       [-92.22924862,  10.7268391 ]]))

gdf_aoi_poly = gpd.GeoDataFrame({'geometry':[aoi_poly]})

This does work for me. I think that unless you remove the LAC IRI forecast .nc file from the AA_DATA_DIR, it won't actually download again. But if I remove it, it does work and re-downloads.
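
That caching behaviour can be illustrated with a small stand-alone sketch. It is not ochanticipy's implementation: `download_if_missing` and `fake_fetch` are hypothetical names, and the only assumption taken from this thread is that an existing file is reused unless it is removed or overwriting is requested.

```python
import tempfile
from pathlib import Path
from typing import Callable


def download_if_missing(
    filepath: Path, fetch: Callable[[], bytes], clobber: bool = False
) -> bool:
    """Write fetch() to filepath; return True only if fetch was called."""
    if filepath.exists() and not clobber:
        # Cached copy found: no request is made at all.
        return False
    filepath.write_bytes(fetch())
    return True


calls = []


def fake_fetch() -> bytes:
    # Stand-in for the real HTTP request; records each invocation.
    calls.append(1)
    return b"netcdf bytes"


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "forecast.nc"
    first = download_if_missing(target, fake_fetch)
    second = download_if_missing(target, fake_fetch)
    forced = download_if_missing(target, fake_fetch, clobber=True)

print(first, second, forced, len(calls))
```

The second call returns without contacting the server, so a run against an already populated data directory can look successful even when the download itself would fail.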

turnerm commented 12 months ago

@zackarno does this work for you? It does NOT for me:

import os

from ochanticipy import (
    create_country_config,
    GeoBoundingBox,
    IriForecastProb,
)

os.environ["OAP_DATA_DIR"] = "/tmp"

# The country used shouldn't matter because nothing in the 
# config file is used for IRI
country_config = create_country_config(iso3="som")
geo_bounding_box = GeoBoundingBox(
    lat_max=17.8, lat_min=10.7, lon_max=-83.1, lon_min=-92.2
)
iri_prob = IriForecastProb(
    country_config=country_config, geo_bounding_box=geo_bounding_box
)
iri_prob.download()

zackarno commented 12 months ago

that worked for me!

turnerm commented 12 months ago

Can you try with clobber=True in download(), but even better try this:

import os
from pathlib import Path
import requests

from ochanticipy import (
    create_country_config,
    GeoBoundingBox,
    IriForecastProb,
)

os.environ["OAP_DATA_DIR"] = str(Path("/tmp"))

country_config = create_country_config(iso3="som")
geo_bounding_box = GeoBoundingBox(
    lat_max=17.8, lat_min=10.7, lon_max=-83.1, lon_min=-92.2
)
iri_prob = IriForecastProb(
    country_config=country_config, geo_bounding_box=geo_bounding_box
)
url = iri_prob._get_url()

response = requests.get(
    url,
    # have to authenticate by using a cookie
    cookies={"__dlauth_id": os.getenv("IRI_AUTH")},
)

turnerm commented 12 months ago

In the end it's a Python version issue. It works for versions <3.10:

https://stackoverflow.com/a/73230534

Will try to fix it when I can, but in the meantime a (less than ideal) workaround is to use Python 3.9 for downloading IRI.
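
One possible direction for a fix, sketched under assumptions: Python 3.10 tightened the `ssl` module's defaults (a stricter default cipher list and a TLS 1.2 minimum), so a server that only offers older cipher suites can reject the handshake. If that is what happens here, which this thread does not confirm, a context that uses OpenSSL's own default cipher list might connect. The sketch uses only the standard library rather than `requests`; `build_relaxed_context` and `fetch` are hypothetical names, the `__dlauth_id` cookie name is taken from the snippet above, and whether this is sufficient for the IRI server is untested.

```python
import ssl
import urllib.request


def build_relaxed_context() -> ssl.SSLContext:
    """Verified default context, but with OpenSSL's default cipher list."""
    context = ssl.create_default_context()
    # Assumption: the failure comes from Python's stricter cipher
    # selection. "DEFAULT" selects OpenSSL's own default cipher list.
    context.set_ciphers("DEFAULT")
    return context


def fetch(url: str, auth_cookie: str) -> bytes:
    """Download url with the relaxed context (not called in this sketch)."""
    request = urllib.request.Request(
        url, headers={"Cookie": f"__dlauth_id={auth_cookie}"}
    )
    with urllib.request.urlopen(
        request, context=build_relaxed_context()
    ) as response:
        return response.read()


context = build_relaxed_context()
# Certificate and hostname verification stay enabled.
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

Relaxing the cipher list weakens security, so a change like this would be better scoped to the IRI host alone than applied globally.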

caldwellst commented 10 months ago

Note that I was able to successfully run this on 3.11.2 recently, just when I was blindly running some code for a review: exploration/iri.md.

t-downing commented 10 months ago

> Note that I was able to successfully run this on 3.11.2 recently, just when I was blindly running some code for a review: exploration/iri.md.

Did it actually download though? Because all the data would've already been there from when I ran it. When actually downloading with clobber=True, it didn't work for me with 3.11.4 so I just made a separate 3.9 environment just for downloading.
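
Until a fix lands, a script could at least warn before attempting a fresh download on an affected interpreter. A minimal sketch, assuming the 3.10 cutoff reported above holds; `download_likely_to_fail` and `warn_if_affected` are hypothetical helpers, not part of ochanticipy:

```python
import sys
import warnings


def download_likely_to_fail(version=None) -> bool:
    """True for interpreters at or above the 3.10 cutoff reported above."""
    major_minor = tuple((version or sys.version_info)[:2])
    return major_minor >= (3, 10)


def warn_if_affected() -> None:
    """Emit a warning when running on a Python version reported to fail."""
    if download_likely_to_fail():
        warnings.warn(
            "IRI downloads have been reported to fail with an SSL "
            "handshake error on Python >= 3.10; consider using a "
            "separate Python 3.9 environment for downloading."
        )


print(download_likely_to_fail((3, 9, 18)), download_likely_to_fail((3, 11, 4)))
```

Calling `warn_if_affected()` just before a download with clobber=True would surface the problem early instead of through the long traceback.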

caldwellst commented 10 months ago

Ah yeah, that's probably what happened, forgot clobber = False.

t-downing commented 10 months ago

Note that this also happens with CHIRPS, which makes sense because that data also comes from IRI.