mrocklin / dask-tutorial

BSD 3-Clause "New" or "Revised" License

problem accessing data from binder #5

Closed by dchudz 1 year ago

dchudz commented 1 year ago

This is when we're not using the Coiled cluster:

# Read in one year of NYC Taxi data

import dask.dataframe as dd

df = dd.read_parquet(
    "s3://coiled-datasets/dask-book/nyc-tlc/2009"
)
df.head()

Gives us:

...
File /srv/conda/envs/notebook/lib/python3.10/site-packages/aiobotocore/httpsession.py:238, in AIOHTTPSession.send(self, request)
    236 except ServerTimeoutError as e:
    237     if str(e).lower().startswith('connect'):
--> 238         raise ConnectTimeoutError(endpoint_url=request.url, error=e)
    239     else:
    240         raise ReadTimeoutError(endpoint_url=request.url, error=e)

ConnectTimeoutError: Connect timeout on endpoint URL: "http://169.254.169.254/latest/api/token"

ntabris commented 1 year ago

Use this instead:

# Read in one year of NYC Taxi data

import dask.dataframe as dd

df = dd.read_parquet(
    "s3://coiled-datasets/dask-book/nyc-tlc/2009",
    storage_options={"anon": True},
)
df.head()

I've confirmed this works on Binder.
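
For context on why this helps: 169.254.169.254 is the EC2 instance metadata address, which the AWS client queries while looking for credentials. Outside AWS (e.g. on Binder) that lookup presumably just times out. Passing `anon` makes s3fs send unsigned requests, so the credential lookup is skipped. Below is a minimal sketch of wrapping this up so the notebook doesn't have to repeat the option; the helper names `public_s3_options` and `read_public_parquet` are hypothetical and not part of dask or this tutorial.

```python
def public_s3_options(path: str) -> dict:
    """Return storage options for reading a public dataset without credentials.

    Hypothetical helper: only s3:// paths get the anonymous flag.
    """
    if path.startswith("s3://"):
        # "anon": True asks s3fs for unsigned requests, so no credential
        # lookup (and no call to the instance metadata endpoint) is made.
        return {"anon": True}
    # Local paths and other protocols need no extra options here.
    return {}


def read_public_parquet(path: str):
    """Read a Parquet dataset, using anonymous access for s3:// paths."""
    # Imported lazily so the options helper is usable without dask installed.
    import dask.dataframe as dd

    return dd.read_parquet(path, storage_options=public_s3_options(path))
```

Usage would look like `df = read_public_parquet("s3://coiled-datasets/dask-book/nyc-tlc/2009")` followed by `df.head()`. This only works for buckets that allow public reads; private data still needs real credentials.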

mrocklin commented 1 year ago

Fixed, I think.