crypto-lake / lake-api

Python API for accessing Lake high frequency tick trades & order book data
https://crypto-lake.com/
Apache License 2.0

Free API layer not working as defined in docs #7

Closed: xmariachi closed this issue 10 months ago

xmariachi commented 10 months ago

Description

Loading data through the free data API with anonymous_access=True fails with the following error:

botocore.exceptions.UnauthorizedSSOTokenError: The SSO session associated with this profile has expired or is otherwise invalid. To refresh this SSO session run aws sso login with the corresponding profile.

What I Did

Running this code as suggested in the docs:

import lakeapi
import datetime

def save_currency_rates_crypto_lake(symbols=["BTC-USDT"]):
    lakeapi.use_sample_data(anonymous_access=True)

    df = lakeapi.load_data(
        table="book",
        start=datetime.datetime(2022, 10, 1),
        end=datetime.datetime(2022, 10, 2),
        symbols=symbols,
        exchanges=["BINANCE"],
    )
    print(df)

if __name__ == '__main__':
    save_currency_rates_crypto_lake()

Output:

$ python3 dags/utils/crypto_lake.py 
Traceback (most recent call last):
  File "/home/user/.local/lib/python3.8/site-packages/botocore/credentials.py", line 2137, in _get_credentials
    response = client.get_role_credentials(**kwargs)
  File "/home/user/.local/lib/python3.8/site-packages/botocore/client.py", line 535, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/user/.local/lib/python3.8/site-packages/botocache/botocache.py", line 54, in _make_api_call
    return super()._make_api_call(operation_name, api_params)
  File "/home/user/.local/lib/python3.8/site-packages/botocore/client.py", line 983, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.UnauthorizedException: An error occurred (UnauthorizedException) when calling the GetRoleCredentials operation: Session token not found or invalid

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "dags/utils/crypto_lake.py", line 18, in <module>
    save_currency_rates_crypto_lake()
  File "dags/utils/crypto_lake.py", line 7, in save_currency_rates_crypto_lake
    df = lakeapi.load_data(
  File "/home/user/.local/lib/python3.8/site-packages/lakeapi/main.py", line 161, in load_data
    df = lakeapi._read_parquet.read_parquet(
  File "/home/user/.local/lib/python3.8/site-packages/lakeapi/_read_parquet.py", line 611, in read_parquet
    dfs=_read_dfs_from_multiple_paths(
  File "/home/user/.local/lib/python3.8/site-packages/lakeapi/_read.py", line 145, in _read_dfs_from_multiple_paths
    kwargs["boto3_session"] = boto3_to_primitives(kwargs["boto3_session"])
  File "/home/user/.local/lib/python3.8/site-packages/lakeapi/_utils.py", line 44, in boto3_to_primitives
    "aws_access_key_id": getattr(credentials, "access_key", None),
  File "/home/user/.local/lib/python3.8/site-packages/botocore/credentials.py", line 406, in access_key
    self._refresh()
  File "/home/user/.local/lib/python3.8/site-packages/botocore/credentials.py", line 499, in _refresh
    self._protected_refresh(is_mandatory=is_mandatory_refresh)
  File "/home/user/.local/lib/python3.8/site-packages/botocore/credentials.py", line 515, in _protected_refresh
    metadata = self._refresh_using()
  File "/home/user/.local/lib/python3.8/site-packages/botocore/credentials.py", line 662, in fetch_credentials
    return self._get_cached_credentials()
  File "/home/user/.local/lib/python3.8/site-packages/botocore/credentials.py", line 672, in _get_cached_credentials
    response = self._get_credentials()
  File "/home/user/.local/lib/python3.8/site-packages/botocore/credentials.py", line 2139, in _get_credentials
    raise UnauthorizedSSOTokenError()
botocore.exceptions.UnauthorizedSSOTokenError: The SSO session associated with this profile has expired or is otherwise invalid. To refresh this SSO session run aws sso login with the corresponding profile.
leftys commented 10 months ago

Hi @xmariachi

It seems your local environment is somehow corrupted. Have you used any AWS service API, awscli, boto3, or botocore? Can you check that your ~/.aws directory and AWS_-prefixed environment variables are empty?

If you are an AWS user and have your user configured in ~/.aws or environment vars, you might try using the following:

lakeapi.set_default_bucket('sample.crypto.lake')
lakeapi.is_anonymous_access = True

instead of the use_sample_data(True) call. Alternatively, use_sample_data(False) might help, since it does the same as my two lines above.

Let me know if that helped!
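
The check suggested above (an empty ~/.aws and no AWS_-prefixed variables) can be sketched with the standard library alone. The helper names aws_env_vars and sso_profiles are made up for illustration and are not part of lakeapi or boto3. The sketch only reports what it finds and changes nothing.

```python
import configparser
import os


def aws_env_vars(env=None):
    """Return the sorted names of AWS_-prefixed variables in the environment."""
    env = os.environ if env is None else env
    return sorted(name for name in env if name.startswith("AWS_"))


def sso_profiles(config_text):
    """Return the sections of an AWS config file that contain sso_* keys."""
    # interpolation=None so that '%' characters in values are left alone
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    return [
        section
        for section in parser.sections()
        if any(key.startswith("sso_") for key in parser[section])
    ]


if __name__ == "__main__":
    print("AWS_ env vars set:", aws_env_vars())
    config_path = os.path.expanduser("~/.aws/config")
    if os.path.exists(config_path):
        with open(config_path) as handle:
            print("Sections using SSO:", sso_profiles(handle.read()))
    else:
        print("No ~/.aws/config found")
```

If a section such as default shows up in the SSO list, that would be consistent with the UnauthorizedSSOTokenError in the traceback, although it does not prove that this is the cause.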

xmariachi commented 10 months ago

Thanks! Code is as you can see above. Running it with python3 directly.

--

Reconfiguring to another profile seems to have worked, so I can now see the free data.

--

Now I'm trying the paid credentials with the non-sample data, and it does not work.

I put the credentials in the ~/.aws/credentials file under a named profile:

# [default]
[crypto_lake]
aws_access_key_id = xxx
aws_secret_access_key = xxx

.aws/config:

output = json
region = us-west-1

Run aws configure --profile crypto_lake to set this up.
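
One possible reading of the setup above, offered as an assumption rather than a confirmed diagnosis: when no profile is passed explicitly, boto3 uses the profile named in AWS_PROFILE, or default if that variable is unset. Keys stored only under [crypto_lake] would therefore not be used unless that profile is selected. The sketch below checks this with the standard library. The function names effective_profile and has_static_keys are hypothetical.

```python
import configparser


def effective_profile(env):
    """Profile used when none is passed explicitly: AWS_PROFILE, else 'default'."""
    return env.get("AWS_PROFILE") or "default"


def has_static_keys(credentials_text, profile):
    """True if the credentials file text defines both access keys for `profile`."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(credentials_text)
    return (
        parser.has_section(profile)
        and parser.has_option(profile, "aws_access_key_id")
        and parser.has_option(profile, "aws_secret_access_key")
    )


# The credentials file as posted in this thread (values redacted as xxx).
CREDENTIALS = """\
# [default]
[crypto_lake]
aws_access_key_id = xxx
aws_secret_access_key = xxx
"""

if __name__ == "__main__":
    for env in ({}, {"AWS_PROFILE": "crypto_lake"}):
        profile = effective_profile(env)
        print(profile, "has static keys:", has_static_keys(CREDENTIALS, profile))
```

With this file, the keys are only found when AWS_PROFILE=crypto_lake is set (or the profile is selected some other way). Otherwise the default profile is consulted, and it has no static keys here.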

With the same code, after removing only the lakeapi.use_sample_data(False) line and deleting the .lake_cache directory, it still does not work:

Error encountered : The SSO session associated with this profile has expired or is otherwise invalid. To refresh this SSO session run aws sso login with the corresponding profile.. Retrying the same call without cached context.
Traceback (most recent call last):
  File "/home/user/.local/lib/python3.8/site-packages/botocore/credentials.py", line 2137, in _get_credentials
    response = client.get_role_credentials(**kwargs)
  File "/home/user/.local/lib/python3.8/site-packages/botocore/client.py", line 535, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/user/.local/lib/python3.8/site-packages/botocache/botocache.py", line 54, in _make_api_call
    return super()._make_api_call(operation_name, api_params)
  File "/home/user/.local/lib/python3.8/site-packages/botocore/client.py", line 983, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.UnauthorizedException: An error occurred (UnauthorizedException) when calling the GetRoleCredentials operation: Session token not found or invalid

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "dags/utils/crypto_lake.py", line 32, in <module>
    save_currency_rates_crypto_lake(symbols=["BTC-USDT"])
  File "dags/utils/crypto_lake.py", line 20, in save_currency_rates_crypto_lake
    df = lakeapi.load_data(
  File "/home/user/.local/lib/python3.8/site-packages/lakeapi/main.py", line 161, in load_data
    df = lakeapi._read_parquet.read_parquet(
  File "/home/user/.local/lib/python3.8/site-packages/lakeapi/_read_parquet.py", line 557, in read_parquet
    paths: List[str] = _path2list(
  File "/home/user/.local/lib/python3.8/site-packages/lakeapi/_describe.py", line 30, in _path2list
    paths: List[str] = list_objects(  # type: ignore
  File "/home/user/.local/lib/python3.8/site-packages/lakeapi/_describe.py", line 140, in list_objects
    return [path for paths in result_iterator for path in paths]
  File "/home/user/.local/lib/python3.8/site-packages/lakeapi/_describe.py", line 140, in <listcomp>
    return [path for paths in result_iterator for path in paths]
  File "/home/user/.local/lib/python3.8/site-packages/lakeapi/_describe.py", line 180, in _list_objects
    for page in response_iterator:  # pylint: disable=too-many-nested-blocks
  File "/home/user/.local/lib/python3.8/site-packages/botocore/paginate.py", line 269, in __iter__
    response = self._make_request(current_kwargs)
  File "/home/user/.local/lib/python3.8/site-packages/botocore/paginate.py", line 357, in _make_request
    return self._method(**current_kwargs)
  File "/home/user/.local/lib/python3.8/site-packages/botocore/client.py", line 535, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/user/.local/lib/python3.8/site-packages/botocache/botocache.py", line 54, in _make_api_call
    return super()._make_api_call(operation_name, api_params)
  File "/home/user/.local/lib/python3.8/site-packages/botocore/client.py", line 963, in _make_api_call
    http, parsed_response = self._make_request(
  File "/home/user/.local/lib/python3.8/site-packages/botocore/client.py", line 989, in _make_request
    return self._endpoint.make_request(operation_model, request_dict)
  File "/home/user/.local/lib/python3.8/site-packages/botocore/endpoint.py", line 119, in make_request
    return self._send_request(request_dict, operation_model)
  File "/home/user/.local/lib/python3.8/site-packages/botocore/endpoint.py", line 198, in _send_request
    request = self.create_request(request_dict, operation_model)
  File "/home/user/.local/lib/python3.8/site-packages/botocore/endpoint.py", line 134, in create_request
    self._event_emitter.emit(
  File "/home/user/.local/lib/python3.8/site-packages/botocore/hooks.py", line 412, in emit
    return self._emitter.emit(aliased_event_name, **kwargs)
  File "/home/user/.local/lib/python3.8/site-packages/botocore/hooks.py", line 256, in emit
    return self._emit(event_name, kwargs)
  File "/home/user/.local/lib/python3.8/site-packages/botocore/hooks.py", line 239, in _emit
    response = handler(**kwargs)
  File "/home/user/.local/lib/python3.8/site-packages/botocore/signers.py", line 105, in handler
    return self.sign(operation_name, request)
  File "/home/user/.local/lib/python3.8/site-packages/botocore/signers.py", line 180, in sign
    auth = self.get_auth_instance(**kwargs)
  File "/home/user/.local/lib/python3.8/site-packages/botocore/signers.py", line 284, in get_auth_instance
    frozen_credentials = self._credentials.get_frozen_credentials()
  File "/home/user/.local/lib/python3.8/site-packages/botocore/credentials.py", line 611, in get_frozen_credentials
    self._refresh()
  File "/home/user/.local/lib/python3.8/site-packages/botocore/credentials.py", line 499, in _refresh
    self._protected_refresh(is_mandatory=is_mandatory_refresh)
  File "/home/user/.local/lib/python3.8/site-packages/botocore/credentials.py", line 515, in _protected_refresh
    metadata = self._refresh_using()
  File "/home/user/.local/lib/python3.8/site-packages/botocore/credentials.py", line 662, in fetch_credentials
    return self._get_cached_credentials()
  File "/home/user/.local/lib/python3.8/site-packages/botocore/credentials.py", line 672, in _get_cached_credentials
    response = self._get_credentials()
  File "/home/user/.local/lib/python3.8/site-packages/botocore/credentials.py", line 2139, in _get_credentials
    raise UnauthorizedSSOTokenError()
botocore.exceptions.UnauthorizedSSOTokenError: The SSO session associated with this profile has expired or is otherwise invalid. To refresh this SSO session run aws sso login with the corresponding profile.

Using the

lakeapi.set_default_bucket('sample.crypto.lake')
lakeapi.is_anonymous_access = False

does not seem to make a difference.

--

Is there a way to configure Crypto Lake access from lakeapi itself? Something like lakeapi.configure(...), where you can pass in the required variables and get a more explanatory config error if there is one. I've tried setting the API keys as env vars, which should override the credentials file in regular boto3, but it still does not work.

My intended usage is to run this as a scheduled task, with the env vars supplied in the Task Definition.

xmariachi commented 10 months ago

Update: I'm able to configure the boto S3 client manually, using the paid credentials to get sample data, but I can't get regular data with it. What is the name of the regular, non-sample bucket to try with this boto client? That would do for me, although it is obviously more desirable to use the regular load_data API.

xmariachi commented 10 months ago

OK, I managed to use them, and I also see you can pass in an optional boto Session as a param. Closing.

leftys commented 10 months ago

Yes, passing the session should work without any other workarounds for paid customers with multiple AWS accounts. Glad you got it working.

Free data access for people with multiple AWS accounts is trickier. I will try to get this working out of the box in the future too.
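
The session-passing approach mentioned above can be sketched as follows, assuming the keys arrive through the standard AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables of a scheduled task. Building the session from explicit keys keeps any SSO-based default profile in ~/.aws out of the picture. The keyword name boto3_session is inferred from the traceback earlier in this thread (kwargs["boto3_session"]), so check it against the installed lakeapi version. The helpers session_kwargs_from_env and load_book_data are hypothetical.

```python
import datetime
import os


def session_kwargs_from_env(env):
    """Build boto3.Session keyword arguments from explicit credentials in `env`.

    Raises ValueError naming the missing variables, so that a misconfigured
    task fails with a clear message instead of an SSO error.
    """
    required = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError("Missing credentials: " + ", ".join(missing))
    kwargs = {
        "aws_access_key_id": env["AWS_ACCESS_KEY_ID"],
        "aws_secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
    }
    if env.get("AWS_DEFAULT_REGION"):
        kwargs["region_name"] = env["AWS_DEFAULT_REGION"]
    return kwargs


def load_book_data(symbols, env=None):
    """Load one day of order book data using an explicitly built session."""
    # Imported lazily so the helper above can be used without these packages.
    import boto3
    import lakeapi

    env = os.environ if env is None else env
    session = boto3.Session(**session_kwargs_from_env(env))
    return lakeapi.load_data(
        table="book",
        start=datetime.datetime(2022, 10, 1),
        end=datetime.datetime(2022, 10, 2),
        symbols=symbols,
        exchanges=["BINANCE"],
        boto3_session=session,  # assumed keyword name, see the note above
    )


if __name__ == "__main__":
    print(load_book_data(["BTC-USDT"]))
```

Passing the keys to boto3.Session directly, rather than relying on profile lookup, is a design choice for the scheduled-task case. A named profile could be selected instead with boto3.Session(profile_name="crypto_lake").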