aws-solutions / aws-data-lake-solution

A deployable reference implementation that addresses pain points around conceptualizing data lake architectures. It automatically configures the core AWS services necessary to easily tag, search, share, and govern specific subsets of data across a business or with other external businesses.
https://aws.amazon.com/solutions/implementations/data-lake-solution/
Apache License 2.0

API calls suddenly unauthorized. #39

Closed stevejonpeters closed 4 years ago

stevejonpeters commented 5 years ago
  1. Deployed Data Lake 2.1 with myself as Data Lake Admin.
  2. Generated an API access key for myself using the Web Console.
  3. Generated the API secret key for myself using the Console.
  4. Deployed a custom Python Lambda function that makes a REST call, using the requests package, to the Data Lake API Gateway with the above keys (yes, I calculate the AWS Version 4 signature).

This has worked in the past. The access key shown in the Web Console matches my access key in the DynamoDB table 'data-lake-keys'. The secret key generated in the Web Console does NOT match the secret key listed on my Cognito user. I did see a similar post on non-ADFS deployments that says there is something missing between the API and the authorizer function, but that was back in 2018.

stevejonpeters commented 5 years ago

Here is the issue from 2017: https://forums.aws.amazon.com/thread.jspa?messageID=767716

stevejonpeters commented 5 years ago

Is this a CORS issue? The datalakeweb bucket does not have CORS enabled.

ericquinones commented 5 years ago

We apologize for the inconvenience. We are currently researching the issue on our end. Please continue to monitor the repository for updates.

ericquinones commented 5 years ago

Hi @stevejonpeters - Below is an example of constructing an API call to v2.1.0 of Data Lake that was tested on the Python 3.7 Lambda runtime using requests v2.22.0. The API returns a 200 status code with a JSON payload as the response body.

The commented lines in getSignatureKey are the JavaScript versions of the code from our API documentation.

The access_key and secret_key needed to generate the signing key will both come from the Data Lake console (details here) so there should be no need to compare with Cognito.

Can you please compare the below with how you are constructing the signing key and the API request?

import base64
import datetime
import hashlib
import hmac
import os

import requests

def getSignatureKey(accessKey, secretKey, dateStamp, apiEndpoint):
    # let kDate = crypto.createHmac('sha256', "DATALAKE4" + secretKey);
    # kDate.update(moment().utc().format("YYYYMMDD"));
    kDate = hmac.new(f"DATALAKE4{secretKey}".encode('utf-8'), dateStamp.encode('utf-8'), hashlib.sha256)

    # let kEndpoint = crypto.createHmac('sha256', kDate.digest('base64'));
    # kEndpoint.update(apiEndpoint);
    kEndpoint = hmac.new(base64.b64encode(kDate.digest()), apiEndpoint.encode('utf-8'), hashlib.sha256)

    # let kService = crypto.createHmac('sha256', kEndpoint.digest('base64'));
    # kService.update('datalake');
    kService = hmac.new(base64.b64encode(kEndpoint.digest()), "datalake".encode('utf-8'), hashlib.sha256)

    # let kSigning = crypto.createHmac('sha256', kService.digest('base64'));
    # kSigning.update("datalake4_request");
    kSigning = hmac.new(base64.b64encode(kService.digest()), "datalake4_request".encode('utf-8'), hashlib.sha256)

    # let _signature = kSigning.digest('base64');
    _signature = base64.b64encode(kSigning.digest()).decode('utf-8')

    # let _apiKey = [accessKey, _signature].join(':');
    # let _authKey = Base64.encode(_apiKey);
    _apiKey = ":".join([accessKey, _signature])
    _authKey = base64.b64encode(_apiKey.encode('utf-8')).decode('utf-8')

    # return ['ak', _authKey].join(':');
    return ":".join(["ak", _authKey])

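# Both keys come from the Data Lake console; fail fast if a variable is missing
# instead of silently signing with the string "None".
access_key = os.environ['KEY']
secret_key = os.environ['SECRET']
api_endpoint = os.environ['API_ENDPOINT']  # host name only, without the https:// scheme

# The date stamp is the current UTC date in YYYYMMDD form.
sign_key = getSignatureKey(access_key, secret_key, datetime.datetime.now(datetime.timezone.utc).strftime('%Y%m%d'), api_endpoint)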

headers = {'Auth':sign_key}

request_url = f"https://{api_endpoint}/prod/search?term=test"

r = requests.get(request_url, headers=headers)

print(f"Response code: {r.status_code}")
print(r.text)
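
When a call comes back unauthorized, it can help to inspect the Auth header offline before sending it. The following is a minimal sketch, assuming the header format produced by getSignatureKey above ("ak:" followed by the base64 of accessKey:signature). The helper names build_auth_header and split_auth_header are hypothetical and not part of the Data Lake solution, and the values in the usage section are placeholders.

```python
import base64
import hashlib
import hmac


def build_auth_header(access_key, secret_key, date_stamp, api_endpoint):
    """Hypothetical helper: the getSignatureKey chain written as a loop."""
    # The first HMAC key is the literal prefix "DATALAKE4" plus the secret key.
    key = f"DATALAKE4{secret_key}".encode("utf-8")
    # Each step signs one message; the next key is the base64 text of that digest.
    for message in (date_stamp, api_endpoint, "datalake", "datalake4_request"):
        digest = hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()
        key = base64.b64encode(digest)
    signature = key.decode("utf-8")
    token = base64.b64encode(f"{access_key}:{signature}".encode("utf-8"))
    return "ak:" + token.decode("utf-8")


def split_auth_header(header):
    """Hypothetical helper: undo the outer encoding of an Auth header value."""
    prefix, token = header.split(":", 1)
    decoded = base64.b64decode(token).decode("utf-8")
    # The signature is base64 text and never contains ":", so split from the right.
    access_key, signature = decoded.rsplit(":", 1)
    return prefix, access_key, signature


if __name__ == "__main__":
    # Placeholder values only; substitute the real key pair and endpoint.
    header = build_auth_header(
        "EXAMPLE_ACCESS_KEY", "EXAMPLE_SECRET_KEY", "20200101", "example.invalid"
    )
    print(split_auth_header(header))
```

Decoding the header this way lets you confirm that the embedded access key is the one stored in 'data-lake-keys'. Because the date stamp and the endpoint are both inputs to the signature, a non-UTC date or an endpoint string that differs from the one used for signing would produce a different header.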

knihit commented 4 years ago

Hi @stevejonpeters, are you still facing this issue? Did the example provided by Eric help you find a solution? If the problem is resolved, can you please close the issue?

georgebearden commented 4 years ago

Closing this issue, but please let us know if you need additional assistance.