This is an unofficial Python SDK for Athena Federation.
The Python SDK makes it easy to create new Amazon Athena Data Source Connectors using Python. It is under active development so the API may change from version to version.
You can see an example implementation that queries Google Sheets using Athena.
Install the build module and the SDK dependencies.
pip install build
pip install -r requirements.txt
python -m build
This will create a file in dist/: dist/unoffical_athena_federation_sdk-0.0.0-py3-none-any.whl
Copy that file to your example repo and include it in your requirements.txt like so:
unoffical-athena-federation-sdk @ file:///unoffical_athena_federation_sdk-0.0.0-py3-none-any.whl
You can test your Lambda function locally using Lambda Docker images.
First, build our Docker image and run it.
docker build -t local/athena-python-example .
docker run --rm -p 9000:8080 local/athena-python-example
Then, we can execute a sample PingRequest.
curl -XPOST "http://localhost:9000/2015-03-31/functions/function/invocations" -d '{"@type": "PingRequest", "identity": {"id": "UNKNOWN", "principal": "UNKNOWN", "account": "123456789012", "arn": "arn:aws:iam::123456789012:root", "tags": {}, "groups": []}, "catalogName": "athena_python_sdk", "queryId": "1681559a-548b-4771-874c-2aa2ea7c39ab"}'
{"@type": "PingResponse", "catalogName": "athena_python_sdk", "queryId": "1681559a-548b-4771-874c-2aa2ea7c39ab", "sourceType": "athena_python_sdk", "capabilities": 23}
We can also list schemas.
curl -XPOST "http://localhost:9000/2015-03-31/functions/function/invocations" -d '{"@type": "ListSchemasRequest", "identity": {"id": "UNKNOWN", "principal": "UNKNOWN", "account": "123456789012", "arn": "arn:aws:iam::123456789012:root", "tags": {}, "groups": []}, "catalogName": "athena_python_sdk", "queryId": "1681559a-548b-4771-874c-2aa2ea7c39ab"}'
{"@type": "ListSchemasResponse", "catalogName": "athena_python_sdk", "schemas": ["sampledb"], "requestType": "LIST_SCHEMAS"}
💁 Please note these are manual instructions until a serverless application can be built.
export SPILL_BUCKET=<BUCKET_NAME>
export AWS_ACCOUNT_ID=123456789012
export AWS_REGION=us-east-1
export IMAGE_TAG=v0.0.1
aws s3 mb s3://${SPILL_BUCKET}
aws ecr create-repository --repository-name athena_example --image-scanning-configuration scanOnPush=true
docker tag local/athena-python-example ${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/athena_example:${IMAGE_TAG}
aws ecr get-login-password | docker login --username AWS --password-stdin ${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com
docker push ${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/athena_example:${IMAGE_TAG}
Note the Arn of the role that's returned.
aws iam create-role \
--role-name athena-example-execution-role \
--assume-role-policy-document '{"Version": "2012-10-17","Statement": [{ "Effect": "Allow", "Principal": {"Service": "lambda.amazonaws.com"}, "Action": "sts:AssumeRole"}]}'
aws iam attach-role-policy \
--role-name athena-example-execution-role \
--policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
aws iam create-policy --policy-name athena-example-s3-access --policy-document '{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": ["s3:ListBucket"],
"Resource": ["arn:aws:s3:::'${SPILL_BUCKET}'"]
},
{
"Effect": "Allow",
"Action": [
"s3:PutObject",
"s3:GetObject",
"s3:DeleteObject"
],
"Resource": ["arn:aws:s3:::'${SPILL_BUCKET}'/*"]
}
]
}'
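The inline policy above relies on breaking out of single quotes to splice in ${SPILL_BUCKET}, which is easy to get wrong. As an alternative, the same document can be generated in Python. This is only a sketch: spill_bucket_policy is a hypothetical helper name and example-spill-bucket is a placeholder bucket name.

```python
import json


def spill_bucket_policy(bucket_name):
    """Build the S3 access policy for the spill bucket (hypothetical helper)."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                # Object-level actions apply to keys inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
                "Resource": [f"{bucket_arn}/*"],
            },
        ],
    }


# Placeholder bucket name; substitute the value of SPILL_BUCKET.
print(json.dumps(spill_bucket_policy("example-spill-bucket"), indent=2))
```

The printed JSON can be saved to a file such as policy.json and passed to aws iam create-policy with --policy-document file://policy.json.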
aws iam attach-role-policy \
--role-name athena-example-execution-role \
--policy-arn arn:aws:iam::${AWS_ACCOUNT_ID}:policy/athena-example-s3-access
aws lambda create-function \
--function-name athena-python-example \
--role arn:aws:iam::${AWS_ACCOUNT_ID}:role/athena-example-execution-role \
--code ImageUri=${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/athena_example:${IMAGE_TAG} \
--environment 'Variables={TARGET_BUCKET=<BUCKET_NAME>}' \
--description "Example Python implementation for Athena Federated Queries" \
--timeout 60 \
--package-type Image
Connect
If you update the Lambda function, re-run the build and push steps (updating the IMAGE_TAG variable) and then update the Lambda function:
aws lambda update-function-code \
--function-name athena-python-example \
--image-uri ${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/athena_example:${IMAGE_TAG}