exasol/sagemaker-extension
An Exasol extension to interact with AWS SageMaker from inside the database
MIT License
Add Lua Function for a Lua Script to export a query to S3 as CSV files
#1 (Closed)
tkilias closed this 3 years ago
tkilias commented 3 years ago
Background
We want to train models with SageMaker
SageMaker expects the data in an S3 bucket
Exasol can export the data to S3 as CSV files via the EXPORT command
We need a Lua Function for a Lua Script which exports a SQL query to S3
https://docs.exasol.com/7.1/database_concepts/scripting.htm
Lua Scripts support the Exasol extension called pquery, which lets them run queries in the same transaction in which the script was called
https://docs.exasol.com/7.1/database_concepts/scripting/db_interaction.htm
Tips
This project will be a mixed Lua/Python project. However, I recommend setting it up as a Python project with Poetry as the build tool and dependency manager. For the Lua parts, I recommend following the blog articles in our community
https://community.exasol.com/t5/database-features/exasol-loves-lua-part-1-how-to-use-eclipse-ide-support-for/ta-p/752
There should also be a Lua plugin for PyCharm, if you prefer that
Write the Lua function so that pquery is injected into it, which allows local development and testing with a mock
See here for how to bundle Lua modules
https://community.exasol.com/t5/database-features/exasol-loves-lua-part-3-handling-modules/ta-p/2134
For integration tests you need to use
https://github.com/exasol/terraform-aws-exasol-test-setup
and for continuous integration you need
https://github.com/exasol/ci-isolation-aws
Use pytest with pyexasol for integration tests
Check here to see how you need to export the data for SageMaker Autopilot
https://github.com/exasol/data-science-examples/blob/e85ce663a0474c60bd8d8700c9ae60c05c38f1e3/tutorials/machine-learning/python/sagemaker/sagemaker_autopilot.ipynb
Acceptance Criteria
We have a Lua function which exports a query in parallel as CSV files
We have unit tests with a mock for pquery
We have integration tests
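To make the injection idea from the tips concrete, here is a minimal sketch of the logic in Python. It is not the Lua implementation asked for above, only an illustration of the pattern: the query runner is passed in as an argument, so a unit test can substitute a mock for pquery. The names `build_export_statement`, `export_query_to_s3`, `run_query`, the connection name `S3_CONNECTION`, and the `part_NNN.csv` file naming are all hypothetical. The sketch assumes that EXPORT accepts a named connection after `AT` and that listing several `FILE` clauses produces several files written in parallel; both should be checked against the Exasol EXPORT documentation. Values are interpolated without escaping, which a real implementation would need to handle.

```python
from typing import Callable, List


def build_export_statement(query: str, connection_name: str,
                           s3_prefix: str, file_count: int) -> str:
    """Build an EXPORT statement with one FILE clause per output file.

    Assumption: one FILE clause per target file is how the export is
    split into several CSV files.
    """
    if file_count < 1:
        raise ValueError("file_count must be at least 1")
    file_clauses = "\n".join(
        f"FILE '{s3_prefix}/part_{i:03d}.csv'"
        for i in range(1, file_count + 1)
    )
    return f"EXPORT ({query})\nINTO CSV AT {connection_name}\n{file_clauses}"


def export_query_to_s3(run_query: Callable[[str], object], query: str,
                       connection_name: str, s3_prefix: str,
                       file_count: int) -> str:
    """Run the export through an injected query runner.

    In the database, run_query would be a wrapper around pquery;
    in a unit test it can be any callable that records its input.
    """
    statement = build_export_statement(
        query, connection_name, s3_prefix, file_count
    )
    run_query(statement)
    return statement


# Unit-test style usage with a mock: a plain list records what would
# have been sent to the database, so no database is needed.
executed: List[str] = []
export_query_to_s3(executed.append, "SELECT * FROM IRIS",
                   "S3_CONNECTION", "train", 2)
assert len(executed) == 1
assert executed[0].count("FILE '") == 2
```

The same split carries over to Lua: a pure function that builds the statement text is easy to unit test, while the part that calls pquery stays thin enough to cover with the integration tests.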