RetailMeNotSandbox / dart

Self-service data workflow management
MIT License

Create an AWS Elasticsearch engine #111

Closed geota closed 7 years ago

geota commented 7 years ago

Create a basic Elasticsearch engine that exposes a subset of elasticsearch-py commands. The original use case is to let Dart users create a data check against Elasticsearch whose result can influence workflow direction (e.g. reload data when a data check fails, or notify affected stakeholders). The engine supports the following action types:

        ElasticsearchActionTypes.data_check - execute an Elasticsearch query that must return at least one document for the action to succeed
        ElasticsearchActionTypes.create_index - create an index
        ElasticsearchActionTypes.create_mapping - create a mapping for an existing index or indices
        ElasticsearchActionTypes.create_template - create a template
        ElasticsearchActionTypes.delete_index - delete an index
        ElasticsearchActionTypes.delete_template - delete a template
        ElasticsearchActionTypes.force_merge_index - force merge an index (replaces optimize_index after the ES 2.x series)
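To illustrate the `data_check` semantics described above, here is a minimal sketch, not the engine's actual implementation. The names `run_data_check` and `DataCheckFailed` are hypothetical. The `client` argument is assumed to be any object with an elasticsearch-py style `search(index=..., body=...)` method.

```python
class DataCheckFailed(Exception):
    """Raised when the data check query matches no documents."""


def run_data_check(client, index, query_body):
    """Succeed only if query_body matches at least one document in index.

    `client` is assumed to behave like an elasticsearch-py client, i.e. it
    exposes search(index=..., body=..., size=...) and returns the usual
    {"hits": {"hits": [...]}} response structure.
    """
    # size=1: we only need to know whether any document matches at all
    response = client.search(index=index, body=query_body, size=1)
    if not response["hits"]["hits"]:
        raise DataCheckFailed("query returned no documents for index %r" % index)
    return True
```

A workflow could treat `DataCheckFailed` as the failure signal that triggers a reload or a notification. With a real cluster, `client` would be an `elasticsearch.Elasticsearch` instance.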

Since elasticsearch-py is pinned to individual major Elasticsearch versions, this engine will only support Elasticsearch versions >=5.0.0 and <6.0.0. We could create a separate ES engine per ES major version and namespace the engines accordingly, but that work is left as a future improvement.
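The version constraint could be enforced with a guard along these lines. This is a sketch under assumptions: `parse_version` and `is_supported_es_version` are hypothetical helpers, and the version string is assumed to come from something like the `version.number` field of the cluster's info response.

```python
import re

# Supported range stated in this issue: >=5.0.0 and <6.0.0
SUPPORTED_MIN = (5, 0, 0)
SUPPORTED_MAX_EXCLUSIVE = (6, 0, 0)


def parse_version(version_number):
    """Parse the leading 'major.minor.patch' of a version string into a tuple."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version_number)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version_number)
    return tuple(int(part) for part in match.groups())


def is_supported_es_version(version_number):
    """Return True if the version falls inside the supported 5.x range."""
    return SUPPORTED_MIN <= parse_version(version_number) < SUPPORTED_MAX_EXCLUSIVE
```

An engine could call this once when the datastore is created and refuse to run actions against an unsupported cluster.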

Credentials can be supplied when the engine is instantiated (i.e. at datastore creation). If they are left blank, the engine uses the instance profile credentials to access the cluster.
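The credential fallback might look roughly like the following sketch. All names here (`Credentials`, `resolve_credentials`, `instance_profile_lookup`) are hypothetical and not taken from the Dart codebase. The instance profile lookup is passed in as a callable so the example stays self-contained. In practice it could wrap the AWS SDK's default credential chain, which is an assumption on my part.

```python
from typing import Callable, NamedTuple, Optional


class Credentials(NamedTuple):
    access_key: str
    secret_key: str
    source: str  # where the credentials came from, for logging


def resolve_credentials(
    access_key: Optional[str],
    secret_key: Optional[str],
    instance_profile_lookup: Callable[[], Credentials],
) -> Credentials:
    """Prefer credentials supplied at datastore creation, else fall back."""
    if access_key and secret_key:
        return Credentials(access_key, secret_key, "datastore")
    if access_key or secret_key:
        # Only one half was supplied: fail loudly rather than silently fall back
        raise ValueError("access_key and secret_key must be supplied together")
    # Both left blank: use the instance profile credentials
    return instance_profile_lookup()
```

Rejecting a half-filled pair is a design choice made for this sketch. The issue itself only describes the two cases of fully supplied and blank credentials.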