Allow users to run bulk ingest to load large volumes of data into a graph

The ingest should be carried out by lambdas which can submit spark-submit jobs to the Kubernetes cluster. These lambdas should initially be developed outside of Kai and referenced via their ARN. The admins of Kai need some way of adding ingest lambdas to the deployment. The easiest way I can think of to do this is with configuration. You could do it via REST, but that would require a new user pool etc.

The ingest objects should be stored in DynamoDB and should have roughly the following structure:

{
    "name": "My Ingest Job",
    "arn": "lambda arn",
    "arguments": {
        "inputFile": "text",
        "generatorJson": "json"
    }
}
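
As a rough sketch of how an object like this could be checked before it is written to DynamoDB, something along these lines might work. The function name `validate_ingest_definition` and the set of allowed argument types are assumptions for illustration; only "text" and "json" appear in the example above, and the real list of types is still to be decided.

```python
# Hypothetical helper: not part of Kai. The allowed types are an assumption
# based only on the two types used in the example object.
ALLOWED_ARGUMENT_TYPES = {"text", "json"}


def validate_ingest_definition(definition: dict) -> dict:
    """Return a cleaned copy of an ingest definition or raise ValueError."""
    # "name" and "arn" must both be present and be non-empty strings.
    for key in ("name", "arn"):
        value = definition.get(key)
        if not isinstance(value, str) or not value:
            raise ValueError(f"'{key}' must be a non-empty string")

    # "arguments" maps each argument name to the type a UI should render.
    arguments = definition.get("arguments", {})
    if not isinstance(arguments, dict):
        raise ValueError("'arguments' must map argument names to types")

    for arg_name, arg_type in arguments.items():
        if not isinstance(arg_type, str) or arg_type not in ALLOWED_ARGUMENT_TYPES:
            raise ValueError(f"argument '{arg_name}' has unsupported type {arg_type!r}")

    return {
        "name": definition["name"],
        "arn": definition["arn"],
        "arguments": dict(arguments),
    }
```

Validating at the point where the configuration is loaded would mean a malformed entry fails the deployment early rather than surfacing later as a broken form in the UI.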

A Kai user should be able to retrieve these objects (minus the ARN), and a UI should be able to use the arguments and their types to render a form that the user fills in to trigger a bulk ingest.
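
To illustrate the retrieval and form handling described above, here is a minimal sketch. The names `public_view` and `parse_submission` are hypothetical, and it assumes form values arrive as strings keyed by argument name, with "json" arguments sent as JSON text.

```python
import json


def public_view(definition: dict) -> dict:
    """Return the ingest object as a user should see it, without the ARN."""
    return {key: value for key, value in definition.items() if key != "arn"}


def parse_submission(definition: dict, form_values: dict) -> dict:
    """Turn submitted form values into arguments for the ingest lambda.

    Assumes every declared argument is required. "json" values are parsed,
    anything else is passed through as text.
    """
    parsed = {}
    for arg_name, arg_type in definition["arguments"].items():
        if arg_name not in form_values:
            raise ValueError(f"missing argument: {arg_name}")
        raw = form_values[arg_name]
        # json.loads raises json.JSONDecodeError, a subclass of ValueError,
        # when the submitted text is not valid JSON.
        parsed[arg_name] = json.loads(raw) if arg_type == "json" else raw
    return parsed
```

The ARN is looked up server-side from the stored object when the ingest is triggered, so it never needs to be sent to or returned from the client.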

gchq/Kai, issue #51