awslabs / aws-lambda-redshift-loader

Amazon Redshift Database Loader implemented in AWS Lambda

Better error logging in batches #50

Open endario opened 9 years ago

endario commented 9 years ago

This is more of a feature request than an issue report:

As someone in operations monitoring the Lambda-based ETL process, I'd like the failed batches that are stored in DynamoDB to include more diagnostic information.

We had some loading errors in a newly set up table, and the only information available in the batches was:

{
    "xxxx.redshift.amazonaws.com": {
        "error": {
            "code": "0A000",
            "file": "/home/awsrsqa/padb/src/pg/src/backend/commands/commands_copy.c",
            "length": 138,
            "line": "2168",
            "name": "error",
            "routine": "DoCopy",
            "severity": "ERROR"
        },
        "status": -1
    }
}

We did eventually manage to find the corresponding log stream and diagnose the problem further. It was caused by a wrong COPY command (which I suspect is a separate issue in its own right), and much more detailed error logs were in fact available there.

So I wonder: would it be possible to include more details in the failed batches, so that we can easily query them with just 'queryBatches.js'?

IanMeyers commented 9 years ago

This will require querying STL_LOAD_ERRORS after a failed load. I'm open to this, but it will be some time until I can build it. Happy to take a pull request that adds this.
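
For anyone considering that pull request, here is a rough sketch of what the approach could look like. It is not code from the loader: `buildLoadErrorQuery`, `attachLoadErrors` and the `loadErrors` attribute are hypothetical names. The query is returned as a `{ text, values }` object, on the assumption that it is run through a client that accepts parameterised queries in that shape. The selected columns are standard STL_LOAD_ERRORS columns. A COPY that is rejected before any data is read may leave no rows in that table, in which case the batch entry would simply gain an empty list.

```javascript
// Sketch only: the helper names and the "loadErrors" attribute are hypothetical.

// Build a parameterised query for rows logged in STL_LOAD_ERRORS
// at or after the time the failed COPY was started.
function buildLoadErrorQuery(sinceTimestamp, limit) {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new Error("limit must be a positive integer");
  }
  return {
    text:
      "SELECT starttime, TRIM(filename) AS filename, line_number, " +
      "TRIM(colname) AS colname, err_code, TRIM(err_reason) AS err_reason " +
      "FROM stl_load_errors WHERE starttime >= $1 " +
      "ORDER BY starttime DESC LIMIT " + limit,
    values: [sinceTimestamp],
  };
}

// Return a copy of the per-cluster status object (the value stored under the
// cluster endpoint in the batch) with the diagnostic rows attached.
function attachLoadErrors(clusterStatus, rows) {
  const loadErrors = (rows || []).map((row) => ({
    file: row.filename,
    line: row.line_number,
    column: row.colname,
    code: row.err_code,
    reason: row.err_reason,
  }));
  return Object.assign({}, clusterStatus, { loadErrors: loadErrors });
}
```

The merged object could then be written back to the batch entry in place of the current status, so that the extra detail shows up when batches are queried.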