ccao-data / service-spark-iasworld

Service for extracting tables from the CCAO system-of-record and uploading them to the Data Department's data warehouse
GNU Affero General Public License v3.0

Add logging, exception handling, and AWS client class #6

Closed · dfsnow closed this 2 months ago

dfsnow commented 2 months ago

This PR adds logging and error handling using the log4j logger pulled from the Spark session context. I chose to use this logger rather than standard Python logging because I want to capture the Spark output and intersperse it with Python log messages.
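As a rough sketch, pulling the log4j logger out of the Spark context's JVM gateway looks something like this (the `iasworld` logger name is a hypothetical stand-in):

```python
from pyspark.sql import SparkSession

# Reuse (or build) the Spark session, then reach through its JVM gateway
# to the log4j LogManager. The "iasworld" logger name is hypothetical.
spark = SparkSession.builder.getOrCreate()
log4j = spark.sparkContext._jvm.org.apache.log4j
logger = log4j.LogManager.getLogger("iasworld")

# Messages logged here land in the same stream as Spark's own output
logger.info("Starting table extraction")
```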

This PR also adds a new AWSClient class that can trigger Glue jobs and upload finished log files to CloudWatch. I refactored the GitHub session class to more closely match the AWS one.
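For context, here's a hedged sketch of what a class like that might look like on top of boto3. The method names, constructor arguments, and CloudWatch details below are my assumptions, not the PR's actual API:

```python
import time

import boto3


class AWSClient:
    """Sketch of a thin wrapper around the Glue and CloudWatch Logs APIs."""

    def __init__(self, log_group: str, log_stream: str) -> None:
        self.glue = boto3.client("glue")
        self.logs = boto3.client("logs")
        self.log_group = log_group
        self.log_stream = log_stream

    def trigger_glue_job(self, job_name: str) -> str:
        """Start a Glue job run and return its run ID."""
        response = self.glue.start_job_run(JobName=job_name)
        return response["JobRunId"]

    def upload_logs_to_cloudwatch(self, lines: list[str]) -> None:
        """Ship finished log lines to a CloudWatch log stream.

        Assumes the log group already exists and the stream is new.
        """
        self.logs.create_log_stream(
            logGroupName=self.log_group, logStreamName=self.log_stream
        )
        now_ms = int(time.time() * 1000)
        self.logs.put_log_events(
            logGroupName=self.log_group,
            logStreamName=self.log_stream,
            logEvents=[{"timestamp": now_ms, "message": m} for m in lines],
        )
```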

Here's an example log output in CloudWatch.

dfsnow commented 2 months ago

> One question I had related to error handling while I was reading this code: A few of the client methods are wrapped in try/except blocks that prevent them from raising an exception, but what happens if a code block that is not protected in this way causes the job to error out before it can upload its logs to CloudWatch? How will we get alerted that the job has failed? I wonder if it's worth wrapping main() in a big try/except block that logs the exception, attempts to ship logs to CloudWatch, and alerts us before raising the exception.
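A minimal sketch of that wrapper idea; `main`, `ship_logs_to_cloudwatch`, and `send_alert` below are hypothetical stand-ins for the job's real entrypoint and helpers:

```python
import logging

logger = logging.getLogger(__name__)


def main() -> None:
    """Stand-in for the job's existing entrypoint."""
    raise RuntimeError("simulated failure outside any try/except")


def ship_logs_to_cloudwatch() -> None:
    """Hypothetical helper that uploads the log file to CloudWatch."""


def send_alert(message: str) -> None:
    """Hypothetical helper that notifies the team."""


def run() -> None:
    try:
        main()
    except Exception:
        logger.exception("Job failed before logs were shipped")
        try:
            # Best-effort: ship whatever logs exist and alert the team
            ship_logs_to_cloudwatch()
            send_alert("spark-iasworld job failed")
        except Exception:
            logger.exception("Could not ship logs or send alert")
        raise  # re-raise so the job run itself still reports failure


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    run()
```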

So the way I'd set this up, it wasn't possible to use the Spark logger in the except block, since the logger wouldn't exist if the main loop failed.

In the process of refactoring, I discovered you totally can just use both the Spark AND Python loggers at the same time (see the sketch at the end of this comment). So, I switched over to generic Python logging in lieu of passing around the Spark logger. This gets us:

I think it's a much better design overall, but I'm curious to see what you think. Lots of changes here, so re-requesting review @jeancochrane!
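For reference, a sketch of what running the two loggers side by side might look like; the shared format string is an assumption, chosen so the Python and log4j output interleave cleanly in the driver logs:

```python
import logging

from pyspark.sql import SparkSession

# Configure standard Python logging with a layout that roughly matches
# Spark's default log4j output so the two streams read as one
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
py_logger = logging.getLogger(__name__)

spark = SparkSession.builder.getOrCreate()
# The "iasworld" logger name is hypothetical
jvm_logger = spark.sparkContext._jvm.org.apache.log4j.LogManager.getLogger(
    "iasworld"
)

py_logger.info("Logged via standard Python logging")
jvm_logger.info("Logged via Spark's log4j")
```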