snowplow / sql-runner

Run templatable playbooks of SQL scripts in series and parallel on Redshift, PostgreSQL, BigQuery and Snowflake
http://snowplowanalytics.com
Apache License 2.0
81 stars 15 forks source link

Add some try-catch like mechanism for SQL queries #162

Open colmsnowplow opened 4 years ago

colmsnowplow commented 4 years ago

One of the challenges to writing a set of general SQL models is that some of the data we'd like to have models for isn't always present. So we're aiming for a structure that allows one to enable certain things as 'add-ons', if the data is there.

For example, let's say we'd like to have the IAB enrichment available, but that's not a standard feature for everyone. If you don't have it switched on, the table doesn't exist. So any SQL that tries to query the table will fail the job, and SQL doesn't support IF logic to the level that solves this problem.

It feels like we could do some really awesome things if we could write some logic along the lines of:

Not sure how achievable it is, or if there's a better solution to that problem. Probably a 'nice to have', but could be pretty cool/powerful!

alexanderdean commented 4 years ago

You've reinvented PL/SQL, Oracle's imperative language:

https://docs.oracle.com/cd/B28359_01/appdev.111/b28843/tdddg_procedures.htm

Agree this would be a great thing to add, as I don't think any of our data warehouses support this kind of decision-tree capability...

alexanderdean commented 4 years ago

If {table doesn't exist failure}: Continue without attempting that module

I'd prefer to do that though with an explicit test for table existing rather than catching an exception. It's more deterministic.

colmsnowplow commented 4 years ago

I'd prefer to do that though with an explicit test for table existing rather than catching an exception. It's more deterministic.

Yeah for sure I agree.

Agree this would be a great thing to add, as I don't think any of our data warehouses support this kind of decision-tree capability...

So Redshift actually does support procedures: https://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_PROCEDURE.html

But I never drew the line between that and this kind of thing (I actually didn't realise that Redshift supported IF statements in procedures until now tbh!). I'll explore if this fits the bill for Redshift...

alexanderdean commented 4 years ago

Looks like BQ and Snowflake also support stored procs...

colmsnowplow commented 4 years ago

Interesting. The missing piece of the puzzle that I need to test then is whether this can handle non-existent resources (or checks for their existence).