analyst-collective / analytics

Open source data models and analysis.
Apache License 2.0
46 stars 11 forks source link

Interface investigation #5

Closed drewbanin closed 8 years ago

drewbanin commented 8 years ago

Investigating interfaces

I wrote a hacky python script as a proof of concept. There is much to expand upon and clean up, so try your best to ignore the code itself. Right now, the script doesn't actually create/drop schemas/views, but that's a one-line change.

The script sorts the queries by filename, so there's 1_base.sql, 2_filter.sql, etc to resolve dependencies. I think we should be able to actually generate and walk a DAG instead (as @cmerrick mentioned) when the time comes.

Interface hints are given as view comments:

COMMENT ON VIEW pardot_analysis.visitoractivity_transformed IS 'timeseries,funnel,cohort';

I think this needs some more thought. Namely, I suspect that visitoractivity_transformed should expose the interface event_stream, and event_stream should expose timeseries,funnel,cohort. But again, we might not be there yet...

Outstanding questions on my end:

Respond!

jthandy commented 8 years ago

Haven't looked at this yet, but wanted to respond to your specific questions first. Here's my take.

Looking through the code now.

jthandy commented 8 years ago

Ok, overall comments on this PR:

Please let me know if you want to discuss this today. We've spent a bunch of time talking about this, but I think we're still not getting detailed enough in determining what exactly next steps look like before you go to write code. I'm going to really focus on fixing this communication problem so that we can iterate more tightly.

drewbanin commented 8 years ago

@jthandy please take another look! cc @cmerrick

cmerrick commented 8 years ago

This looks like a step in the right direction. We need to figure out how to allow the source schema.table for a model to be set at runtime too so these can be reusable for other data sets. We can tackle that next.

cmerrick commented 8 years ago

I just ran into a problem that we're going to have to get around: dropping the schema drops all permissions granted to it, so they would need to be recreated every time. For now switching from a drop if exists+create to a create if not exists will help us avoid this, but in that scenario I believe the create or replace view... commands are going to fail if the schema of the view changes.

jthandy commented 8 years ago

I'm good on this PR once the existing comments have been addressed.

drewbanin commented 8 years ago

@cmerrick @jthandy great feedback, thanks. Will address your collective comments ASAP!

cmerrick commented 8 years ago

@drewbanin I have a few other PRs ready to merge that depend on this branch, lmk if there's anything I can do to help get this merged

drewbanin commented 8 years ago

Don't have much bandwidth this week -- I'll take a crack at it tonight and let you know where I leave off! Worst case, I'll finish and merge Tuesday afternoon

Sent from my iPhone

On Mar 7, 2016, at 4:22 PM, Christopher Merrick notifications@github.com wrote:

@drewbanin I have a few other PRs ready to merge that depend on this branch, lmk if there's anything I can do to help get this merged

— Reply to this email directly or view it on GitHub.

drewbanin commented 8 years ago

@cmerrick @jthandy addressed the changes the two of you mentioned. There are some issues raised w/ no clear solution yet. Let's punt on those for the moment. I'll create any necessary issues as appropriate

merging