jtcohen6 opened 3 years ago
shout out to all our custom adapter developers & maintainers: @jldlaughlin @mikaelene @tglunde @techindicium @silentsokolov @findinpath @dbeatty10 @codeforkjeff @fabrice-etanchaud @kaysef @Tomme @aurany
If y'all have any particular pain points about the adapter creation process, this would be a good forum to share them!
I had a rather easy job, because I built dbt-trino on top of the dbt-presto adapter.
I just went through the https://docs.getdbt.com/docs/contributing/building-a-new-adapter page.
I think a nice addition would be a small playground project with multiple branches, where each concept described on that page is implemented for a simple adapter. Going from one step to the next (by switching branches) would give a good sense of the actual sequence of steps needed to provide the basic functionality of a new adapter.
Gathering all the available community adapters for dbt (with links to their project code) would also be helpful. If I hit a problem I don't know how to solve, I could look at how another dbt adapter eventually tackled it. https://github.com/Tomme/dbt-athena is the dbt adapter for AWS Athena, but it is not listed under the available adapters.
Another helpful thing for dbt
Definitely agree with the need for improved steps 1, 3, 4. I don't know that 2 is really necessary: if you're writing an adapter, you probably already are familiar with the basics of dbt.
Some random feedback:
The pytest output for failures is very difficult to decipher. I usually have to guess.
Some documentation to clarify the relationship between python adapter methods and jinja macros would be helpful. I've spent a lot of time trying to understand this and I'm still not sure I have it 100% down. Many adapter methods simply call macros of the same name; in some cases, there's a bit of wrapper code around the macro invocation to do some other stuff.
When writing an adapter, it may make sense to override the adapter method (bypassing the macro), instead of overriding the macro. This is true for stuff in dbt-sqlite, but it's definitely a weird case.
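To make the method-vs-macro relationship concrete, here is a toy sketch of the pattern, assuming none of dbt's real classes: `MacroRegistry`, `execute_macro`, and both adapter classes below are hypothetical stand-ins, not dbt's actual API.

```python
# Toy sketch of the "adapter method wraps a same-named macro" pattern.
# All names here are illustrative stand-ins, not dbt's real interfaces.

class MacroRegistry:
    """Maps macro names to callables, standing in for compiled jinja macros."""
    def __init__(self):
        self._macros = {}

    def register(self, name, fn):
        self._macros[name] = fn

    def execute_macro(self, name, **kwargs):
        return self._macros[name](**kwargs)


class BaseAdapter:
    def __init__(self, macros):
        self.macros = macros

    def truncate_relation(self, relation):
        # Thin wrapper: the "real" SQL lives in the macro of the same name.
        return self.macros.execute_macro("truncate_relation", relation=relation)


class MyAdapter(BaseAdapter):
    # Overriding at the Python level bypasses the macro entirely --
    # sometimes simpler than overriding the macro, as noted above.
    def truncate_relation(self, relation):
        return f"DELETE FROM {relation}"  # e.g. for a DB without TRUNCATE


macros = MacroRegistry()
macros.register("truncate_relation", lambda relation: f"TRUNCATE TABLE {relation}")

print(BaseAdapter(macros).truncate_relation("analytics.events"))  # via the macro
print(MyAdapter(macros).truncate_relation("analytics.events"))    # via the override
```

The design choice the sketch highlights: overriding the macro changes the SQL for everyone who calls the method, while overriding the method swaps out the whole code path, macro included.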
EDIT: I see that the above has already been implemented on the docs website; it's been a while since I've looked at it and didn't even realize there's a whole new set of pages for how to build an adapter. Very nicely done!
I don't think there's currently a test that exercises `docs generate`. dbt-sqlite was passing the test suite, but a user reported a crash when running `docs generate` because there was a bug in my override of the `_get_one_catalog` method, which is probably unusual for an adapter to override. Still, I think it's a code path worth exercising.
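A generic test along these lines wouldn't need to be big. The sketch below is a toy, not dbt's actual interface: `build_catalog` and its row tuples are hypothetical, but the point is that the suite should call whatever code path `docs generate` relies on and assert on its output.

```python
from collections import defaultdict

# Hypothetical stand-in for the catalog-building code path behind
# `dbt docs generate` (in a real adapter, _get_one_catalog and friends).
def build_catalog(rows):
    """Group (schema, table, column, dtype) rows into a nested catalog dict."""
    catalog = defaultdict(dict)
    for schema, table, column, dtype in rows:
        catalog[(schema, table)][column] = dtype
    return dict(catalog)


def test_docs_generate_code_path():
    rows = [
        ("main", "events", "id", "INTEGER"),
        ("main", "events", "ts", "TEXT"),
        ("main", "users", "id", "INTEGER"),
    ]
    catalog = build_catalog(rows)
    # A bug in the adapter's catalog override would surface here,
    # instead of at a user's first `docs generate`.
    assert catalog[("main", "events")] == {"id": "INTEGER", "ts": "TEXT"}
    assert ("main", "users") in catalog


test_docs_generate_code_path()
print("docs-generate code path exercised")
```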
While trying (unsuccessfully) to get to the bottom of the hangup described in #24, @swanderz and colleagues provided some great feedback about how these tests could better aid the experience of creating, publishing, and maintaining an adapter.
Problems
UX
I believe it's helpful to separate the experience of creating a new adapter into four separate steps:
1. **Bootstrap working adapter**: Implement just enough properties and methods on top of the base adapter to get something usable. While the `create_adapter_plugins.py` script is a nice shortcut to the right file structure, we're missing guidance on `include/global_project`, and on the central question: how do I get `connections.py` working, such that dbt and my database speak SQL to each other?
2. **Learning core concepts**: Now that the adapter has been bootstrapped and is usable at a basic level, use it alongside a toy project (e.g. `jaffle_shop`) to learn about basic dbt operations. What's a model, what's a test, what are the most common tasks (`run`, `test`)? What's the actual SQL that dbt is sending for each of these? An experienced dbt-core user can skip this step, but someone brand new to dbt should expect to spend some time here.
3. **Check for functional parity**: dbt Labs should maintain a suite of integration tests that check for lowest-common-denominator functionality. Once an adapter passes all of those tests (or has a good reason not to), we can feel good about publishing it and kicking off a beta. (This step is closest to the aim of the current test suite, but the current suite makes it very hard to debug or introspect failures.)
4. **Extending the test framework**: There are always going to be adapter-specific functional pieces that won't be covered by the generic suite. Additionally, once an adapter is published, there will be bugs and community contributions that warrant additional testing. It's crucial that the Labs-supported test suite exposes enough foundational constructs to make it easy to write totally new custom integration tests (similar to what we've done in dbt-spark). The interfaces for writing those tests should feel similar to the interfaces for implementing/overriding the suite's built-in tests. We should not optimize for the fewest or subtlest lines of code; instead, we should leverage well-established pytest constructs and aim for clarity and transparency, even at the cost of verbosity.