Closed giordano closed 1 month ago
I would love to move to package extensions. I've read about them and watched a video, but I haven't had the time and depth of understanding to do it yet.
I think adjusting the API would be relatively straightforward, with `@collect` and `connect` being the two pieces that need to be broken down and portioned off into extensions. All of the `get_table_metadata` functions are already specific to their backend, so those would be easy to just move into an extension.
Quick update,
I've figured out how extensions work and have been able to make separate ones for Postgres, SQLite, Athena, GBQ, MySQL, MsSQL, and Clickhouse. This has reduced the number of dependencies from 16 to 9.
It seems using the underlying SQL mode is sufficient to allow for different `collect` dispatches.
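For anyone following along, here is a rough sketch of what dispatching a collect step on the SQL mode could look like. It is not the package's actual code: `MainPkg`, `SQLBackend`, `Postgres`, `SQLite`, `collect_query`, and `PostgresExtSketch` are all hypothetical names, and the second module is only a stand-in for a real extension, which would live under `ext/` and be loaded automatically when the backend package is imported.

```julia
# Hypothetical names throughout; a sketch, not the package's real API.
module MainPkg

export SQLBackend, Postgres, SQLite, collect_query

# The "mode" types stay in the main package so users can always construct them.
abstract type SQLBackend end
struct Postgres <: SQLBackend end
struct SQLite <: SQLBackend end

# Fallback method: reached when no backend-specific method has been loaded.
function collect_query(mode::SQLBackend, sql::AbstractString)
    error("No collect method for $(typeof(mode)); load the matching backend package first.")
end

end # module MainPkg

# Stand-in for an extension module. In a real package this code would only
# be loaded once the Postgres driver package is imported.
module PostgresExtSketch

using ..MainPkg

# Adds a more specific method to the main module's generic function.
function MainPkg.collect_query(::MainPkg.Postgres, sql::AbstractString)
    return "postgres would run: " * sql
end

end # module PostgresExtSketch

using .MainPkg

# Postgres has a specific method, so this dispatches to the "extension".
result = collect_query(Postgres(), "SELECT 1")
println(result)

# SQLite has no specific method here, so the fallback raises an error.
sqlite_failed = try
    collect_query(SQLite(), "SELECT 1")
    false
catch err
    err isa ErrorException
end
println(sqlite_failed)
```

Keeping the mode types in the main module and putting only the methods in the extension means the user-facing API does not need to change; the fallback just turns a missing backend into a readable error.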
I think for now, I will plan to leave Databricks and Snowflake in the main module because it simplifies the collecting aspect a bit and maintains a little more flexibility for using multiple backends in one session.
The only friction point I see might be collecting from certain backend combinations in the same session, like GBQ and then AWS for example, but that seems unlikely.
This package has loads of dependencies, on many database backend packages. Would it be possible to switch to using package extensions instead? You'd probably need to slightly change the API though, to introduce new types to dispatch the functions for the different backends.