meltano / meltano

Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
https://meltano.com/
MIT License
1.85k stars 166 forks source link

feature: support Snowflake as database backend #8761

Open jordan-bulka-sh opened 2 months ago

jordan-bulka-sh commented 2 months ago

Feature scope

CLI (options, error messages, logging, etc.)

Description

I'd like to see Snowflake be an option for a database backend.

edgarrmondragon commented 2 months ago

Hi @jordan-bulka-sh!

I'm curious to know what are the reasons you'd like to use Snowflake as the system db. Is it because you've already provisioned it for analytics purposes and don't want to spin and a new a different database?

Please see this thread: https://discuss.meltano.com/t/16276062/hey-guys-i-was-wondering-if-there-is-any-way-to-save-the-sta#7ef4a878-6165-4b26-841d-cfbaaec0b417.

From there:

Hey guys. I was wondering if there is any way to save the State (State backend) in Snowflake and not in AWS, Azure and co. Has anyone tried that already?
It's not supported at the moment, but I think it'd not be a great choice for storing state since the requirement is basically a key-value store. For currently supported stores, the key is either a file path, an object key or an indexed column value. Since Snowflake doesn't really have indexes, a full-table scan would be performed at least a couple times for every pipeline run. You could still try it our at your own peril, though 😅. 1. Install `snowflake-sqlalchemy` in the same venv as meltano 2. `meltano config meltano set database_uri snowflake://{user}:{password}@{account}` 3. `meltano config list`. This would run db migrations on snowflake, this is the point where things are most likely to break imo 4. Use the default state store, i.e. the system db which now points to Snowflake
great thankyou! alright, yeah i see the concern 😄
FYI I once tried the above steps, and things did indeed break during migrations, as @edgarrmondragon predicted. The root cause was lack of support by `snowflake-sqlalchemy` for certain `sqlalchemy` functions that `alembic` uses (even after I disabled the indexing-related migrations).