dagster-io / dagster

An orchestration platform for the development, production, and observation of data assets.
https://dagster.io
Apache License 2.0

Sqlmesh integration #21655

Open dbrtly opened 6 months ago

dbrtly commented 6 months ago

What's the use case?

Similar to the existing dbt integration. The integration should register models in a sqlmesh project as assets in a dagster project for end-to-end visibility.

Ideas of implementation

Similar to dbt.

One generic asset definition that enables a library and interprets the sqlmesh models as dagster assets.
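As a rough illustration of what "interpreting models as assets" could mean, here is a minimal sketch in plain Python. It does not use the real sqlmesh or dagster APIs: `AssetSketch`, `models_to_assets`, and the example model names are all hypothetical stand-ins, and the model graph is assumed to be available as a simple mapping from model name to upstream model names.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter


@dataclass(frozen=True)
class AssetSketch:
    """Hypothetical stand-in for an orchestrator asset definition."""

    key: str
    deps: tuple[str, ...]


def models_to_assets(models: dict[str, set[str]]) -> list[AssetSketch]:
    """Turn a {model_name: upstream_model_names} mapping into asset sketches.

    The result is ordered so that every asset appears after its upstreams.
    """
    order = TopologicalSorter(models).static_order()
    return [
        AssetSketch(key=name, deps=tuple(sorted(models.get(name, ()))))
        for name in order
    ]


# Made-up project graph, for illustration only.
example_models = {
    "raw.orders": set(),
    "staging.orders": {"raw.orders"},
    "marts.revenue": {"staging.orders"},
}
assets = models_to_assets(example_models)
```

A real integration would presumably read the graph from the sqlmesh project itself and emit actual dagster asset definitions; the point here is only that one generic function can derive every asset and its dependencies from the model graph.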

Additional information

No response

Message from the maintainers

Impacted by this issue? Give it a 👍! We factor engagement into prioritization

AlaaInflyter commented 6 months ago

Hey there! Any news on this?

dbrtly commented 5 months ago

Regarding the design, I would anticipate that the key diff vs dbt is state management. Dbt uses manifest.json, while sqlmesh writes to a sqlmesh schema in your data warehouse or the scheduler's db.

https://sqlmesh.readthedocs.io/en/stable/guides/connections/#overview

My intuition is that we would extend the db_resource to get equivalent asset definitions initially. Agreed?

AlaaInflyter commented 5 months ago

There's also how environments are handled. For example: currently, Dagster's docs recommend using cloned DBs, and then you'd make dbt use the correct db depending on the environment. Since Sqlmesh has a different approach to envs with its plans, there might be a need to adjust this recommendation when using sqlmesh?

dbrtly commented 4 months ago

What's the bar required to green light this? I'd like to contribute but I'm reluctant to try doing it all without guidance.

cmpadden commented 4 months ago

What's the bar required to green light this? I'd like to contribute but I'm reluctant to try doing it all without guidance.

Hi @dbrtly - excited to see that you're interested in helping build this integration!

We are working internally on improving our guides around building integrations, but this may be a good opportunity for us to work together and figure out the kinds of questions that may arise in the development process. If you wanted to take an initial stab at the general structure of the integration, using asset decorators similar to the dbt integration, I'd be happy to review. Also, I'll message you on the community Slack for us to work together, and possibly set up some time to meet / pair.

ldnicolasmay commented 3 months ago

@cmpadden @dbrtly Definitely interested in learning more about this as it develops. Without much knowledge of the deep internals of dbt or SQLMesh, I'm not sure if I could contribute much.

But there does seem to be some interest from the SQLMesh side for supporting a Dagster integration. In their words:

We aim to support other schedulers like Dagster and Prefect in the future.

Maybe @izeigerman or @eakmanrq already have thoughts/efforts in this direction.

ghaffarialireza commented 3 months ago

still eagerly waiting

davidgasquez commented 2 months ago

The folks of Open Source Observer (cc @ravenac95) have a work in progress dagster-sqlmesh library! 🎉

cmpadden commented 2 months ago

The folks of Open Source Observer (cc @ravenac95) have a work in progress dagster-sqlmesh library! 🎉

Thanks for sharing that @davidgasquez. Link for those interested:

ravenac95 commented 2 months ago

Thanks for the post @davidgasquez :). It needs quite a bit of fleshing out but our team will be deploying with our integration this week! So expect to see many changes to it in the coming weeks as we get things figured out!

adrianbr commented 1 month ago

you guys might be excited to hear this news

dlt + sqlmesh = generated incremental scaffolding to just get right into business logic

https://dlthub.com/blog/sqlmesh-dlt-handover

looking forward to seeing all 3 tools together