dagster-io / hooli-data-eng-pipelines

Example Dagster Cloud code for the Hooli Data Engineering organization.
use DbtProject in hooli_data_eng #78

Closed alangenfeld closed 4 months ago

alangenfeld commented 5 months ago

Expected changes to hooli_data_eng to move from DbtArtifacts to DbtProject and use the new state-handling features.

alangenfeld commented 5 months ago

This stack of pull requests is managed by Graphite.

github-actions[bot] commented 5 months ago

Your pull request is automatically being deployed to Dagster Cloud.

| Location | Status | Link | Updated |
| --- | --- | --- | --- |
| batch_enrichment | Building... | | Apr 15, 2024 at 07:35 PM (UTC) |
| data-eng-pipeline | Building... | | Apr 15, 2024 at 07:35 PM (UTC) |
| snowflake_insights | Building... | | Apr 15, 2024 at 07:35 PM (UTC) |
| basics | Building... | | Apr 15, 2024 at 07:35 PM (UTC) |
| demo_assets | Building... | | Apr 15, 2024 at 07:35 PM (UTC) |

cnolanminich commented 4 months ago

@alangenfeld this looks awesome! Thanks for putting it together. I added back the make deps because the dbt packages still need to be built. Once I added that in, I started getting this error in the dbt parse:

    Compilation Error
      invalid syntax for function call expression
        line 6
          {%- set resolved = ref(*_ref_args, v=_ref.get('version')) -%}

That comes from here in Dagster but I'm not sure why it's failing on this CI -- any ideas?

alangenfeld commented 4 months ago

Those lines have some blame from @rexledesma that's only a few weeks old, so hopefully he can help discern what's going on.

rexledesma commented 4 months ago

@cnolanminich what you linked shouldn't be relevant here. It's not part of the stack trace.

The error is happening in dbt parse, which doesn't call framework code. It just compiles the dbt project.

cnolanminich commented 4 months ago

Ok, I'm just not sure why the parse is failing. I pulled down the changes locally and installed the latest versions of the libraries, and the dbt parse on the project works fine. I also don't see anywhere in the dbt project where that code is getting called.

I'm a little lost about how to reproduce the parse error, which happens in GitHub Actions but not locally. @rexledesma any ideas?

rexledesma commented 4 months ago

There's some dependency thrash happening in the build:

ERROR: launchpadlib 1.10.13 requires testresources, which is not installed.
ERROR: dbt-semantic-interfaces 0.4.4 has requirement jinja2~=3.0, but you'll have jinja2 2.10.1 which is incompatible.
ERROR: dbt-semantic-interfaces 0.4.4 has requirement jsonschema~=4.0, but you'll have jsonschema 3.2.0 which is incompatible.
ERROR: dbt-core 1.7.11 has requirement Jinja2<4,>=3.1.3, but you'll have jinja2 2.10.1 which is incompatible.
ERROR: importlib-resources 6.4.0 has requirement zipp>=3.1.0; python_version < "3.10", but you'll have zipp 1.0.0 which is incompatible.
ERROR: github3-py 4.0.1 has requirement pyjwt[crypto]>=2.3.0, but you'll have pyjwt 1.7.1 which is incompatible.
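
One way to confirm this kind of thrash is to compare what is actually installed in the CI environment against the lower bounds named in the pip errors. The following is a minimal sketch, not part of this PR: the helper names (`parse_version`, `is_older`, `find_too_old`) are hypothetical, the comparison only looks at numeric lower bounds (it ignores upper bounds such as `<4` and pre-release tags), and the minimums are copied from the log above.

```python
from importlib import metadata
from itertools import takewhile


def parse_version(text):
    """Convert '2.10.1' to (2, 10, 1), keeping only the leading digits of each part."""
    parts = []
    for piece in text.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_older(have, minimum):
    """True if version string `have` is strictly below `minimum` (zero-padded compare)."""
    a, b = parse_version(have), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


def find_too_old(minimums, installed=None):
    """Return {package: (installed_version, minimum)} for packages below their minimum.

    `installed` is an optional {package: version} mapping (useful for testing);
    when omitted, versions are read from the active Python environment.
    """
    problems = {}
    for name, minimum in minimums.items():
        if installed is not None:
            have = installed.get(name)
        else:
            try:
                have = metadata.version(name)
            except metadata.PackageNotFoundError:
                have = None
        # A missing package is reported too, with None as its version.
        if have is None or is_older(have, minimum):
            problems[name] = (have, minimum)
    return problems


# Lower bounds copied from the pip errors quoted above.
MINIMUMS = {"jinja2": "3.1.3", "jsonschema": "4.0", "zipp": "3.1.0", "pyjwt": "2.3.0"}

if __name__ == "__main__":
    for name, (have, minimum) in find_too_old(MINIMUMS).items():
        print(f"{name}: installed {have}, need >= {minimum}")
```

Running this in the same step as the dbt parse would show whether the old jinja2 is the one the build actually picks up.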

alangenfeld commented 4 months ago

make deps needs to be updated to not point at the config dir, which we moved profiles.yml out of. I see

dbt deps --project-dir=dbt_project --profiles-dir=dbt_project/config

in the logs. Not sure if that's actively a problem.
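
A stale `--profiles-dir` like the one above is easy to miss. As a rough sketch (the helper `build_deps_command` is hypothetical and not part of this repo), the command could be assembled by a small function that fails loudly when the directory it is given no longer contains profiles.yml. The thread does not say where profiles.yml was moved to, so the directory passed in is left to the caller.

```python
from pathlib import Path


def build_deps_command(project_dir, profiles_dir):
    """Build the `dbt deps` argument list, failing early if profiles.yml is absent.

    Hypothetical helper: it only checks that profiles.yml exists in
    `profiles_dir`; it does not validate the file's contents.
    """
    profiles_file = Path(profiles_dir) / "profiles.yml"
    if not profiles_file.is_file():
        raise FileNotFoundError(f"no profiles.yml in {profiles_dir}")
    return [
        "dbt",
        "deps",
        f"--project-dir={project_dir}",
        f"--profiles-dir={profiles_dir}",
    ]
```

The returned list can be handed to `subprocess.run(..., check=True)`, so a moved profiles.yml stops the build at this step instead of surfacing later.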

github-actions[bot] commented 4 months ago

Your pull request at commit d0b580b2fb30890b410a3736ecb64cb502646b04 is automatically being deployed to Dagster Cloud.

| Location | Status | Link | Updated |
| --- | --- | --- | --- |
| batch_enrichment | Building... | | Apr 15, 2024 at 07:39 PM (UTC) |
| data-eng-pipeline | Building... | | Apr 15, 2024 at 07:39 PM (UTC) |
| snowflake_insights | Building... | | Apr 15, 2024 at 07:39 PM (UTC) |
| basics | Building... | | Apr 15, 2024 at 07:39 PM (UTC) |
| demo_assets | Building... | | Apr 15, 2024 at 07:39 PM (UTC) |

alangenfeld commented 4 months ago

ok the last failure here is that the branch build is trying to download artifacts that have not been uploaded yet

I will work on softening that error state, but for now I think the move may be to land this so the merge time upload can happen
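
"Softening" here could mean treating missing state artifacts as a warning instead of a hard failure. The sketch below only illustrates that idea and is not how Dagster implements it: `download` stands in for whatever callable fetches the artifacts, and the assumption that it raises `FileNotFoundError` when nothing has been uploaded is mine.

```python
import logging

logger = logging.getLogger(__name__)


def download_state_or_none(download, destination):
    """Call `download(destination)`; return the destination, or None if nothing exists yet.

    `download` is a hypothetical callable assumed to raise FileNotFoundError
    when no artifacts have been uploaded. Any other error still propagates.
    """
    try:
        download(destination)
    except FileNotFoundError:
        logger.warning("no uploaded state artifacts found; continuing without state")
        return None
    return destination
```

A branch build could then skip state-based selection when the result is `None`, which covers the first build before any merge-time upload has happened.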

github-actions[bot] commented 4 months ago

Your pull request at commit fcfc20420a00e5a7144fe502161d0e53b5f6e9f3 is automatically being deployed to Dagster Cloud.

| Location | Status | Link | Updated |
| --- | --- | --- | --- |
| batch_enrichment | | View in Cloud | Apr 15, 2024 at 08:15 PM (UTC) |
| data-eng-pipeline | | View in Cloud | Apr 15, 2024 at 08:15 PM (UTC) |
| snowflake_insights | | View in Cloud | Apr 15, 2024 at 08:15 PM (UTC) |
| basics | | View in Cloud | Apr 15, 2024 at 08:15 PM (UTC) |
| demo_assets | | View in Cloud | Apr 15, 2024 at 08:15 PM (UTC) |

github-actions[bot] commented 4 months ago

Your pull request at commit 99d72625ed7c5bc954252cada5807f32dc7df0f3 is automatically being deployed to Dagster Cloud.

| Location | Status | Link | Updated |
| --- | --- | --- | --- |
| batch_enrichment | Building... | | Apr 15, 2024 at 08:19 PM (UTC) |
| data-eng-pipeline | Building... | | Apr 15, 2024 at 08:19 PM (UTC) |
| snowflake_insights | Building... | | Apr 15, 2024 at 08:19 PM (UTC) |
| basics | Building... | | Apr 15, 2024 at 08:19 PM (UTC) |
| demo_assets | Building... | | Apr 15, 2024 at 08:19 PM (UTC) |