Datavault-UK / automate-dv

A free to use dbt package for creating and loading Data Vault 2.0 compliant Data Warehouses (powered by dbt, an open source data engineering tool, registered trademark of dbt Labs)
https://www.automate-dv.com
Apache License 2.0
510 stars 131 forks source link

[BUG] Binary value too long and would be truncated #1

Closed stetou closed 4 years ago

stetou commented 4 years ago

Describe the bug
from the worked example I get errors when creating the Links and the sats. The hubs are fine.

To Reproduce Steps to reproduce the behavior:

  1. Following https://dbtvault.readthedocs.io/en/latest/loading/
  2. dbt run --models tag:link
  3. See my screenshot to see the error

Screenshots If applicable, add screenshots to help explain your problem. image

I'm using dbt 0.15 and revision: dbtVault v0.4 I choose sha as hash, maybe it is related? i guess not since the hubs are ok...

DatavaultUK commented 4 years ago

Thanks for reporting this. dbt has introduced a breaking change in version 0.15. The code should work fine with version 0.14. We're aware of the problem and hope to have a fix in place for 0.15 next week. In the meantime try version 0.14 and let us know if the problem persists. Regards, Neil

DVAlexHiggs commented 4 years ago

Thanks for reporting this. dbt has introduced a breaking change in version 0.15. The code should work fine with version 0.14. We're aware of the problem and hope to have a fix in place for 0.15 next week. In the meantime try version 0.14 and let us know if the problem persists. Regards, Neil

This is correct. We'll have an update for 0.15 soon. For now I recommend running the worked example in a python virtual environment using the provided requirements.txt to prevent disruption to other projects. Thanks for the report!

DVAlexHiggs commented 4 years ago

@stetou Just tried running it with v0.15 of dbt to check our suspicions and it seems to be running normally (see below) I ran the load of the links using the command dbt run --models @tag:link which also runs all models up the dependency chain:

Screenshot from 2019-11-28 22-40-19

The error you are getting actually seems to imply you may have run the link load or staging load previously using MD5 instead, (as it is saying the value is too long) is this the case?

If so, try running dbt run --full-refresh --models @tag:link to run the load again and force dbt to re-create the tables. If for some reason this doesn't work, you can safely drop the DEMO_x schemas and use the following commands in turn:

This will create the raw staging, hashed staging, hub and link layers again and get you back to where you left off.

Hope this helps!

stetou commented 4 years ago

this is what I have in dbt_project vars: date: TO_DATE('1992-01-08') hash: SHA # or MD5

I tried to load the links only once, after the hubs

Le jeu. 28 nov. 2019 à 17:42, Alex Higgs notifications@github.com a écrit :

@stetou https://github.com/stetou Just tried running it with v0.15 of dbt to check our suspicions and it seems to be running normally: [image: Screenshot from 2019-11-28 22-40-19] https://user-images.githubusercontent.com/25080503/69834079-18ade080-1230-11ea-884f-bd97f7ec402d.png

The error you are getting actually seems to imply you may have run the link load previously using MD5 instead, (as it is saying the value is too long) is this the case?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/Datavault-UK/dbtvault/issues/1?email_source=notifications&email_token=AANSLYBAFLP5TKVYSE5TPNDQWBCLZA5CNFSM4JSY67Z2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEFNQR3Y#issuecomment-559614191, or unsubscribe https://github.com/notifications/unsubscribe-auth/AANSLYBXDJ625MZ562EFHVDQWBCLZANCNFSM4JSY67ZQ .

DVAlexHiggs commented 4 years ago

@stetou

Please take a look at my edited response. I had a few additional thoughts after posting :)

stetou commented 4 years ago

deleted the demo_xxxx schemas and start from scratch, got the same error for the links. I removed dbt .15 and tried to install v0.14, v0.14.3,v.14.4 all with an error related to psycopg2 saying that pg_config is not found. I opened an issue for this, before installing postgres-server on my windows machine cause I don't need it. I'll be back as soon as I have a working installation of dbt v0.14

DVAlexHiggs commented 4 years ago

deleted the demo_xxxx schemas and start from scratch, got the same error for the links. I removed dbt .15 and tried to install v0.14, v0.14.3,v.14.4 all with an error related to psycopg2 saying that pg_config is not found. I opened an issue for this, before installing postgres-server on my windows machine cause I don't need it. I'll be back as soon as I have a working installation of dbt v0.14

Thank you for giving it a try.

I've just managed to re-produce the issue on dbt v0.15 AND v0.14, it's not related to the dbt version.

It only happens when SHA hashing is selected. It is something that we need to fix our end. Thank you again for the report, we'll bring out a fix asap. For now please use MD5 hashing.

DVAlexHiggs commented 4 years ago

@stetou

Found the issue. We forgot to update the link models in the worked example to work with v0.4 of dbtvault. The manual mappings are forcing the columns to use Binary(16) instead of (32), so it will always break when using SHA. Oops! We will release an update to the worked example shortly.

DVAlexHiggs commented 4 years ago

This should now be fixed in the latest release of the snowflakeDemo. Please download the new version and let us know how it goes. https://github.com/Datavault-UK/snowflakeDemo/releases/tag/v2.1 We recommend that you continue to use dbt v0.14.x as dbtvault is not yet tested and updated for v0.15.

DVAlexHiggs commented 4 years ago

deleted the demo_xxxx schemas and start from scratch, got the same error for the links. I removed dbt .15 and tried to install v0.14, v0.14.3,v.14.4 all with an error related to psycopg2 saying that pg_config is not found. I opened an issue for this, before installing postgres-server on my windows machine cause I don't need it. I'll be back as soon as I have a working installation of dbt v0.14

Regarding your psycopg2 issue, take a look at installation notes on the dbt 0.15 release notes. It could be related: https://discourse.getdbt.com/t/release-dbt-v0-15-0/716

stetou commented 4 years ago

thanks for this fix. I tried many things suggested on forum/release note without success. I'll wait for fishtown-analytics to answer.

DVAlexHiggs commented 4 years ago

Thanks for letting us know about the issue. Happy Datavaulting! :)