datahub-project / datahub

The Metadata Platform for your Data and AI Stack
https://datahubproject.io
Apache License 2.0
9.94k stars 2.95k forks source link

Problem with preparing tables on MySQL 5.7 #117

Closed rkluszczynski closed 8 years ago

rkluszczynski commented 8 years ago

Hello, first, thanks for open sourcing this project:) I'm now testing this project to see, what it can. I'm creating database structure, but I'm using MySQL 5.7 (this mysql is outsourced). I've got such a errors when executing create_all_tables_wrapper.sql script:

ERROR 1171 (42000) at line 19 in file: 'ETL_DDL/dataset_metadata.sql': All parts of a PRIMARY KEY must be NOT NULL; if you need NULL in a key, use UNIQUE instead

I've removed all problematic DEFAULT NULL, patch file is here: data-model-ddl_patch.txt

My question is: is it safe, or is this DEFAULT NULLs on primary keys is required somewhere in code?

planckiii commented 8 years ago

Additionaly i will ask on which version of mysql DDLs from create_all_tables_wrapper.sql are working OK ?

ericsun2 commented 8 years ago

All the DDL files been deployed in MySql 5.6.x (Oracle or Percona).

I need to setup a 5.7.x instance and test the necessity of the patch provided by @rkluszczynski

rkluszczynski commented 8 years ago

@ericsun2 Thanks for the response. I would apprieciate if you would check, will it work correctly. Hopefully, everything will be OK.

rkluszczynski commented 8 years ago

@ericsun2 Hello, any success in finding time to check mysql 5.7.x?

My colleague has found that in db.wherehows.url jdbc url is needed to add: &sessionVariables=sql_mode='STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION'.

FYI: After that changes we have managed to do some feed (with errors so far) from hive metastore.

ericsun2 commented 8 years ago

I have to applied the "NOT NULL" patch for DDL. These options are needed for MySql 5.7.* only probably. I am able to add sql_mode parameter into 5.6.*'s connection string as well.

What kind of errors you encounter while ETL data from Hive?

rkluszczynski commented 8 years ago

@ericsun2 Thanks. I will reset database tables based on new definitions you have provided to check will it work correctly for us:)

In case of errors I will start to investigate them now, so I will let you know.

I first had to fix datasets tree view in web module (please take a look at PR #124). This was "no go" for us in case of deployment and without it it was harder to debug our hive dataset;-)

rkluszczynski commented 8 years ago

@ericsun2 Hello, DDLs work better now for MySQL 5.7.x. Thanks a lot! Only two table could not be created. I have addressed that in #127.

Please take a look at the description and PR and let me know if it should be changed.

rkluszczynski commented 8 years ago

@ericsun2 Thanks for merging #127. Finally, DDLs support version 5.7.x:), so I'm closing this issue. In case of Hive errors I will create new one, but at the moment I'm struggling with them.