duckdb / duckdb_mysql

https://duckdb.org/docs/extensions/mysql
MIT License
55 stars 13 forks source link

Errors when importing data #24

Closed ulissescappato closed 10 months ago

ulissescappato commented 10 months ago

What happens?

2 errors:

SQL Error: Conversion Error: date field value out of range: "0000-00-00", expected format is (YYYY-MM-DD)

SQL Error: Invalid Input Error: Invalid unicode (byte sequence mismatch) detected in segment statistics update

It appears that the extension that is downloaded is not the latest version. How to download the latest version from GitHub?

To Reproduce

INSTALL mysql; ATTACH 'host=localhost user=root password=0000 port=3306 database=test' AS mysql_test (TYPE mysql_scanner, READ_ONLY); CREATE TABLE users AS FROM mysql_test.users; CREATE TABLE products AS FROM mysql_test.products;

OS:

Windows 11

MySQL Version:

5.7.44

DuckDB Version:

0.9.2

DuckDB Client:

DBeaver

Full Name:

Ulisses

Affiliation:

Developer

Have you tried this on the latest main branch?

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

Mytherin commented 10 months ago

You can download the latest extension using FORCE INSTALL mysql. If errors persist a reproduction with the data set would be much appreciated.

ulissescappato commented 10 months ago

You can download the latest extension using FORCE INSTALL mysql. If errors persist a reproduction with the data set would be much appreciated.

Mark, unfortunately errors continue to occur. I will try to export a sampling of the data. Thanks.

ulissescappato commented 10 months ago

MySQL 5.7.44:

CREATE TABLE `products` (
  `id_product` int NOT NULL AUTO_INCREMENT,
  `date_created` date NOT NULL,
  PRIMARY KEY (`id_product`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

INSERT INTO products (date_created) VALUES ('0000-00-00');

Importing:

SQL Error: Conversion Error: date field value out of range: "0000-00-00", expected format is (YYYY-MM-DD)


CREATE TABLE `users` (
  `id_user` int NOT NULL AUTO_INCREMENT,
  `name` varchar(45) NOT NULL,
  PRIMARY KEY (`id_user`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

INSERT INTO users (name) VALUES ('Graça');

Importing:

SQL Error: Invalid Input Error: Invalid unicode (byte sequence mismatch) detected in segment statistics update

maurice1408 commented 10 months ago

In MySQL it is possible to insert timestamp values "0000-00-00 00:00:00" by not setting the NO_ZERO_DATE and NO_ZERO_IN_DATE opt@sql_mode variable.

see

https://dev.mysql.com/doc/refman/8.0/en/sql-mode.html#sqlmode_no_zero_date

On read via the mysql_scanner extension duckdb is throwing the reported error

Error: Conversion Error: timestamp field value out of range: "0000-00-00 00:00:00", expected format is (YYYY-MM-DD HH:MM:SS[.US][±HH:MM| ZONE])

┌─────────────────┬────────────┐
│ library_version │ source_id  │
│     varchar     │  varchar   │
├─────────────────┼────────────┤
│ v0.9.2          │ 3c695d7ba9 │
└─────────────────┴────────────┘
Mytherin commented 10 months ago

I've fixed the zero-date and zero-timestamp issues in #26 and pushed a test in #27. I haven't managed to reproduce the unicode issue using my local MySQL instance (v8.1.0) so that might be an issue related to older MySQL versions.

Mytherin commented 10 months ago

Might have a fix for the unicode issue in #28 by setting character_set_results to utf8mb4 explicitly.

ulissescappato commented 10 months ago

I've fixed the zero-date and zero-timestamp issues in #26 and pushed a test in #27. I haven't managed to reproduce the unicode issue using my local MySQL instance (v8.1.0) so that might be an issue related to older MySQL versions.

Thanks, @Mytherin!

How can I test it? Do I need use "FORCE INSTALL mysql"?

Mytherin commented 10 months ago

Try force install mysql_scanner from 'http://nightly-extensions.duckdb.org';

ulissescappato commented 10 months ago

Try force install mysql_scanner from 'http://nightly-extensions.duckdb.org';

Now it's working. Thank you very much, @Mytherin!