duckdb / duckdb_mysql

MIT License

Failure to import data of type `geometry` due to unicode interpretation #43

Closed (minaguib closed this issue 5 months ago)

minaguib commented 5 months ago

What happens?

MySQL tables with `geometry` columns appear to be mapped to varchar/UTF-8.

This causes hard Unicode decoding failures on import.
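To illustrate why treating geometry bytes as UTF-8 text can fail, here is a minimal Python sketch. It assumes the layout the MySQL manual describes for internal geometry storage (a 4-byte little-endian SRID followed by WKB); the issue itself does not show the raw bytes, so the helper names, the SRID 4326, and the coordinates are hypothetical and chosen only for illustration.

```python
import struct


def mysql_point_blob(x, y, srid=4326):
    """Build a POINT in the assumed MySQL layout: 4-byte SRID + WKB."""
    srid_prefix = struct.pack("<I", srid)  # little-endian unsigned 32-bit SRID
    # WKB point: byte order flag (1 = little-endian), geometry type (1 = Point), X, Y
    wkb_point = struct.pack("<BIdd", 1, 1, x, y)
    return srid_prefix + wkb_point


def is_valid_utf8(data):
    """Return True if the bytes decode cleanly as UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


blob = mysql_point_blob(1.5, 2.5)
print(len(blob), is_valid_utf8(blob))  # prints: 25 False
```

Under these assumptions the value is rejected at its very first byte: SRID 4326 is 0x10E6, so the blob starts with 0xE6, a UTF-8 lead byte that must be followed by a continuation byte, but the next byte is 0x10. Any consumer that validates the column as UTF-8 text would reject such a value, which is consistent with the error reported below.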

To Reproduce

1. Attach to a MySQL DB.
2. Try to import any rows that include a geometry-typed column:

D create table minatest as select * from mysql.inventory.unit limit 1;
Error: Invalid Input Error: Invalid unicode (byte sequence mismatch) detected in segment statistics update

Excluding the 1 problematic column succeeds:

D create table minatest as select * exclude(point) from mysql.inventory.unit limit 1;
D

OS:

MacOS 23.4.0 Darwin Kernel Version 23.4.0

MySQL Version:

8.0.32 (AWS RDS)

DuckDB Version:

0.10.0

DuckDB Client:

duckdb_cli

Full Name:

Mina Naguib

Affiliation:

Hivestack Inc

Have you tried this on the latest main branch?

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

minaguib commented 5 months ago

For what it's worth, I'm aware of duckdb_spatial. I'm not interested in interrogating the geometry data in DuckDB; I simply raised this issue because I feel a simple copy should succeed.

Mytherin commented 5 months ago

Thanks for reporting! Agreed this should work.