sfu-db / connector-x

Fastest library to load data from DB to DataFrames in Rust and Python
https://sfu-db.github.io/connector-x
MIT License
1.86k stars, 147 forks

Mysql text format is interpreted as bytes instead of string #464

Open Syndorik opened 1 year ago

Syndorik commented 1 year ago

What language are you using?

Python

What version are you using?

0.3.1

What database are you using?

MySQL

What dataframe are you using?

Pandas

Can you describe your bug?

When querying a MySQL column of type text, the values in the resulting dataframe are of type bytes instead of str. This behaviour doesn't happen with varchar columns.

What are the steps to reproduce the behavior?

Query a table that has a column of type text, and check the type of the values in the dataframe (they are bytes, but should be str).

Example query / code
import connectorx as cx

df = cx.read_sql(
    "myuri", "SELECT table.text_column FROM table;"
)

What is the error?

No error, just unexpected behavior. I wanted to note that this doesn't happen with pd.read_sql.

mbarki-mohamed commented 1 year ago

I'm not sure I understood your issue correctly, but I tested the code below on a table with two columns: "varchar_val", which contains data of type "varchar", and "text_val", which holds text values.

import connectorx as cx

query = "SELECT text_val, varchar_val FROM test"
conn = "connectionuri....."
print(cx.read_sql(conn, query)["varchar_val"].dtypes)
print(cx.read_sql(conn, query)["text_val"].dtypes)

This returns "object" for both, which is the correct behaviour of pandas.

jvovk commented 1 year ago

@mbarki-mohamed this is what @Syndorik meant

[Screenshot 2023-04-03 at 17 09 15]

the data inside the column is bytes, not str type
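
The distinction matters because pandas reports dtype "object" for columns of str and columns of bytes alike, so checking dtypes cannot reveal the problem. A minimal sketch of checking the element types instead; the helper name `element_types` and the sample values are hypothetical stand-ins, not data from this issue:

```python
from collections import Counter


def element_types(values):
    """Count the Python type of each value in an iterable (e.g. a DataFrame column)."""
    return Counter(type(v).__name__ for v in values)


# Hypothetical stand-ins for the two columns discussed in this thread.
varchar_val = ["foo", "bar"]    # what a varchar column is reported to yield
text_val = [b"foo", b"bar"]     # what a text column is reported to yield

print(element_types(varchar_val))  # Counter({'str': 2})
print(element_types(text_val))     # Counter({'bytes': 2})
```

The same helper should work on a real column, e.g. `element_types(df["text_val"])`, since iterating over a pandas Series yields its values.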

jvovk commented 1 year ago

@mbarki-mohamed is it something that is going to be fixed?

Syndorik commented 1 year ago

Sorry, I totally forgot about this issue. The issue is what @jvovk describes: the data inside the column is treated as bytes instead of strings.
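
Until this is addressed in the library, one possible workaround is to decode the values after loading. This is only a sketch: the helper `decode_if_bytes` is hypothetical, and UTF-8 is an assumption, since the actual character set of the column may differ.

```python
def decode_if_bytes(value, encoding="utf-8"):
    """Return str for bytes input; leave every other value (str, None, ...) untouched."""
    if isinstance(value, (bytes, bytearray)):
        # Assumption: the column data is encoded as UTF-8.
        return value.decode(encoding)
    return value


# Hypothetical column values, including a NULL and an already-decoded string.
text_val = [b"hello", None, "already str"]
decoded = [decode_if_bytes(v) for v in text_val]
print(decoded)  # ['hello', None, 'already str']
```

On a DataFrame this could be applied with `df["text_val"] = df["text_val"].map(decode_if_bytes)`.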

Syndorik commented 8 months ago

I'm sorry to reopen the subject, but is there any work planned on this?