databricks / koalas

Koalas: pandas API on Apache Spark
Apache License 2.0

Koalas and pandas read csv result is different #2206

Open tommyhj217 opened 2 years ago

tommyhj217 commented 2 years ago

I tried to read a CSV file, and Koalas and pandas show different results.

The value below is a single column: "[""a"",""b""]"

pandas returned the result below: [image]

But Koalas returned 2 columns: [image]

I already tried escape='"'. It showed the same result.

I think the pandas result is correct, so I want to get the same result with Koalas. Thanks, please solve this issue.

itholic commented 2 years ago

Passing escape='"' seems to work for me.

CSV

# test.csv
col1,col2
"[""a"",""b""]"

Python code

# pandas
>>> import pandas as pd
>>> pd.read_csv("test.csv")
        col1  col2
0  ["a","b"]   NaN
# Koalas
>>> import databricks.koalas as ks
>>> ks.read_csv("test.csv", escape='"')
        col1  col2
0  ["a","b"]  None

pandas and Koalas show me the same result when I pass escape='"'.

If it's still not working, could you give me more detailed context about your situation?
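
For anyone else hitting this, here is a minimal sketch of why the quote handling matters. It uses Python's standard csv module as a stand-in, not pandas or Koalas, so it only illustrates the idea: when a doubled quote ("") inside a quoted field is treated as an escaped quote, the row parses as one field; when it is not, the comma inside the brackets splits the row. As far as I know, Spark's CSV reader uses a backslash as its default escape character, which would explain why escape='"' is needed on the Koalas side. The helper name parse_line is hypothetical.

```python
import csv
import io


def parse_line(line, doublequote):
    """Parse one CSV line with the stdlib parser (hypothetical helper).

    doublequote=True  -> a doubled quote inside a quoted field is one literal quote
    doublequote=False -> a quote inside a quoted field simply ends the quoted part
    """
    reader = csv.reader(io.StringIO(line), doublequote=doublequote)
    return next(reader)


row = '"[""a"",""b""]"'

# Doubled quotes treated as escapes: the whole value stays in one field.
one_field = parse_line(row, doublequote=True)
print(len(one_field), one_field)

# Doubled quotes not treated as escapes: the inner comma splits the row.
split_fields = parse_line(row, doublequote=False)
print(len(split_fields), split_fields)
```

If the second case matches what you see in Koalas, the escape option is probably not reaching the reader in your call, so sharing the exact code and versions would help.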