kszucs / pandahouse

Pandas interface for Clickhouse database
MIT License
228 stars, 70 forks

Added quotes to the dataframe at push mess up DateTime columns #30

Open DrissiReda opened 4 years ago

DrissiReda commented 4 years ago

I'm reading data from one ClickHouse database and writing it to another.

My script is basically this:

import pandahouse as ph

# url1 and url2 are the HTTP endpoints of the source and target servers
df = ph.read_clickhouse("SELECT * FROM test.testing", index=True, connection=dict(host=url1))
ph.to_clickhouse(df, table='testing', index=False, chunksize=5000, connection=dict(database='test', host=url2))

The dataframe seems clean. This is the output I'm getting:

Code: 27, e.displayText() = DB::Exception: Cannot parse input: expected " before: 6-02","Donnees","Tous Sexes","Tous Ages",83,17,"","0000-00-00 00:00:00","0000-00-00 00:00:00"
"51 Marne","2020-04-09","2020-06-02","Donnees","Tous Sexes","Tous : (at row 1)

Row 1:
Column 0,   name: dep,              type: String,   parsed text: "<DOUBLE QUOTE>51 Marne<DOUBLE QUOTE>"
Column 1,   name: timestamp_data,   type: DateTime, ERROR: text "<DOUBLE QUOTE>2020-04-0" is not like DateTime

 (version (official build))

kszucs commented 3 years ago

Thanks for the report, I'm trying to find some time to fix the recent issues.

ndy-cd commented 3 years ago

I got the same problem when inserting a DateTime column into ClickHouse.

danni2019 commented 3 years ago

Hi, has this issue been fixed yet? I am using pandahouse v0.2.7 and still getting the following error:

pandahouse.http.ClickhouseException: b'Code: 27, e.displayText() = DB::ParsingException: Cannot parse input: expected \'"\' before: \'5-19","2021-05-24", type: DateTime, parsed text: "<DOUBLE QUOTE>2020-05-20<DOUBLE QUOTE>,<DOUBLE QUOTE>2021-0"ERROR: DateTime must be in YYYY-MM-DD hh:mm:ss or NNNNNNNNNN (unix timestamp, exactly 10 digits) format.\nCode: 27, e.displayText() = DB::ParsingException: Cannot parse input: expected \'"\' before: \'5-
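
A possible workaround, offered as a sketch and not confirmed by the maintainer: both error messages suggest that the value reaching ClickHouse is a bare date ("2020-04-09") where a DateTime column expects "YYYY-MM-DD hh:mm:ss", so the quotes themselves may not be the root cause. pandas tends to drop the time part when every value in a datetime column falls on midnight. The snippet below renders datetime columns as full timestamp strings before the dataframe is pushed. The helper name `stringify_datetimes` is hypothetical, and the CSV settings only approximate what pandahouse might send; that part is an assumption, not the library's documented behaviour.

```python
import csv
import io

import pandas as pd


def stringify_datetimes(df: pd.DataFrame, fmt: str = "%Y-%m-%d %H:%M:%S") -> pd.DataFrame:
    """Return a copy whose datetime columns are rendered as full timestamp strings.

    Hypothetical helper: not part of pandahouse.
    """
    out = df.copy()
    for col in out.select_dtypes(include=["datetime"]).columns:
        out[col] = out[col].dt.strftime(fmt)
    return out


# Sample row modelled on the values visible in the error message above.
df = pd.DataFrame({
    "dep": ["51 Marne"],
    "timestamp_data": pd.to_datetime(["2020-04-09"]),
})

fixed = stringify_datetimes(df)

# Serialise to CSV with quoted non-numeric fields to inspect what would be sent.
# Assumption: this approximates the payload pandahouse builds internally.
buf = io.StringIO()
fixed.to_csv(buf, header=False, index=False, quoting=csv.QUOTE_NONNUMERIC)
csv_text = buf.getvalue()
print(csv_text)
```

The converted dataframe could then be passed to `ph.to_clickhouse(fixed, ...)` in place of the original one. Whether this resolves the issue on a given pandahouse and ClickHouse version is untested here.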