cloudflare / sqlalchemy-clickhouse

Apache License 2.0
306 stars 105 forks source link

pandas.DataFrame.to_sql using connector doesn't write the last row #10

Open vsmelov opened 7 years ago

vsmelov commented 7 years ago

There is a strange bug - when I try to use this connector to insert rows of pandas.DataFrame to ClickHouse.

import sqlalchemy as sa
import pandas

CLICKHOUSE_DSN = '***SECRET***'
ch_db = sa.create_engine(CLICKHOUSE_DSN)

pdf = pandas.DataFrame.from_records([
    {'year': 1994, 'first_name': 'Vova'},
    {'year': 1995, 'first_name': 'Anja'},
    {'year': 1996, 'first_name': 'Vasja'},
    {'year': 1997, 'first_name': 'Petja'},
])

pdf.to_sql('test_humans', ch_db, if_exists='append', index=False)

When I run it, I can see first 3 rows of pdf in table, but not the last one :-) When I use postgres connector, or, this alternative clickhouse connector https://github.com/xzkostyan/clickhouse-sqlalchemy all works fine.

apyermalkar commented 5 years ago

I guess its late, but I hope it helps others.

I believe you have an extra comma at the end of the last row. while defining the dataframe. That seems to be the reason.

vladimir-golovchenko commented 4 years ago

The is the bug in this row: https://github.com/cloudflare/sqlalchemy-clickhouse/blob/v0.1.5/connector.py#L238

To work around this one need to add a fake row:

pdf = pandas.DataFrame.from_records([
    {'year': 1994, 'first_name': 'Vova'},
    {'year': 1995, 'first_name': 'Anja'},
    {'year': 1996, 'first_name': 'Vasja'},
    {'year': 1997, 'first_name': 'Petja'},
    {}
])