potash / drain

pipeline library

Writing dataframe to postgres errors in Python 3 #29

Closed shaycrk closed 7 years ago

shaycrk commented 7 years ago

Description

Calling PgSQLDatabase.to_sql() under Python 3 raises an error (TypeError: a bytes-like object is required, not 'str').
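For context, here is a minimal sketch of what is probably the underlying cause: in Python 3 the stdin pipe of a subprocess.Popen object is a binary stream, so writing text to it fails with the same message. The child command is only a stand-in for psql (an assumption for illustration, not what drain actually runs).

```python
import subprocess
import sys

# Stand-in for the psql process: a child Python that just consumes stdin.
p = subprocess.Popen([sys.executable, '-c', 'import sys; sys.stdin.read()'],
                     stdin=subprocess.PIPE)

message = None
try:
    # p.stdin is a binary stream in Python 3, so a text str is rejected.
    p.stdin.write('census_tract_id,population\n')
except TypeError as e:
    message = str(e)
finally:
    p.stdin.close()
    p.wait()
```

Under Python 2 the same write succeeds because str is already a byte string, which would explain why the problem only shows up in Python 3.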

A simple workaround (though perhaps not the best solution) is to create the CSV in memory and convert it to a bytes object before passing it to stdin for the COPY, that is:

csv_bytes = bytes(frame.to_csv(None, index=index, encoding='utf8'), 'utf8')

psql_out = p.communicate(input=csv_bytes)[0]

instead of:

frame.to_csv(p.stdin, index=index)

psql_out = p.communicate()[0]

at util.py line 456. There may be a more efficient solution, though, and we should confirm that this also works in Python 2.
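As a rough sketch of how the workaround could be made safe for both Python 2 and 3: encode only when the CSV is not already bytes, then hand it to communicate(). The helper name pipe_csv_to_command and the echo command standing in for psql are hypothetical, not part of drain.

```python
import subprocess
import sys


def pipe_csv_to_command(csv_text, cmd):
    """Send CSV data to a subprocess's stdin as bytes and return its stdout."""
    # Python 3: str is text and must be encoded before it goes down the pipe.
    # Python 2: str is already bytes and passes through unchanged.
    if not isinstance(csv_text, bytes):
        csv_text = csv_text.encode('utf8')
    p = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    return p.communicate(input=csv_text)[0]


# Stand-in for the psql COPY command: a child Python that echoes stdin.
echo_cmd = [sys.executable, '-c',
            'import sys; data = sys.stdin.read(); sys.stdout.write(data)']

csv_text = u'census_tract_id,population\n01001020100,1912\n'
psql_out = pipe_csv_to_command(csv_text, echo_cmd)
```

With a DataFrame, the first argument would be frame.to_csv(None, index=index), and cmd would be whatever psql command util.py already builds. The whole CSV is held in memory, so this trades efficiency for simplicity on large frames.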

What I Did

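import os

import pandas as pd

# `util` is assumed to be drain's util module; `tract_file` is defined elsewhere.
db = util.create_db()

if os.path.isfile(tract_file):
    tr_acs = pd.read_csv(tract_file, dtype={'census_tract_id': str})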
    db.to_sql(tr_acs, name='acs_tract', schema='static', if_exists='replace', index=False)
potash commented 7 years ago

Fixed #30