yhat / db.py

db.py is an easier way to interact with your databases
BSD 2-Clause "Simplified" License
1.22k stars 111 forks source link

Query using TRUNCATE fails #58

Closed ppope closed 8 years ago

ppope commented 9 years ago

Hi,

Thanks for the very cool implementation.

When connected to an Amazon Redshift database, my attempt to TRUNCATE a table

from db import DB
db = DB() 
db.query("TRUNCATE table_name;")

obtained an error

TypeError: 'NoneType' object is not iterable

Does db.py support the TRUNCATE command?

ericchiang commented 9 years ago

The query function doesn't do anything special if you just give it a string, it just defers to pandas. I suspect this is a driver issue.

Can you tell me the full results of the following code?

import psycopg2 as pg
import pandas as pd
# connect to your redshift database
con = pg.connect(user="username", password="password", host="host", port=port, dbname="dbname")
result = pd.io.sql.read_sql("TRUNCATE table_name;", con)
print result
ppope commented 9 years ago

Thanks for the reply.

Running the following (with my Redshift credentials of course)

import psycopg2 as pg
import pandas as pd
# connect to your redshift database
con = pg.connect(user="username", password="password", host="host", port=port, dbname="dbname")
#
#
pd.io.sql.read_sql("TRUNCATE table_name;", con)

Obtains:

---------------------------------------------------------------------------
DatabaseError                             Traceback (most recent call last)
<ipython-input-15-88f7d66aab7b> in <module>()
      8 #
      9 #
---> 10 pd.io.sql.read_sql("TRUNCATE table_name;", con)

/home/phil/anaconda3/envs/python2/lib/python2.7/site-packages/pandas/io/sql.pyc in read_sql(sql, con, index_col, coerce_float, params, parse_dates, columns)
    419         return pandas_sql.read_sql(
    420             sql, index_col=index_col, params=params,
--> 421             coerce_float=coerce_float, parse_dates=parse_dates)
    422 
    423     if pandas_sql.has_table(sql):

/home/phil/anaconda3/envs/python2/lib/python2.7/site-packages/pandas/io/sql.pyc in read_sql(self, sql, index_col, coerce_float, params, parse_dates)
   1044                  parse_dates=None):
   1045         args = _convert_params(sql, params)
-> 1046         cursor = self.execute(*args)
   1047         columns = [col_desc[0] for col_desc in cursor.description]
   1048         data = self._fetchall_as_list(cursor)

/home/phil/anaconda3/envs/python2/lib/python2.7/site-packages/pandas/io/sql.pyc in execute(self, *args, **kwargs)
   1039 
   1040             ex = DatabaseError("Execution failed on sql: %s" % args[0])
-> 1041             raise_with_traceback(ex)
   1042 
   1043     def read_sql(self, sql, index_col=None, coerce_float=True, params=None,

/home/phil/anaconda3/envs/python2/lib/python2.7/site-packages/pandas/io/sql.pyc in execute(self, *args, **kwargs)
   1028                 cur.execute(*args, **kwargs)
   1029             else:
-> 1030                 cur.execute(*args)
   1031             return cur
   1032         except Exception as e:

DatabaseError: Execution failed on sql: TRUNCATE table_name;
ppope commented 9 years ago

I'm not sure if this is helpful but I noted that queries like

db.query("select count(*) from table_name;")

work as expected on my machine

bmabey commented 8 years ago

The bug is in pandas (or the driver as @ericchiang as mentioned) and not in db.py. I don't know if it is specific to redshift but the underlying issue is that the TRUNCATE and other commands are just that- commands and not queries. I ran into the same issue when trying to set my schema search path. Instead of using the query method using the cursor directly is the better approach, for example:

db.cur.execute("set search_path to blah;")
ppope commented 8 years ago

OK, thanks. Closing this as it is not a bug in db.py.