Open toqduj opened 8 years ago
Looks like a bug in pandasql. More specifically with the thing that parses your SQL query and turns the tables into data frames.
Here's another related issue - in a query of the form
SELECT d1.wNAM, d2.wNAM FROM df_old AS d1, df_curr AS d2 LIMIT 10
only the first dataframe is recognised - in the example above, df_curr
would not be recognised.
PandaSQLException: (sqlite3.OperationalError) no such table: df_curr [SQL: 'SELECT d1.wNAM, d2.wNAM FROM df_old AS d1, df_curr AS d2 LIMIT 10;']
I ran into this issue too. I worked fine in the interpreter, but when I tried to use it in my script I got a message that said the table doesn't exist. pandasql.sqldf.PandaSQLException: (sqlite3.OperationalError) no such table: resultdf [SQL: 'select ROWID from resultdf where A != mode_average']
Pulling the whole script out of main solved it for me. Luckily it's just a little tool.
I meet this issue when I set the envs as locals()
, like below code:
pysqldf = lambda q: sqldf(q, locals())
After I change the locals()
to globals()
, then pandasql can find my table.
However, as I want to use this code in a local function, I still don't think this is a perfect solution.
Is there a way we can pass variables in query string?
As far as I understand (pardon my beginner knowledge), sqlite3 supports select from multiple tables. That means that I should be able to simplify this pandasql search:
q = """ SELECT * FROM measurements WHERE measurements.repetitions > 120 AND measurements.sampleID IN (SELECT samples.sampleID FROM samples WHERE samples.localResponsible = "brian" ); """
to:
q = """ SELECT * FROM measurements, samples WHERE measurements.repetitions > 120 AND measurements.sampleID = samples.sampleID AND samples.localResponsible = "brian"; """
(see, for example: http://sql.learncodethehardway.org/book/ex8.html ) However, the latter does not work, it appears I cannot use multiple tables in a "FROM" statement in pandasql. Is this a bug or missing feature in pandasql?