yhat / pandasql

sqldf for pandas
MIT License
1.33k stars 185 forks source link

SELECT FROM multiple tables #46

Open toqduj opened 8 years ago

toqduj commented 8 years ago

As far as I understand (pardon my beginner knowledge), sqlite3 supports select from multiple tables. That means that I should be able to simplify this pandasql search:

q = """ SELECT * FROM measurements WHERE measurements.repetitions > 120 AND measurements.sampleID IN (SELECT samples.sampleID FROM samples WHERE samples.localResponsible = "brian" ); """

to:

q = """ SELECT * FROM measurements, samples WHERE measurements.repetitions > 120 AND measurements.sampleID = samples.sampleID AND samples.localResponsible = "brian"; """

(see, for example: http://sql.learncodethehardway.org/book/ex8.html ) However, the latter does not work, it appears I cannot use multiple tables in a "FROM" statement in pandasql. Is this a bug or missing feature in pandasql?

glamp commented 8 years ago

Looks like a bug in pandasql. More specifically with the thing that parses your SQL query and turns the tables into data frames.

psychemedia commented 8 years ago

Here's another related issue - in a query of the form

SELECT d1.wNAM, d2.wNAM FROM df_old AS d1, df_curr AS d2 LIMIT 10

only the first dataframe is recognised - in the example above, df_curr would not be recognised.

PandaSQLException: (sqlite3.OperationalError) no such table: df_curr [SQL: 'SELECT d1.wNAM, d2.wNAM FROM df_old AS d1, df_curr AS d2 LIMIT 10;']
kalistibot commented 7 years ago

I ran into this issue too. I worked fine in the interpreter, but when I tried to use it in my script I got a message that said the table doesn't exist. pandasql.sqldf.PandaSQLException: (sqlite3.OperationalError) no such table: resultdf [SQL: 'select ROWID from resultdf where A != mode_average']

Pulling the whole script out of main solved it for me. Luckily it's just a little tool.

verazuo commented 6 years ago

I meet this issue when I set the envs as locals(), like below code:

pysqldf = lambda q: sqldf(q, locals())

After I change the locals() to globals(), then pandasql can find my table. However, as I want to use this code in a local function, I still don't think this is a perfect solution.

noorulPro commented 9 months ago

Is there a way we can pass variables in query string?