Open chelsbells opened 6 years ago
If you filter on columns of R in the WHERE clause, rows where those columns are NULL don't evaluate to true and are dropped, so the LEFT JOIN behaves like an INNER JOIN.
Change your sqlLeft to:

    sqlLeft = """
    SELECT L.*, R.*
    FROM L
    LEFT JOIN R
      ON L.jid = R.jid
     AND R.date BETWEEN L.start AND L.end
    """
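The difference between the two placements of the date filter can be checked without pandas at all. The following is a minimal sketch that uses Python's built-in sqlite3 module directly (pandasql runs its queries on SQLite by default). It assumes the dates are stored as ISO-formatted text, and it quotes the "end" column name as a precaution because END is an SQL keyword; the variable names are made up for illustration.

```python
import sqlite3

# In-memory database holding the same rows as the L and R data frames.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE L (jid INTEGER, start TEXT, "end" TEXT)')
conn.execute("CREATE TABLE R (jid INTEGER, date TEXT, amount INTEGER)")
conn.executemany("INSERT INTO L VALUES (?, ?, ?)", [
    (1, "2000-01-01", "2000-01-31"),
    (1, "2000-01-02", "2000-02-28"),
    (2, "2000-03-01", "2000-03-31"),
    (3, "2000-05-01", "2000-05-31"),
])
conn.executemany("INSERT INTO R VALUES (?, ?, ?)", [
    (1, "2000-02-04", 1),
    (3, "2000-01-05", 2),
    (1, "2000-01-30", 3),
    (2, "2000-03-10", 4),
    (3, "2000-04-28", 5),
])

# Date filter in WHERE: applied after the join, so left rows with no
# qualifying R row are removed from the result.
where_filter = conn.execute("""
    SELECT L.*, R.*
    FROM L LEFT JOIN R ON L.jid = R.jid
    WHERE R.date BETWEEN L.start AND L."end"
""").fetchall()

# Date filter in ON: part of the match condition, so left rows with no
# qualifying R row are kept and padded with NULLs.
on_filter = conn.execute("""
    SELECT L.*, R.*
    FROM L LEFT JOIN R
      ON L.jid = R.jid
     AND R.date BETWEEN L.start AND L."end"
""").fetchall()

# A NULL operand makes BETWEEN evaluate to NULL, which WHERE treats as not true.
null_check = conn.execute(
    "SELECT NULL BETWEEN '2000-05-01' AND '2000-05-31'"
).fetchone()[0]

print(len(where_filter), len(on_filter), null_check)
```

With this data the WHERE version returns 4 rows and the ON version returns 5; the extra row is the jid 3 entry from L with None in every R column.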
When running a LEFT JOIN, rows in the left table without a match in the right table are excluded from the results table. This matches the expected results of an inner join, whereas in a left join one would expect these rows to remain, with nulls for the values of the joined fields.
I've set up the following dummy tables as pandas data frames.
    import pandas as pd
    import pandasql

    R = pd.DataFrame({
        'jid': [1, 3, 1, 2, 3],
        'date': ['2000-02-04', '2000-01-05', '2000-01-30', '2000-03-10', '2000-04-28'],
        'amount': [1, 2, 3, 4, 5],
    })
    L = pd.DataFrame({
        'jid': [1, 1, 2, 3],
        'start': ['2000-01-01', '2000-01-02', '2000-03-01', '2000-05-01'],
        'end': ['2000-01-31', '2000-02-28', '2000-03-31', '2000-05-31'],
    })

    # Parse the date strings into datetime columns
    R['date'] = pd.to_datetime(R['date'])
    L['start'] = pd.to_datetime(L['start'])
    L['end'] = pd.to_datetime(L['end'])
Table R
Table L
    sqlLeft = """
    SELECT L.*, R.*
    FROM L
    LEFT JOIN R ON L.jid = R.jid
    WHERE R.date BETWEEN L.start AND L.end
    """

    sqlin = """
    SELECT L.*, R.*
    FROM L
    INNER JOIN R ON L.jid = R.jid
    WHERE R.date BETWEEN L.start AND L.end
    """

    resultsL = pandasql.sqldf(sqlLeft, locals())
    resultsin = pandasql.sqldf(sqlin, locals())
Expected Results of Left Join
Actual Results of Left Join
Results of Inner Join