yhat / pandasql

sqldf for pandas
MIT License
1.31k stars, 184 forks

Support for this project #63

Open javadba opened 6 years ago

javadba commented 6 years ago

From the title of one of the "recent" (in commit order, not by calendar) commits, 15511b9:

"this is my project. there are many like it, but this one is fucking mine"

Well - then are you planning to do anything with it? There has been little activity since that post and NONE in the past ten months - and the issues are piling up.

When pandasql works, it is very, very useful. But it does have many bugs. Do you have a suggestion for a path forward here?

javadba commented 6 years ago

Anyone have a more active fork of this repo - which is essentially moribund?

zbrookle commented 3 years ago

@javadba I have a package that I created and am actively maintaining called dataframe_sql, which parses SQL and translates it to native pandas operations. While that unfortunately won't fix this package, it does provide an active alternative, and I welcome any pull requests or suggestions that others may have.
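As a rough illustration of what "translating SQL to native pandas operations" means, here is a hand-written sketch of one query and an equivalent pandas pipeline. It does not use or depict the dataframe_sql API; the `orders` table, its columns, and the data are hypothetical.

```python
import pandas as pd

# Hypothetical sample data, not taken from this thread.
orders = pd.DataFrame(
    {
        "customer": ["ann", "bob", "ann", "cy", "bob", "ann"],
        "amount": [10.0, 5.0, 20.0, 9.0, 2.5, 1.0],
    }
)

# SQL being illustrated:
#   SELECT customer, SUM(amount) AS total
#   FROM orders
#   GROUP BY customer
#   HAVING SUM(amount) > 8
#   ORDER BY total DESC

# GROUP BY customer with SUM(amount) AS total
totals = orders.groupby("customer", as_index=False).agg(total=("amount", "sum"))

result = (
    totals[totals["total"] > 8]             # HAVING SUM(amount) > 8
    .sort_values("total", ascending=False)  # ORDER BY total DESC
    .reset_index(drop=True)
)
print(result)
```

Each SQL clause maps onto one pandas step, which is why a translator can work clause by clause once the query has been parsed.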

javadba commented 3 years ago

Hi Zach: SQL is a tough nut to crack if you're doing it from scratch. It appears you have hit the core set of grammar pieces, including group by / having and a solid set of joins.

Some of the additional features that might take time:

Good luck on the project: I will keep an eye on it.

stephenb

zbrookle commented 3 years ago

Hey Stephen,

I've actually just been holding off on implementing some of those things, but they're not as hard as you might think because this package actually relies on a different package I wrote, which compiles it into a set theory library called ibis, which then interfaces directly with pandas. Part of the problem is that it's hard to know what other features people want when they don't raise issues, or ask for features, but I really appreciate your insight and I'll work on adding those.

javadba commented 3 years ago

The analytical / windowing functions are a solid chunk of technical goodies if you're hungry. The string/numeric/timestamp functions would be a huge undertaking due to their breadth: it would more likely be a case of choosing the most commonly used ones and then, as you said, seeing which ones are more generally clamored for. Sometimes those make easy "getting started" PRs for other folks to add. That's how I got my start in contributing to Spark SQL 6 yrs ago.
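For context on what the windowing functions involve, here is a minimal sketch of two common window expressions written directly in pandas. It is an illustration only, not code from dataframe_sql or pandasql; the `sales` table and its columns are hypothetical.

```python
import pandas as pd

# Hypothetical sample data, not taken from this thread.
sales = pd.DataFrame(
    {
        "region": ["east", "east", "west", "west", "west"],
        "amount": [100, 250, 80, 300, 120],
    }
)

# SQL being illustrated:
#   SELECT region, amount,
#          ROW_NUMBER() OVER (PARTITION BY region ORDER BY amount DESC) AS rn,
#          SUM(amount)  OVER (PARTITION BY region)                      AS region_total
#   FROM sales

# ROW_NUMBER(): rank rows within each region, largest amount first.
sales["rn"] = (
    sales.groupby("region")["amount"]
    .rank(method="first", ascending=False)
    .astype(int)
)

# SUM() OVER (PARTITION BY ...): per-group sum broadcast back to every row.
sales["region_total"] = sales.groupby("region")["amount"].transform("sum")

print(sales)
```

Unlike GROUP BY, a window function keeps one output row per input row, which is why `transform` and `rank` are used here rather than an aggregation that collapses each group.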

javadba commented 3 years ago

My current work is actually a JavaScript project, but I'll be taking a deep learning class as well. Those typically have less direct need for SQL-type operations than an ML or stats class, but I'll keep your project in mind if some tasks can be considered in that light. I would have been all over it a year back, and I'll need it more next year.
