kieferk closed this issue 5 years ago
Just some food for thought from adding a small verb in tidyr, and what I liked about the dplyr/tidyr implementation: single dispatch (@singledispatch in Python) to make the functions work with different inputs. Instead of writing if isinstance(input, Series): ... elif isinstance(input, GroupBy): ... else: raise Error(), you write a simple base function which just raises NotImplementedError, decorate it with @singledispatch, and then add the rest of the functions for Series, GroupBy and so on. Extending the system to SQL backends is then "only" a matter of adding a function for a suitable wrapper around a SQL connection. I wrote a small prototype for it and still want to try it in dfply.
I like both of those. I was trying to do something similar to the @singledispatch thing you're talking about with the new TypeAction class, but this sounds better in many ways. I will check out your prototype. I'll also see if it would be easy to put the non-standard evaluation functions in and redesign the internals to mimic this kind of functionality. Thanks for the tips!
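For readers following along, the dispatch pattern described above can be sketched roughly as follows. This is only an illustration using the standard library's functools.singledispatch: the Series and GroupBy classes here are small hypothetical stand-ins (not the pandas types), and the verb name total is made up for the example. It is not the prototype mentioned above, nor dfply's actual code.

```python
from functools import singledispatch


class Series:
    """Hypothetical stand-in for a column of values (not pandas.Series)."""

    def __init__(self, values):
        self.values = list(values)


class GroupBy:
    """Hypothetical stand-in for grouped data: maps a group key to a Series."""

    def __init__(self, groups):
        self.groups = dict(groups)


@singledispatch
def total(data):
    # Base function: only reached when no implementation has been
    # registered for the type of `data`.
    raise NotImplementedError(f"total() does not support {type(data).__name__}")


@total.register(Series)
def _total_series(data):
    # Implementation used when the first argument is a Series.
    return sum(data.values)


@total.register(GroupBy)
def _total_groupby(data):
    # Implementation for GroupBy: dispatch again on each group's Series.
    return {key: total(series) for key, series in data.groups.items()}
```

Supporting another input type (for example a wrapper around a SQL connection) would then mean registering one more function with total.register, without touching the existing implementations.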
@janschulz I read over these and I definitely like the singledispatch/registration pattern for these verbs. I think this is a more elegant solution than the one I was building towards.
Maybe I'm being overly optimistic, but I think that I could change the base code that deals with piping to fit into this framework without a huge amount of hassle. I just need to make sure it all plays nicely with the symbolic X. Right now I have the selection helpers working in the feature/collapsed-selection branch, which was tricky to get working correctly. I'm not sure how that will go with this setup until I test things out.
Regardless, I think your suggested pattern is the way to go because it is readable and easy to extend. With the winter holidays coming up in a week or two I am going to use that free time to try and change the implementation to match what you've laid out in your blog post.
Since you are changing quite a bit anyway for 1.0.0, maybe you could consider using filter (as in dplyr) instead of mask? I am not sure why you deviated in this particular case while otherwise keeping to dplyr's verbs.
Also: Could you include the option to filter based on the index?
Finally: Do you have an estimate for when the 1.0.0 version will come out?
Is 1.0.0 abandoned? Or why did you close the topic?
I'm also keen to know!
Hello users,
I have been working on and off on an upcoming version which will be v1.0.0 due to its incompatibility with previous versions. You can actually view this nearly-complete version in the "feature/collapsed-selection" branch.
Originally I was just working on getting the selection helper functions working, but in order to do that a lot had to change with the base decorators. The selection helper functions now work as arguments to the select function (such as contains("ca") for finding columns that contain that string). Previously, there were a variety of different decorators that would be stacked together to get different kinds of behavior. In the new conceptualization, the only decorator that will be used is the @dfpipe decorator, and it will take keyword arguments that can change its behavior (it can also be used without keyword arguments, in which case it will behave as the current @dfpipe decorator does).
If you're interested in checking it out and have any questions/comments/concerns, please go ahead. I don't have a timetable for its release, but considering it's nearing completion and currently passes all the written unit tests, I don't expect it will be much longer.
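As a rough illustration of the "one decorator, optionally with keyword arguments" idea described above, here is a minimal sketch in plain Python. The names pipe_verb and copy_input are hypothetical and chosen only for this example; they are not the @dfpipe decorator or its actual keyword arguments, which may look quite different in the branch.

```python
import functools


def pipe_verb(func=None, *, copy_input=False):
    """Hypothetical decorator usable as @pipe_verb or @pipe_verb(copy_input=True)."""

    def decorate(f):
        @functools.wraps(f)
        def wrapper(data, *args, **kwargs):
            if copy_input:
                # Work on a shallow copy so the caller's list is not mutated.
                data = list(data)
            return f(data, *args, **kwargs)

        return wrapper

    if func is not None:
        # Used bare (@pipe_verb): Python passes the function in directly.
        return decorate(func)
    # Used with keyword arguments: return the real decorator.
    return decorate


@pipe_verb
def append_item(rows, item):
    rows.append(item)
    return rows


@pipe_verb(copy_input=True)
def append_item_safely(rows, item):
    rows.append(item)
    return rows
```

With this shape, the bare form keeps the default behavior, while the keyword form changes it without needing a second decorator stacked on top.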