pytoolz / toolz

A functional standard library for Python.
http://toolz.readthedocs.org/

Module-level higher-order functions (HOFs) #88

Open eriknw opened 10 years ago

eriknw commented 10 years ago

This thread continues the discussion from https://github.com/pytoolz/toolz/issues/69#issuecomment-27925825 and is related to #67.

We may want to have several HOFs, such as:

and so on.

We already have curried, and a proof-of-concept traced is being explored in #69. We ought to explore how we wish to support HOFs. Usefulness, ease of use, and ease of code-maintenance are all extremely important.

To combine HOFs at the module level, I propose the following kind of API:

In [1]: import hof

In [2]: hof.magic.double.triple.three()
Out[2]: 18

In [3]: hof.magic.inc.inc.inc.one()
Out[3]: 4

The above is real output from a toy package I threw together. A possible issue is one can't generally do from hof.magic.inc.inc.inc import one, two, three, because the module hof.magic.inc.inc.inc is auto-generated and won't exist in sys.modules. Right now I see two options to work around this:

  1. inc3 = hof.magic.inc.inc.inc
  2. pre-initialize the most useful (or all?) combinations in sys.modules so importing will work as expected.
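The chained-attribute behavior shown in the session above can be imitated without any import machinery. This is a minimal sketch and not the code of the toy package: the class name `HOFNamespace`, the sample functions, and the choice to apply the leftmost HOF outermost are all assumptions made for illustration.

```python
class HOFNamespace:
    """Hypothetical namespace: attribute access stacks a HOF or returns a wrapped function."""

    def __init__(self, funcs, hofs, applied=()):
        self._funcs = funcs      # name -> first-order function
        self._hofs = hofs        # name -> higher-order function
        self._applied = applied  # HOFs collected so far, left to right

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        if name in self._hofs:
            # Accessing a HOF name returns a deeper namespace with that HOF stacked.
            return HOFNamespace(self._funcs, self._hofs,
                                self._applied + (self._hofs[name],))
        if name in self._funcs:
            # Accessing a function name applies the stacked HOFs, rightmost first,
            # so a.b.f becomes a(b(f)).
            func = self._funcs[name]
            for hof in reversed(self._applied):
                func = hof(func)
            return func
        raise AttributeError(name)


# Sample first-order functions (assumed; they take no arguments).
def one():
    return 1

def two():
    return 2

def three():
    return 3

# Sample higher-order functions: each takes a function and returns a function.
def double(f):
    return lambda: 2 * f()

def triple(f):
    return lambda: 3 * f()

def inc(f):
    return lambda: f() + 1


magic = HOFNamespace(
    {"one": one, "two": two, "three": three},
    {"double": double, "triple": triple, "inc": inc},
)

print(magic.double.triple.three())  # 18
print(magic.inc.inc.inc.one())      # 4
```

Because `magic.inc.inc.inc` is an ordinary object here, it can be bound to a name (`inc3 = magic.inc.inc.inc`), but it cannot be the target of a `from ... import` statement, which is the limitation described above.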

I'll explore further, but feedback for what kind of API and behavior you wish to see is also very important. Oh, and what HOFs you think may be useful.

mrocklin commented 10 years ago

I don't yet have strong thoughts on this. The magic scares me a bit.

eigenhombre commented 10 years ago

Agreed about the magic. In fact that was my objection about curried.

On Nov 12, 2013, at 10:33 AM, Matthew Rocklin notifications@github.com wrote:

I don't yet have strong thoughts on this. The magic scares me a bit.

eriknw commented 10 years ago

Yeah, this is basically a magical form of compose. Perhaps we should just use compose. Still, it was fun to build the toy package!

mrocklin commented 10 years ago

Curried should be in the sandbox.

I think that a lot of the annoyance of applying higher-order functions can be fixed by mutating the existing namespace. Let me show what I mean by that:

from toolz import curry, map, groupby, first, second, take, drop, valmap, keymap, get
map, groupby, take, drop, valmap, keymap, get = map(curry, [map, groupby, take, drop, valmap, keymap, get])

We name many of these functions three times: once for the import and then twice to curry. In practice (at least in my practice) the curry line can grow quite large. Those three mentions could be reduced to two, one to import and one to curry, with something like the following. I use ! syntax below to mark that dohof performs an impure action.

from toolz import map, groupby, first, second, take, drop, valmap, keymap, get
dohof!(curry, [map, groupby, take, drop, valmap, keymap, get])

If we don't want to deal with globals then maybe this becomes the following

from toolz import map, groupby, first, second, take, drop, valmap, keymap, get
dohof!(curry, locals(), [map, groupby, take, drop, valmap, keymap, get])

eriknw commented 10 years ago

Heh, trading magic for magic. By default, I think dohof should inject the names into the local namespace from which it was called. If it injects into the global namespace of a module, then it may change the behavior of other functions within the module, which violates the purity principle. I think a global=False or local=True keyword in dohof makes sense. There are probably nuances I'm missing, so I am open to arguments for using globals instead of locals if dohof is added. Accessing the locals (or globals) of the previous frame is easy:

from inspect import currentframe
frame = currentframe().f_back
frame.f_locals  # <-- this is the `locals()` dictionary

eriknw commented 10 years ago

I'm beginning to like dohof more and more (although we should continue to try to come up with a better name). What do you think of options to add a prefix or suffix to the created functions? For example, dohof(curry, [map, groupby], prefix='c') would create cmap and cgroupby.
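To make the proposal concrete, here is a rough sketch of what such a `dohof` might look like. No implementation is given in this thread, so the signature, the `namespace` parameter, and the default of writing into the caller's module globals are assumptions; the helper `twice` is a stand-in HOF used only for the demonstration.

```python
from inspect import currentframe


def dohof(hof, funcs, prefix="", suffix="", namespace=None):
    """Hypothetical dohof: bind hof(func) for each func under a derived name.

    If no namespace is given, the caller's module globals are used
    (an assumption; this is an impure action by design).
    """
    if namespace is None:
        namespace = currentframe().f_back.f_globals
    for func in funcs:
        namespace[prefix + func.__name__ + suffix] = hof(func)


def twice(func):
    """Stand-in HOF: apply a one-argument function two times."""
    def wrapper(x):
        return func(func(x))
    return wrapper


def inc(x):
    return x + 1


# Passing an explicit dict keeps the demonstration free of global side effects.
ns = {}
dohof(twice, [inc], prefix="t", namespace=ns)
print(ns["tinc"](1))  # 3
```

With empty `prefix` and `suffix` and no explicit namespace, the same call would rebind the original names in the calling module, which is the in-place behavior discussed above.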

eriknw commented 10 years ago

Ah, apparently, locals() is read-only (unless it is the same as globals()). I guess we could stick with globals magic. It is convenient and relatively explicit, but it still feels icky. We could still write to locals, but I think it would require eval, which is voodoo magic that I would rather avoid.
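The limitation can be seen with a short check. This sketch assumes CPython; the function and variable names are made up for the demonstration.

```python
def try_to_inject():
    # Inside a function, writing to the dict returned by locals()
    # does not create a real local variable.
    locals()["injected_name"] = "bar"
    try:
        return injected_name  # compiled as a global lookup, which fails
    except NameError:
        return "NameError"


print(try_to_inject())  # NameError
```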

mrocklin commented 10 years ago

This works for me

In [1]: locals()['foo'] = 'bar'

In [2]: foo
Out[2]: 'bar'

Note that I'm still not sold on this idea. @eigenhombre would probably say that we should just stick to making functions and let people use the standard features of Python to apply them on their own. I usually agree with him on this point.

eriknw commented 10 years ago

This works for me

That is because locals() is the same as globals() in that context:

In [1]: globals() is locals()
Out[1]: True

eriknw commented 10 years ago

@eigenhombre, I'm curious why you don't like toolz.curried. For whatever reasons, it didn't seem strange at all to me when I first saw it, and it actually struck me as very convenient and pragmatic. As I said, I'm just curious what your thoughts are.

eigenhombre commented 10 years ago

It may be that it's an idiom which I haven't needed much yet (explicit partial has worked fine). There was also some history to how it evolved into its current form, which I can't remember at the moment. At the point at which I looked at it more closely, I found it more magical and confusing than valuable. I am sure I could be convinced otherwise now.

[I'm off to Clojure/conj today so will not appear much on Pytoolz for a while.]

eriknw commented 10 years ago

Thanks for replying. Now I'm interested in the history of curried!

My sense is that we don't have enough need currently to justify a convenient (albeit slightly magical) generalized way to apply HOFs.

Just to see where it goes, I'm going to start a branch that uses import magic like curried, but more generalized. It may be a little magical, but it's not dirty, because Python has facilities to allow exactly this sort of thing. Heh, I'm willing to entertain any crazy ideas for HOFs to include, so feel free to share.

eriknw commented 10 years ago

My toy package that utilizes higher-order functions at the module level can now behave like this:

In [1]: from hof.hofs.double.triple import *

In [2]: one()
Out[2]: 6

In [3]: two()
Out[3]: 12

In [4]: three()
Out[4]: 18

In [5]: from hof.hofs.inc.inc.inc import one as one_plus_three

In [6]: one_plus_three()
Out[6]: 4

In [7]: import hof

In [8]: hof.hofs.double.double.double.double.one()
Out[8]: 16

What's more, this behavior is set up from one function call: create_higher_order_module(source_module, module_name, fofs, hofs). The inputs are:

It would be easy to adapt this to use in toolz; for example, curried and traced are potential higher-order functions.
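One way that importable, auto-generated modules like these could be built is with a custom finder on `sys.meta_path`. The following is a sketch under stated assumptions, not the code behind `create_higher_order_module`: the class name `HOFModuleFinder`, the package name `demo_hofs`, the sample functions, and the rule that the leftmost HOF is applied outermost are all hypothetical. It covers only `import` statements, not plain attribute access on an already imported package.

```python
import importlib
import importlib.abc
import importlib.util
import sys


class HOFModuleFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Hypothetical finder/loader that builds 'root.hof1.hof2...' modules on demand."""

    def __init__(self, root, funcs, hofs):
        self.root = root    # top-level package name (assumed to contain no dots)
        self.funcs = funcs  # name -> first-order function
        self.hofs = hofs    # name -> higher-order function

    def find_spec(self, fullname, path=None, target=None):
        parts = fullname.split(".")
        if parts[0] != self.root or any(p not in self.hofs for p in parts[1:]):
            return None  # not ours; let other finders try
        # is_package=True gives each generated module a __path__,
        # so deeper imports such as root.inc.inc.inc keep resolving.
        return importlib.util.spec_from_loader(fullname, self, is_package=True)

    def create_module(self, spec):
        return None  # use the default module object

    def exec_module(self, module):
        hof_names = module.__name__.split(".")[1:]
        for name, func in self.funcs.items():
            # Apply the rightmost HOF first, so root.a.b exposes a(b(func)).
            for hof_name in reversed(hof_names):
                func = self.hofs[hof_name](func)
            setattr(module, name, func)
        module.__all__ = list(self.funcs)


def one():
    return 1

def three():
    return 3

def double(f):
    return lambda: 2 * f()

def triple(f):
    return lambda: 3 * f()

def inc(f):
    return lambda: f() + 1


sys.meta_path.append(HOFModuleFinder(
    "demo_hofs",
    {"one": one, "three": three},
    {"double": double, "triple": triple, "inc": inc},
))

from demo_hofs.double.triple import three as six_times_three
from demo_hofs.inc.inc.inc import one as one_plus_three

print(six_times_three())  # 18
print(one_plus_three())   # 4
```

Generating modules lazily in a finder avoids pre-registering every combination in `sys.modules`, at the cost of adding an entry to the process-wide import machinery.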

So, if I wanted to continue developing this toy package into something more generally useful, which avenue would you recommend:

  1. Create a new package (which can include example usage with toolz)
  2. Add this to the sandbox

I'm fine with either, and I expect you guys to favor 1. Do you have any ideas for what to name this package?

mrocklin commented 10 years ago

Wow, I didn't think that would be possible. Is this code up somewhere?

eriknw commented 10 years ago

Not yet, but soon. It should be presentable Monday evening or Tuesday. There are exactly zero docstrings--ha!--and I want to add an option.

eriknw commented 10 years ago

What is a good name for a keyword argument that chooses between the following forms: h(g(f))(x) and h(g(f(x))), where f is the first order function and x is the input to that function?

The former is the proper application of higher-order functions (g and h accept a function and return a function), and the latter is convenient for chaining functions together such that the faux-higher-order functions g and h accept the output of the previous function (beginning with the output of f(x)). toolz would probably only use the former form, h(g(f))(x). The latter form can be used for sugar, which I think the user may choose to use for their own projects, but I am skeptical that such sugar is appropriate for toolz, which already has compose, pipe, and thread*.
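The difference between the two forms can be shown side by side. This is an illustrative sketch: the helper names `wrap_hofs` and `chain_outputs` and the sample functions are made up here and are not part of toolz or the toy package.

```python
from functools import reduce


def wrap_hofs(f, *hofs):
    """For hofs=(g, h), return h(g(f)): each HOF takes and returns a function."""
    return reduce(lambda func, hof: hof(func), hofs, f)


def chain_outputs(f, *funcs):
    """For funcs=(g, h), return x -> h(g(f(x))): each takes the previous output."""
    def chained(*args, **kwargs):
        return reduce(lambda value, func: func(value), funcs, f(*args, **kwargs))
    return chained


def f(x):
    return x + 1

# True higher-order functions: function in, function out.
def apply_twice(func):
    return lambda x: func(func(x))

def negate_result(func):
    return lambda x: -func(x)

# Plain functions used as "faux" HOFs: value in, value out.
def double_value(value):
    return 2 * value

def negate_value(value):
    return -value


print(wrap_hofs(f, apply_twice, negate_result)(1))        # -3, i.e. -(f(f(1)))
print(chain_outputs(f, double_value, negate_value)(1))    # -4, i.e. -(2 * f(1))
```

The second helper is ordinary function composition applied after `f`, which is why it overlaps with what compose and pipe already provide.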

eriknw commented 10 years ago

Testable module is here:

https://github.com/eriknw/test_hof and primary module of interest: https://github.com/eriknw/test_hof/blob/master/test_hof/_homaker.py

Hopefully it makes sense. Regardless, I'm willing to answer any questions you may have.

eriknw commented 10 years ago

I've been trying to come up with a name if "_homaker.py" were turned into a package (and improved). I only have a few ideas so far:

What do you think? Any other ideas?

eriknw commented 10 years ago

Another name idea: metafunc.

eigenhombre commented 10 years ago

+1 for metafunc

On Nov 21, 2013, at 10:36 AM, Erik Welch notifications@github.com wrote:

Another name idea: metafunc.

eriknw commented 10 years ago

Cool, thanks for the feedback. I'm willing to proceed with metafunc. As a separate package?

eriknw commented 10 years ago

Alright, metafunc is now on github:

https://github.com/eriknw/metafunc

It still needs documentation (and the documentation it has needs to be redone), but the core functionality is there and it is tested with 100% coverage.

eriknw commented 10 years ago

Oh, and I want to share that I am enjoying conttest very much:

https://github.com/eriknw/metafunc/blob/master/ctest

eigenhombre commented 10 years ago

:-)

On Nov 24, 2013, at 9:35 AM, Erik Welch notifications@github.com wrote:

Oh, and I want to share that I am enjoying conttest very much:

https://github.com/eriknw/metafunc/blob/master/ctest

mrocklin commented 10 years ago

What's the status on this?

eriknw commented 10 years ago

For toolz, I think toolz.curried is necessary and (at the moment) sufficient.

If we want a system to apply other HOFs through namespaces (via metafunc, which is functional but still needs documentation and to be uploaded to PyPI), I think a new package, tooled(?), would be appropriate. This way, it can come pre-packaged with additional dependencies that add valuable HOFs.

Although it would occasionally be convenient to have a tupled or listed namespace while playing with toolz in a REPL, I don't think we need this functionality. It could be nifty for new users though.