raysect / source

The main source repository for the Raysect project.
http://www.raysect.org
BSD 3-Clause "New" or "Revised" License
88 stars 23 forks source link

Can Function2D etc. be made to support numpy broadcasting? #272

Closed jacklovell closed 5 years ago

jacklovell commented 5 years ago

I often find I want to evaluate the result of a Function2D on a grid of points, which I currently have to do using nested for loops. This is turning out to be a performance bottleneck in various places in my code, and also leads to code bloat.

Would it be possible to enable Function2D.__call__ etc. to accept arrays and use numpy broadcasting rules over those arrays? This would involve looping over the arguments and calling evaluate() for each pair of scalars in the arguments. I think numpy.broadcast would make it reasonably easy to do this.

Ultimately, I'd like to be able to do something like:

func = <some subclass of Function2D>
x = np.arange(10)
y = np.arange(10) + 5
xx, yy = np.meshgrid(x, y)
res = fun(xx, yy)

Rather than:

res = np.asarray([[fun(xi, yi) for yi in y] for xi in x])

This would also be very useful in Function1D and Function3D, with the latter avoiding triple nested loops in Python.

I'm not sure what sort of overhead would be incurred by this, and whether it would cause performance regressions in existing cases where Function2D.__call__ is called with scalars inside a loop (but where vectorising that loop is for whatever reason impractical). I suppose a workaround for that would by something like:

def __call__(x, y):
    try:
        self.evaluate(x, y)
    except TypeError:  # Is this the right exception?
        # Broadcast and loop

The thinking here is that try has essentially no overhead if it succeeds (i.e. x and y are scalars), and the overhead incurred by an exception being raised is offset by the fact that this only happens once in the vectorised call, so shouldn't be inside any tight loops.

I'm happy to have a go at contributing to this, but wanted to check whether it's a worthwhile change before investing time in it.

CnlPepper commented 5 years ago

Have you tried using the sampleXD functions? These sample functions on regular grids.

(edit: The user is using raysect via cherab)

jacklovell commented 5 years ago

You mean the ones in Cherab core? That's closer to what I want, and it does make more sense to call a "sample" function in this instance, but the interface isn't quite right. From what I can tell, the sample2d function for example takes as inputs a pair of (start, stop, nsamples) tuples, and always returns a 2D array. This leads to a couple of problems:

Perhaps it makes more sense to raise this in Cherab, and ask for changes to (or additional versions of) the sampling functions there?

CnlPepper commented 5 years ago

Ok just wanted to be sure the sample functions didn't meet your needs. There is a lot more complexity to the FunctionXD interface than may be apparent. The proposed change could introduce inconsistent behaviour when python functions are used in place of the function objects. It may be better to add a greater range of sampling functions instead.

This change will require some careful consideration.

jacklovell commented 5 years ago

That's fair enough. The more I think about it, the more it makes sense to me to put this sort of functionality in sampling functions instead. That is after all what I actually want to do: sample at an array of points.

Mind you, if the changes would enable Python functions to be called with vector arguments, that'd also provide a nice speed boost (more so than FunctionXD objects?).

jacklovell commented 5 years ago

Having discussed this with Matt, it does indeed seem that the best approach is to provide additional sampleXd functions in Cherab for these use cases. I'll open another issue in cherab-core about this: feel free to close this one as Wontfix.

mattngc commented 5 years ago

Agreed with all parties that this functionality might be better implemented as utility functions in the cherab-core package.