Closed jmeyers314 closed 5 years ago
I think this ought to eliminate a pointer dereference in a tight loop, so maybe speed things up a bit.
So I just checked this, and for interpMany
, it seems the improvement is ~3-8%. So not huge, but not nothing.
While you're at it, any chance you could try implementing Lanczos too? (cf. #751) Or if not, maybe at least think a little bit about the way the code is structured there, so the right hooks are in place to eventually include that.
any chance you could try implementing Lanczos too?
Sure, I'll take a look.
I have a design question: there are a few options for potential restructuring to avoid the dereference in the inner loop:
template <Interp>
lines in front of every function definition, and unnecessary Table2D<> symbols into the module.Thoughts?
I'm not sure if this ends up saving the function pointer dereference or whether it just moves it to the vtable, but 3 seems like the neatest in terms of the code organization.
I would just have the existing Table2DImpl (defined in Table.h) be a base class that had templated subclasses. All the subclass definitions could be fully specified in Table.cpp. Then the switch statement would move from the Table2DImpl constructor over to the Table2D constructor. So it would set the pImpl to point to the right subclass when first building the Table2D.
As you said, this means the only code changes would be in Table.cpp, which is nice. It means it retains API compatibility in the C++ layer (not that we make any guarantees of that, but still). And it does seem like a nicer way to organize the existing code rather than use the function pointer object. But I'm not sure whether that gets you much improvement in run time or not.
I'm not sure if this ends up saving the function pointer dereference or whether it just moves it to the vtable, but 3 seems like the neatest in terms of the code organization.
I think if I can arrange to only query the vtable once per call to interpMany
, instead of once per call to lookup
, it should still be faster.
Or at least potentially easier for the compiler to optimize it in the loop. Possibly even being able to inline it.
It looks like clang won't allow you to derive from a protected nested class outside of the class (gcc seems happy to do it though). (https://godbolt.org/z/8ifl5q). I think this means I may have to slightly mess with the structure of Table2D.h by denesting the TableImpl class.
Or just make it public. There's not really any important reason to keep it protected other than just a code-level communication that it's not something the user should think about.
Closed via #986
I've been playing around with bicubic interpolation for
batoid
surfaces, so thought I'd add them here too. (Might enable us to use less finely gridded atmospheric screens, for instance).In 2D, the type of cubic spline most commonly used (a Catmull-Rom spline) is a bit different than what we use in GalSim for the 1D cubic spline (a natural spline). The natural cubic spline ensures continuous first and second derivatives (requires solving an Ngrid x Ngrid band diagonal matrix equation). The Catmull-Rom spline, in contrast, only has continuous first derivatives, but is more locally defined, requiring no nPoint x nPoint matrix solutions. I think this is actually pretty important, since in 2D, the matrix would really be nGrid^2 x nGrid^2 where nGrid is the number of points on one side of the grid, and the matrix would be less bandy too, (though still Toeplitz I think(?)).
One other thought I had while tinkering with the lookup tables is to template on the interpolation enum, rather than using an interpolation-type-targeted class member function pointer as we do currently. I think this ought to eliminate a pointer dereference in a tight loop, so maybe speed things up a bit. (Though maybe not, I'll benchmark to find out). The downside is a bit more bookkeeping since there are now separate class template types for each interpolation type.