kpeeters / cadabra2

A field-theory motivated approach to computer algebra.
https://cadabra.science/
GNU General Public License v3.0
215 stars 37 forks source link

Make join accept an arbitrary number of arguments #275

Closed kpeeters closed 7 months ago

kpeeters commented 1 year ago

Since join replaced + for list concatenation, it is a bit cumbersome to concatenate multiple lists. If we make join accept an arbitrary number of arguments that will improve. See also https://cadabra.science/qa/2468/errors-in-function-evaluate.

dpbutter commented 7 months ago

Here is perhaps a bit hacky solution. In py_ex.cc, one can overload Ex_join to take a vector of Ex_ptr's:

    Ex_ptr Ex_join(const std::vector<Ex_ptr>& ex) {
        if (ex.size() < 2) {
            throw std::invalid_argument("Ex_join requires at least two arguments.");
            }
        Ex_ptr joined = Ex_join(ex[0], ex[1]);
        for (size_t i = 2; i < ex.size(); ++i) {
            joined = Ex_join(joined, ex[i]);
            }
        return joined;
        }

and then modify the "join" binding to

        m.def("join", [](const Ex_ptr ex1, const Ex_ptr ex2, py::args args) {
            std::vector<Ex_ptr> ex = {ex1, ex2};
            for (const auto& arg : args) {
                ex.push_back(arg.cast<Ex_ptr>());
            }
            return Ex_join(ex);
            });

This seems to work, although it generates a runtime error on args rather than a typeerror if they are of the wrong type. I'm trying to understand pybind to see if there is a way to have it automatically handle this.

I also don't entirely understand the memory handling, so probably this is making copies of the lists currently...

kpeeters commented 7 months ago

Memory-wise this is ok, because Ex_ptr is a (shared) pointer to an Ex, so it does not copy the entire expression. This looks fine to me apart from the runtime error, but I do not know off the top of my head either whether there is a nicer method for this (i.e. to enforce that args in pybind are of a particular type).

I can merge the above if you want since for users nothing will change as far as using join is concerned if you manage to find out how to solve the above.

dpbutter commented 7 months ago

I think it's probably inefficient in that I think it makes multiple copies when joining. So if you call join(huge, tiny, tiny), then huge gets copied twice instead of just once (I think). At least in my case, this isn't a big issue since I call joins only when setting up my massive lists at the beginning.

It's fine with me if you merge it. I can then get rid of my overloaded Python version of join. I'm using the excuse of porting my old code to v2 (and sprucing it up) as a reason to learn C++ and Python. :-D

kpeeters commented 7 months ago

Yes, you are right, Ex_join (with two arguments) makes copies. It has to, because you do not want to mess with the original lists. But if you do that many times, it's better to do in one shot, doing the copies only once. I'll take care of that part.

kpeeters commented 7 months ago

I have pushed a change which takes your Python binding but uses a different Ex_join which avoids the copies. Thanks.