jakelishman / qutip-dispatch

Multi-dispatch prototype for QuTiP data structures
BSD 3-Clause "New" or "Revised" License

Few questions #1

Open Ericgig opened 4 years ago

Ericgig commented 4 years ago

Hi @jakelishman, not sure where to discuss this, so here I am.

Can methods be dispatched? Could we have obj1 + obj2 with proper dispatch at the data-layer level, or only at the Qobj level?

Will there be dispatch for functions/methods that only use one data-layer object? Take dag: if it is defined as a method, layer.dag() always calls the right function, so no dispatch is needed. Thus only functions which take multiple Qobj as input will need dispatch. I guess functions that take an arbitrary number of Qobj as input (tensor) will not be dispatched themselves, but will call dispatched functions internally.

Have you checked whether the registered functions can be pickled, and whether that could cause a problem? I expect it to be fine, but better to catch it early.

Since this will be decorating Python functions, does it need to be in Cython? Is it for speed? All Qobj manipulation by solvers and users will be in Python at the Qobj level, so the dispatcher will have to choose which Cython function to call, but from Python. Using a Cython object seems like overkill.

500 ns is good. There are only a few places where this may be felt: mul_vec and expect, as they can be called millions of times in a solver evolution, but they probably don't even need dispatch.

We need a name for the data layer object soon.

jakelishman commented 4 years ago

Thanks for the questions! This is really useful having someone look at my code all the time!

Method dispatch

The output of Dispatcher currently won't bind to self correctly if you assign it as an instance method, because it's already an instance method of a different object, but an easy shim is just

@data.dispatch(inputs=('a', 'b'))
def multiply(a, b):
    ...

@multiply.register(...)
...

class Data:
    ...

    def __mul__(self, other):
        try:
            # The dispatcher resolves the right specialisation for the pair.
            return multiply(self, other)
        except TypeError:
            return NotImplemented

I can look into making this work properly, but it's not entirely a part of the design. The "light" version of the data structures wouldn't have __add__, __mul__ methods or these things defined, because they couldn't sensibly handle the multiple dispatch necessary to make everything "just work", and Cython doesn't support the necessary casting to make this type-safe. Internally, I believe Cython extension types don't carry around bound instance methods anyway, since you can't override them - it just converts calls into regular function ones.

These dispatch methods aren't meant to be called directly by the user, typically - most of the time, they'll never even see a data layer object, and it's not a problem if CSR + Dense doesn't work with infix notation. Instead, Qobj will implement all the Python niceties (__add__, __mul__, etc), and internally convert those into calls such as data.multiply(a, b). This side-steps the need for method binding, and it allows us to register known specialisations dynamically at import time (and at run time!) while each individual data class is only concerned with itself. I think when we get into adding large tracts of mathematical code for a second data type and having them inter-operate, that will really keep code complexity and duplication down.
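As a very rough sketch of that split (the class layout and the dims handling here are simplified placeholders, not the real design):

class Qobj:
    def __init__(self, data, dims):
        self.data = data    # any data-layer type: CSR, Dense, ...
        self.dims = dims

    def __mul__(self, other):
        if not isinstance(other, Qobj):
            return NotImplemented
        # All the Python niceties live up here; the dispatcher picks the
        # right specialisation for the pair of data-layer types underneath.
        return Qobj(data.multiply(self.data, other.data), dims=self.dims)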

Unary functions and tensor

This again comes down to a light/heavy split, and depends on whether .dag() is part of the core specification. In our case, for efficiency reasons, I'm sure .dag() is core enough that we'll just make every class implement it, so it's not a problem. A larger concern is unary functions like ptrace, which we won't require every new data type to implement. I would suggest that these still pass through the dispatcher, since the added benefit is the automatic casting rules that apply when no particular specialisation is found.
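To make that concrete, here is a hedged sketch reusing the decorator pattern from the multiply example above (the register signature and the conversion behaviour are my assumptions for illustration, not something pinned down in this repo yet):

@data.dispatch(inputs=('matrix',))
def ptrace(matrix, sel):
    ...

@ptrace.register(CSR)
def ptrace_csr(matrix, sel):
    # The only hand-written specialisation.
    ...

# A data type with no specialisation of its own (say a hypothetical Dense
# input) would be converted by the dispatcher's casting rules to one that
# does have one, instead of failing outright.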

tensor itself will not be a dispatch function, but its internal data call kron will be, and it is currently a binary function (so a prime candidate for dispatch). tensor is an operation that only makes sense on quantum objects which have a dims parameter. Data objects will only have a shape, because they're not concerned with what their data represents, only with core mathematical operations on it.
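In other words, something along these lines (a sketch only, assuming the Qobj wrapper from the earlier example and the usual dims convention; data.kron is the dispatched binary function):

from functools import reduce

def tensor(*qobjs):
    # tensor lives at the Qobj level because it needs dims; the data-layer
    # work is just a chain of calls to the dispatched binary kron.
    out_data = reduce(data.kron, (q.data for q in qobjs))
    out_dims = [sum((q.dims[0] for q in qobjs), []),
                sum((q.dims[1] for q in qobjs), [])]
    return Qobj(out_data, dims=out_dims)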

Pickling

Dispatcher won't have a pickle method attached automatically by Cython because it's a non-trivial extension type (and neither will _Binder), but hopefully I can simply define their __reduce__ methods and it'll work. I'll do that soon. All the underlying data types can be pickled, so it should be fine.
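For reference, the shape of what I mean is below; the attribute names and the rebuild helper are placeholders, since exactly what state Dispatcher needs to carry is still to be worked out.

def _rebuild_dispatcher(name, inputs, specialisations):
    # Module-level helper so pickle can find it by its qualified name.
    dispatcher = Dispatcher(name, inputs)
    for types, function in specialisations:
        dispatcher.register(types)(function)
    return dispatcher

class Dispatcher:            # really the Cython extension type
    ...

    def __reduce__(self):
        # Tell pickle how to rebuild the object from plain, picklable pieces.
        return (_rebuild_dispatcher,
                (self.name, self.inputs, list(self.specialisations.items())))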

Cython vs Python

The decorator function itself is written in pure Python, it's only defined in the Cython module because it belongs with Dispatcher just for logical consistency. I don't know of any problem that will come from that - it can still be imported as a Python object (in my example here, __init__.py just imports it into the data namespace), it's just one that gets compiled into a Cythonised extension module. If anything, adding a separate .py file to put it in just complicates proceedings.

Dispatch time

I think this is something we need to profile on before we start optimising, but my general intention was that the solvers and other parts of the code that will make thousands of data-layer calls without user intervention would first homogenise their inputs, use the dispatch method get to retrieve the function with signature f('T, 'T) -> 'T (where 'T is a single type), and then use that internally. This still allows generic types - the homogenisation pass will convert them all to a type 'T which we don't have to hard-code, and then you can use Dispatcher.get to dynamically (and only once!) do the lookup to get a function which is closed on type 'T.
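Roughly, the hot-loop pattern would be something like this (a sketch: the homogenise helper is hypothetical, and the exact get signature is only illustrative):

def propagate(operator, state, n_steps):
    # One-off setup: convert both operands to a single data-layer type 'T ...
    operator, state = homogenise(operator, state)
    T = type(operator)
    # ... and do the dispatch lookup exactly once, outside the loop.
    multiply_T = data.multiply.get((T, T))
    for _ in range(n_steps):
        # Direct call to a concrete specialisation: no per-call dispatch cost.
        state = multiply_T(operator, state)
    return state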

The details of the homogenisation routine will come later on, since it's an optimisation and the first order of business is to get everything ready to roll, but logically do you think that will work for what you have in mind?

I think this ought to be fine at the Python level, but it's trickier to get it to work within CQobjEvo - cdef class types can't currently be templated themselves, and a data attribute can't be a Cython fused type or a C union, since those can't hold any Python type, even a cdef class. If we really commit to the "light" model of data structures, these can actually be implemented at the very base as a raw C struct or C++ class, and wrapped by Cython. Then we can store them in a union, and we get proper type safety for CQobjEvo. In this model we'd then have thin Python wrappers to allow Python itself to hold them, and to attach methods like .dag() and whatever.

Naming

Suspect @ajgpitch will be excited to get into this part! Personally I'd likely just call it (fully qualified) qutip.core.data.Data, since it's not going to be user-facing anyway, and it's certainly not going to be imported into the global namespace. I don't think anything will actually ever instantiate an instance of it or call a method from it; that will all be handled by the subclasses. I'd probably just call those qutip.core.data.CSR and qutip.core.data.Dense again. These are "advanced user"-facing, but I still don't think we should import them into the global namespace. Advanced users can get them from qutip.core.data, and regular users won't be faced with thousands of symbols.

I'm very happy to open this one to a wider audience. Personally I'm a big fan of simple names inside qualified namespaces because I think it's much easier to read the context from that, but that's just one person's opinion.

jakelishman commented 4 years ago

I think opening issues like these is probably pretty helpful while we're working, since it all stays localised to the relevant bit of code. Perhaps we'll have to ensure that we put it somewhere more QuTiP-owned as well, so it stays for the future?

jakelishman commented 4 years ago

Perhaps at the end of this project, we may be able to transfer ownership of any temporary repositories to the QuTiP organisation and rename them to things like qutip-5.0-design-dispatch? If possible, they'll keep all issues attached to them, and they can just go out to pasture in the qutip pages.

Ericgig commented 4 years ago

For the solvers, it is mostly f('T, double time, np.ndarray state) -> np.ndarray.

I don't think it would be hard to make it work in CQobjEvo even if we accept mixing data-layer types within it. At this level very few functions are actually needed; all data-layer modification and creation is done once before the evolution, so that can be in Python. A few functions can be methods of the data layer: expect, mul_vec, mul_mat. So the dispatch needed in Cython would be addition and maybe product, if we make our own solvers which accept our data type as states. If we accept that only two functions need dispatch at the Cython level, we could just hard-code a dispatch or a dispatch_get.

jakelishman commented 4 years ago

The problem is like this: say CQobjEvo is one class, and it needs to store a Data object. If we say that CQobjEvo always converts its inputs to one particular type (say CSR), then there's no problem. However, if we want it to be able to use any data input, then we have to do something like

cdef class CQobjEvo:
    cdef Data constant
    cdef Data *time_dependent

Now the dispatcher function itself has the signature cdef Data expect(Data left, Data op, Data right) (or whatever). This is fine within CQobjEvo, but I can't find a way to actually express the dispatch operation within Cython.

I'd have to do something like (ignoring all the generic dispatch machinery in this repo)

cdef Data expect(Data left, Data op, Data right):
    cdef ??? fptr = get_specialised_method(left, op, right)
    return <Data> fptr(<???> left, <???> op, <???> right)

where all the ??? are types that aren't known until run-time. In C++ land you have dynamic_cast<T> and typeid which work on run-time type information to handle the down-cast. We still have the problem of typing the function pointer to have the correct number of arguments, but since we're already in C where there are no keyword arguments, we can duplicate a small part of the dispatch code to allow dispatching functions with one, two or three arguments.

In order to be generic, the types of the input parameters have to go from being (pointers to) Data at the entry point, then internally they need to be dynamically down-cast to (pointers to) CSR or whatever. This is absolutely possible in C++, especially with its templates, but Cython doesn't have it.

Ericgig commented 4 years ago

Since we are still using scipy's differential equation functions, the expect and mul_vec that are needed have the signatures cdef complex expect(Data op, complex[::1] state) and cdef np.ndarray mul_vec(Data op, complex[::1] state). In the stochastic code I wrote in Cython, cdef void mul_vec(Data op, complex[::1] state, complex[::1] out) is used. They can be methods of Data and no dispatch is needed.

For a more general version with Data as state, all we need is cdef Data multiply(Data left, Data right). A way to do it would be:

cdef enum DataType:
    DENSE, CSR

cdef class Data:
    cdef DataType type
    ...

and

cdef Data multiply(Data left, Data right):
    if left.type == DENSE and right.type == DENSE:
        return _multiply_dd(left, right)
    ...

You don't need to use a function pointer, and this could be quite fast if the compiler can understand the pattern. Yes, we need to manually add each combination, but you would probably need to do that in get_specialised_method anyway. It is only needed for addition and multiplication; expect can be built from multiply, so we have an easy fallback.
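For example, in Python pseudo-code, assuming a ket state and the core .dag() method:

def expect_fallback(op, state):
    # <state| op |state> as two dispatched multiplications; the result is a
    # 1x1 data object whose single element is the expectation value.
    return multiply(state.dag(), multiply(op, state))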

We could have CQobjEvo work with only one type of Data, but if the state can still be an arbitrary Data type, some dispatch will be needed.

jakelishman commented 4 years ago

This if ... elif ... elif ... elif ... elif ... with hard-coded function inputs is what I was trying to get away from with the genericised dispatch functions - you can see in pure Python I can completely avoid it, and in pure C++ I'm fairly sure you can too (see https://github.com/jll63/yomm2 for an example). We can do it if there's no other way, but I think we'll be stymieing maintenance work on the library, because there's going to be a huge amount of copy-paste code for many functions, which is going to be a nightmare if the logic ever needs tweaking, plus the more specialisations there are, the slower the lookup is going to be. It'll be fast if we can guarantee that there are only ever going to be two valid data types, but that's also the principle we're trying to get away from.

Since this one is dispatched over 3 data types (left, right and the output), it scales as n^3 branches for n available data types - even for only dense and CSR matrices that leaves the possibility of 8 branches (which need to be copied in for every single dispatched function we write). Adding another data type ups it to 27, and so on.

Perhaps we could talk a bit about where the speedups from using CQobjEvo actually appear? I wonder if we might be able to maintain the same speeds while re-organising CQobjEvo a little bit. I'm very keen to design a system that will allow QobjEvo and other derived types to access the full power of genericised methods without having to hard-code everything.

Are the primary speed-ups coming from Cythonised loop and/or call semantics? If it's because of reduced allocations internally in various operations, then we can instead ensure that we just define in-place operations in the data layer (just like numpy). That way QobjEvo (and every other part of the library too, not just the solvers) can handle it itself without the need for everything to become a C type with C calling conventions inside CQobjEvo. We'd still homogenise the stored data types within QobjEvo and have it cache the Python callables it needs from the dispatchers once at initialisation, so we wouldn't pay the dispatch-time penalty on each call.
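By in-place operations I mean something like the sketch below, shown with plain numpy arrays standing in for a Dense data object (the matmul_dense name and the out= convention are just illustrative):

import numpy as np

def matmul_dense(op, state, out):
    # Write op @ state into the caller's pre-allocated buffer instead of
    # allocating a fresh array on every call.
    np.matmul(op, state, out=out)
    return out

QobjEvo could then allocate one scratch buffer at initialisation and reuse it for every call in the evolution.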

jakelishman commented 4 years ago

Also, where is scipy calling cdef functions? I had a look, but I couldn't spot it.

ajgpitch commented 4 years ago

> Naming
>
> Suspect @ajgpitch will be excited to get into this part! Personally I'd likely just call it (fully qualified) qutip.core.data.Data, since it's not going to be user-facing anyway, and it's certainly not going to be imported into the global namespace. I don't think anything will actually ever instantiate an instance of it or call a method from it; that will all be handled by the subclasses. I'd probably just call those qutip.core.data.CSR and qutip.core.data.Dense again. These are "advanced user"-facing, but I still don't think we should import them into the global namespace. Advanced users can get them from qutip.core.data, and regular users won't be faced with thousands of symbols.

I like your names, which is very reassuring.

ajgpitch commented 4 years ago

Maybe Csr is better. Definitely if it were CsrMatrix (for instance). Not so sure when the name is just a TLA without any other words.

jakelishman commented 4 years ago

If we feel like not making it ourselves, and blaming someone else for the naming decision, PEP8 says "acronyms should be all caps" (https://www.python.org/dev/peps/pep-0008/#descriptive-naming-styles).

ajgpitch commented 4 years ago

Good enough for me.