Open nwlambert opened 7 months ago
@nwlambert Can you say a bit more about the situation where this arose?
In the example you give, the eigenstates really are dense (although, small, of course), and I think that generically that is true for many states.
Perhaps we need a default_state_dtype
and default_oper_dtype
?
Yeah, on the practical side I was using those states to construct collapse operators, and then using these in steadystate() for a large range of parameters. Bit surprised to see that what took 20 minutes in v4.7 took >5 hours in v5 :O I guess this is a combination of Liouvillian construction taking much longer, as it then converts the Hamiltonian to Dense as well, and the steady-state solve taking a bit longer too.
Of course I can manually make those operators sparse (and then v5 takes just 7 minutes!), but more generally I think the logic of default_dtype is nice; it essentially allows you to run QuTiP in that 'mode'. It's also useful for cases like making JAX objects, etc. I have another case where I actually want all operators to be dense, because I use .expm() a lot.
I think most functions in operators.py and states.py follow this logic, e.g., basis()
in states.py: dtype = dtype or settings.core["default_dtype"] or _data.Dense
I think it makes sense to apply it universally, irrespective of whether a particular object is naturally of one type or another: it avoids the cost of conversion when doing lots of repetitive operations, and avoids accidentally getting or creating a particular type when you expect something different from default_dtype.
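As a pure-Python sketch of that fallback chain (here `settings_core` and `resolve_dtype` are stand-in names for illustration, not qutip's real objects):

```python
# Minimal sketch of qutip's dtype-fallback pattern, with stand-in names.
# The chain is: explicit argument, then the global default
# (what CoreOptions(default_dtype=...) sets), then a per-function fallback.

settings_core = {"default_dtype": None}  # stand-in for settings.core

def resolve_dtype(dtype=None, fallback="Dense"):
    """Mimic: dtype = dtype or settings.core["default_dtype"] or _data.Dense"""
    return dtype or settings_core["default_dtype"] or fallback

# No global default set: each creation function's own fallback wins.
assert resolve_dtype() == "Dense"
assert resolve_dtype(fallback="CSR") == "CSR"

# With a global default set, every creation function follows it,
# regardless of its own per-function fallback.
settings_core["default_dtype"] = "CSR"
assert resolve_dtype() == "CSR"
assert resolve_dtype(fallback="Dia") == "CSR"

# An explicit dtype argument always takes precedence.
assert resolve_dtype("Dense") == "Dense"
```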
edit: Just to add, your suggestion of having two different default_dtypes could be nice too. I can imagine situations where you want to define both separately.
The current stats from master show that the fallback is quite mixed:
12 dtype = dtype or settings.core["default_dtype"] or _data.CSR
27 dtype = dtype or settings.core["default_dtype"] or _data.Dense
12 dtype = dtype or settings.core["default_dtype"] or _data.Dia
Perhaps we could create a small object describing the behaviour. Something like:
class DefaultDataType:
    state: ...
    oper: ...

And then one could either do with CoreOptions(default_dtype="CSR"): (which sets everything to CSR) or with CoreOptions(default_dtype=DefaultDataType(state="Dense", oper="CSR")): (which chooses based on the type of the output).
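Purely as an illustration, a runnable sketch of this idea (neither `DefaultDataType` nor the `resolve` helper exists in qutip; the names and resolution logic are hypothetical):

```python
# Hypothetical sketch of the proposed DefaultDataType container and how a
# creation function could resolve its dtype from it. Not qutip API.
from dataclasses import dataclass
from typing import Optional

@dataclass
class DefaultDataType:
    state: Optional[str] = None   # e.g. "Dense"
    oper: Optional[str] = None    # e.g. "CSR"

def resolve(default, kind, fallback):
    """Resolve a default that is either a single dtype name (applies to
    everything) or a DefaultDataType choosing by output kind
    ('state' or 'oper'), falling back to the function's own default."""
    if isinstance(default, DefaultDataType):
        return getattr(default, kind) or fallback
    return default or fallback

# Single dtype name: everything becomes CSR.
assert resolve("CSR", "state", "Dense") == "CSR"

# Per-kind defaults: states Dense, operators CSR.
d = DefaultDataType(state="Dense", oper="CSR")
assert resolve(d, "state", "CSR") == "Dense"
assert resolve(d, "oper", "Dense") == "CSR"
```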
I don't know if state and oper are the correct set of distinctions to make. We already have some operators where CSR seems to be the sensible default and others where the sensible default seems to be Dia. On the other hand, eventually one probably wants to combine these operators into one system, so maybe picking a single operator default is the right thing to do.
There is also the question of what should happen when operators are built from states.
Perhaps in the end we can't really manage this for the user and they either have to live with a few simple tools we give them or explicitly set dtypes themselves?
Yeah, I see what you mean, I guess adding options may just make things more difficult.
On the original issue, for better or worse, in v5 I find myself using with CoreOptions(default_dtype="..."): a lot. Is it sensible to make sure some more functions here and there, like eigenstates(), also check default_dtype, so we can rely on CoreOptions() being applied a bit more consistently? Though I guess that raises the question of how far to go, since states returned by solvers and such also don't check CoreOptions().
I am on the side of having two default_dtype options for the Qobj creation functions.

I have some questions as to how/where to make default_dtype more consistent.
Applying the default at Qobj creation feels risky to me. In the solvers it would end up converting the states before computing expectation values. It could create strange interactions with operators and unitary transformations (Qobj[Dense].dag() -> Qobj[CSR]). Qobj(scipy_csr) could be converted to something else...

However, even if default_dtype can be seen as running in that mode, it could still cause confusion. It's not clear how it should be understood in some places. In eigenstates, if we run in CSR mode, does that mean we use the sparse eigensolver? It's a lot worse than the dense one. Or should only the returned kets be in CSR format?

In my tries, the dense steadystate was faster than the sparse one. Could it be an issue that some matrices were too big to fit in RAM, forcing the use of swap space? We could have a warning when matrices over a certain size are allocated.
ps. Should eigenstates return the states in one operator instead of a list of kets? I guess the states were used to create the operators fed to steadystate, so an operator output would be more practical.
Just adding some minor comments here, not really related to the core discussion
We could have a warning when matrices over a certain size are allocated.
I don't think it is good unless we can read the available memory and derive the warning threshold from that. On the cluster, we sometimes have up to hundreds of GB of memory; the threshold should be different from computer to computer.
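A sketch of what such a memory-aware threshold could look like (`dense_warning_threshold` is a hypothetical helper, not anything in qutip; it reads POSIX sysconf values and falls back to a fixed guess where those are unavailable):

```python
# Sketch: derive a warning threshold from physical memory instead of a
# fixed size, so it adapts from a laptop to a large cluster node.
import os

def dense_warning_threshold(fraction=0.25, fallback_bytes=4 * 2**30):
    """Return a byte threshold above which allocating a dense matrix
    could trigger a warning (a fraction of physical memory)."""
    try:
        page = os.sysconf("SC_PAGE_SIZE")     # POSIX only
        pages = os.sysconf("SC_PHYS_PAGES")   # POSIX only
        return int(page * pages * fraction)
    except (ValueError, OSError, AttributeError):
        # Non-POSIX platform or missing sysconf name: fixed guess.
        return fallback_bytes

threshold = dense_warning_threshold()
assert threshold > 0
```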
Should eigenstates return the states in one operator instead of a list of kets?
I do often want eigenstates to return one operator. Many times I have had to convert the kets to a NumPy array and recreate the unitary operator from them. Maybe we can add an additional argument to the function.
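That "kets to NumPy array and back to an operator" round trip can be sketched with plain NumPy (no qutip involved): stack the eigenvectors as columns of a single matrix, which for a Hermitian input is unitary:

```python
# Pack the eigenstates of a Hermitian matrix into one operator.
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5))
H = A + A.T                      # a Hermitian "Hamiltonian"

vals, kets = np.linalg.eigh(H)   # kets[:, i] is the i-th eigenvector

# One operator holding all eigenstates. For Hermitian H it is unitary,
# and it diagonalises H: U† H U = diag(vals).
U = kets
assert np.allclose(U.conj().T @ U, np.eye(5))
assert np.allclose(U.conj().T @ H @ U, np.diag(vals))
```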
However, even if default_dtype can be seen as running in that mode, it could still cause confusion. It's not clear how it should be understood in some places. In eigenstates, if we run in CSR mode, does that mean we use the sparse eigensolver? It's a lot worse than the dense one. Or should only the returned kets be in CSR format? In my tries, the dense steadystate was faster than the sparse one. Could it be an issue that some matrices were too big to fit in RAM, forcing the use of swap space? We could have a warning when matrices over a certain size are allocated.
I guess this was mostly because I wasn't explicitly calling steadystate with sparse=False, so it was getting converted back to CSR anyway, slowing things down. Largely I see similar performance between CSR and dense (using sparse=False), unless I use very small systems (16x16 Liouvillians), though this seemed a bit scipy/method dependent.

I guess, as you said, this also raises the question of whether things like eigenstates and steadystate() should default to methods based on the data layer of the object, instead of kwargs. My feeling is no, since eigenstates+sparse can be bad and steadystate+large system+dense can be bad, so it's worth having some default conversion cost in place. But I still like the idea of what gets returned to the user following default_dtype. Maybe we can see if this turns out to be an issue people actually hit when using data layers; could just be me!
This is just a comment - by coincidence, I also had a situation yesterday where my code was very slow because the matrices I used were accidentally dense. In my case, the reason was that I created operators like
basis(N, i) * basis(N, j).dag()
The default_dtype applies here but, if I hadn't been primed by seeing this issue, it might have taken me a long time to understand what is going on. It is somewhat surprising that qutip would, by default, create vectors / operators with only one non-zero entry as dense.
I am sure there are good performance reasons for that, but it would be good to think about how we can help users not to run into such traps. Applying the default_dtype (or several of them) more broadly is certainly good. Has it been considered to include the dtype information in the printed output of Qobjs?
I wanted to follow up on the previous message. Are there indeed important performance reasons why basis(N, n) is by default implemented densely:
dtype = dtype or settings.core["default_dtype"] or _data.Dense
as opposed to
dtype = dtype or settings.core["default_dtype"] or _data.CSR
?
I similarly have run into issues where much of my existing code using collapse operators of the form basis(N, n) * basis(N, n).dag()
takes significantly longer to run (or runs out of memory) in 5.0.
The operation oper @ ket is a lot faster for CSR @ Dense than for CSR @ CSR. Also, for a ket, a CSR matrix still needs one index entry per row, making it not that much more efficient than dense. (A well-optimised COO would be nice here.)

But dense is only a good choice when the states are used as kets, not when they are used as building blocks for operators...
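This can be illustrated with SciPy (standing in for qutip's data layer, so none of qutip's own types appear here): a CSR operator applied to a dense ket yields a plain dense result, while CSR @ CSR yields a sparse "ket" that still carries per-row index overhead:

```python
# Sparse operator times dense vs sparse ket, using SciPy as a stand-in.
import numpy as np
import scipy.sparse as sp

N = 8
op = sp.random(N, N, density=0.2, format="csr", random_state=0)

ket_dense = np.ones((N, 1))
ket_csr = sp.csr_matrix(ket_dense)

out_dense = op @ ket_dense    # result is a plain dense array
out_sparse = op @ ket_csr     # result is still a sparse matrix

assert not sp.issparse(out_dense)
assert sp.issparse(out_sparse)
# Both routes compute the same numbers.
assert np.allclose(np.asarray(out_dense), out_sparse.toarray())
```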
We have functions to create such operators, which I thought were better known, that are set to use the appropriate sparse default:

fock_dm(N, n) is equivalent to basis(N, n) * basis(N, n).dag().
projection(N, n, m) is equivalent to basis(N, n) * basis(N, m).dag().

For now I added an entry for this case in the migration guide.
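The storage difference can be illustrated outside qutip with SciPy and NumPy (the `projection` helper below is a sketch of what such functions effectively build, not qutip's implementation): constructing the projector directly as a sparse matrix stores one entry, while the dense outer product stores N*N values:

```python
# Building |n><m| directly as a sparse matrix vs as a dense outer product.
import numpy as np
import scipy.sparse as sp

def projection(N, n, m):
    """Sparse |n><m| with a single stored entry (CSR). Sketch only."""
    return sp.csr_matrix(([1.0], ([n], [m])), shape=(N, N))

N, n, m = 1000, 3, 7
P_sparse = projection(N, n, m)

# Dense route: basis(N, n) * basis(N, m).dag() as plain arrays.
ket_n = np.zeros((N, 1)); ket_n[n] = 1.0
ket_m = np.zeros((N, 1)); ket_m[m] = 1.0
P_dense = ket_n @ ket_m.conj().T   # N*N stored values for one nonzero

assert P_sparse.nnz == 1           # sparse stores just the one entry
assert np.allclose(P_sparse.toarray(), P_dense)
```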
Bug Description
When setting CoreOptions(default_dtype) I expect all quantum objects to inherit the specified dtype. States and operators defined in states.py and operators.py do so, but some functions, like eigenstates(), return other data types. This can lead to some unexpected behaviour (and very inefficient code). Seems like a good first issue for someone to go around and find the functions where this should be enforced and do so?
Alternatively, would forcing CoreOptions(default_dtype) to be used in Qobj creation itself be easier? Though I kinda feel it might cause secondary problems, like if you wanted largely to use a default dtype but then have special cases where you don't, etc.

Code to Reproduce the Bug
Code Output
Expected Behaviour
Return sparse states.
Your Environment
Additional Context
No response