Allow SolutionArray to append more than one state at a time

bryanwweber commented 4 years ago

Abstract

In the course of working on Cantera/cantera#838, I found a few edge cases where the SolutionArray interface might be able to be improved.

Problem Descriptions

When Cantera/cantera#838 is merged, it will be possible to assign to slices of extra items. However, it is not possible to assign to slices of properties of the underlying Solution. This is inconsistent, and should either be possible or raise a useful error message. Right now, it just fails silently:
```
In [1]: import cantera as ct
In [2]: gas = ct.Solution('air.yaml')
In [3]: states = ct.SolutionArray(gas, 3)
In [4]: states.T
Out[4]: array([300., 300., 300.])
In [5]: states.T[0] = 100
In [6]: states.T
Out[6]: array([300., 300., 300.])
```
Moved to #58.

When appending a state, an item of length greater than 1 can be passed for any extra items. This raises a DeprecationWarning from NumPy, and gives a... less than useful result:

In [8]: states = ct.SolutionArray(gas, 3, extra=("prop"))
In [9]: states.append(TPX=(1100, 3e5, "AR:1.0"), prop=[1, 2, 3])
/Users/bryan/GitHub/cantera/build/python/cantera/composite.py:671: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
 self._extra[name] = np.array(v)
In [10]: states.prop
Out[10]:
array([101325.00000000003, 101325.00000000003, 101325.00000000003,
      list([1, 2, 3])], dtype=object)

Yes, that's a list as an item in a NumPy array. Note the first three elements (101325.000...) are garbage data that was available at the memory address that np.empty() used to create the array for prop. Moved to Cantera/cantera#895.

It would useful to be able to append more than one state at a time, but this isn't possible:

In [11]: states = ct.SolutionArray(gas, 3)
In [12]: import numpy as np
In [13]: states.append(TPX=(np.linspace(100, 150, 3), 3e5, "AR:1.0"))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-13-7d5241c26bfa> in <module>
----> 1 states.append(TPX=(np.linspace(100, 150, 3), 3e5, "AR:1.0"))

~/GitHub/cantera/build/python/cantera/composite.py in append(self, state, **kwargs)
   650                     "'{}' does not specify a full thermodynamic state".format(attr)
   651                 )
--> 652             setattr(self._phase, attr, value)
   653
   654         else:

~/GitHub/cantera/interfaces/cython/cantera/thermo.pyx in cantera._cantera.ThermoPhase.TPX.__set__()
  1180             P = values[1] if values[1] is not None else self.P
  1181             self.X = values[2]
-> 1182             self.thermo.setState_TP(T, P)
  1183
  1184     property TPY:

TypeError: only size-1 arrays can be converted to Python scalars

References

Cantera/cantera#838

speth commented 4 years ago

it is not possible to assign to slices of properties of the underlying Solution. This is inconsistent, and should either be possible or raise a useful error message. Right now, it just fails silently:

I was going to write that there was no way to make the return value immutable, but it turns out that you actually can (more or less) - https://stackoverflow.com/questions/5541324/immutable-numpy-array. It would probably make sense to set this flag on arrays returned by Solution as well as SolutionArray, e.g. gas.molecular_weights and gas.net_production_rates.

I don't think there are any cases where making properties mutable in this direction makes sense -- we don't normally allow single properties like T to be set, and the property pairs return tuples which can't be set this way, but can be set using states[i].TP = ....

When appending a state, an item of length greater than 1 can be passed for any extra items.

This seems like an ordinary bug, so maybe it would be better to track that as an Issue on the main repository.

It would useful to be able to append more than one state at a time, but this isn't possible:

By analogy with the methods on the list type, I would suggest that the append method is used for adding one element, and extend is used for adding multiple elements.

bryanwweber commented 4 years ago

it turns out that you actually can (more or less)... I don't think there are any cases where making properties mutable in this direction makes sense

I agree. I'll make a new enhancements issue for this since it is broader than just SolutionArrays, see #58.

This seems like an ordinary bug, so maybe it would be better to track that as an Issue on the main repository.

I wasn't sure, since someone could hypothetically want a sequence of some sort embedded in the array. But since it raises a DeprecationWarning it probably shouldn't be allowed, so I'll file an issue on Cantera/cantera as a bug, see Cantera/cantera#895

By analogy with the methods on the list type, I would suggest that the append method is used for adding one element, and extend is used for adding multiple elements.

On the other hand, np.append() allows appending one or multiple rows to the array. Since this is called SolutionArray, I wonder whether we should follow the convention of NumPy? I don't have a strong opinion.

I've focused this enhancement issue on just this last point.

speth commented 4 years ago

On the other hand, np.append() allows appending one or multiple rows to the array. Since this is called SolutionArray, I wonder whether we should follow the convention of NumPy? I don't have a strong opinion.

The syntax for np.append requires the appended items to always be list-like, though (or I guess more precisely, have all but one dimension equal to the initial array). For example, to append a single item, you have to write np.append([1,2,3], [4]) rather than np.append([1,2,3], 4). Since SolutionArray.append already supports adding a scalar, I think it makes sense for a method that appends an array to have a different name. In part, that would hopefully limit the amount of guessing that has to be done based on the shape of the input data.

bryanwweber commented 4 years ago

That makes sense, especially the part about reducing the amount of guessing about shape that has to be done :+1:

Cantera / enhancements

Allow SolutionArray to append more than one state at a time #57