Closed quinnreynolds closed 2 years ago
As of 5735cde9bb27313dc41290959d4226856ec9c425 I've made some basic changes to get away from the file-centric design, but I'm having trouble seeing a good way forward with this in terms of a broader refactor reflecting best practice and usability. There are a number of obstacles/opportunities.
There are some possible resolutions, but none of them are a silver bullet.
In the spirit of KISS, I'm leaning toward a combination of solutions 4 and 6 above. A "mixture" is really nothing more than a list of Species objects, and things that act on a mixture should perhaps be top-level functions that return results rather than modifying attributes of a Mixture object in-place in potentially inconsistent ways.
I'm getting decision paralysis trying to figure out the best approach up front. For now I may pop out a couple of branches from issue27 to try different approaches and see how they feel in practice.
An implementation of solution 1 above is on branch issue27_test1 (8606feb7daeb4bc7dbc6235aea52c7912bc48a3c). It's actually fairly clean in terms of the code and user interface, but of course it can chew up resources unnecessarily in certain cases as indicated.
An implementation of solution 4 above is on branch issue27_test2 (973455857a7f18bda6756cd59061287ca75c4e24). It's much less usable and more error-prone than I thought it would be; turns out there are a number of internal variables used in Mixture that the various property functions depend on, which makes it hard to create anything like a consistent calling convention for all of them.
I thought this would be the tidiest solution but the worked example has proved me wrong. Branch issue27_test1 looks more promising so will continue development there for now.
It turns out there is a robust and (relatively) low-overhead way of testing if a Mixture is currently in a converged LTE state, by doing a single iteration of the minimiser solver and checking the relative tolerance. This makes solution 2 above feasible.
Implemented in eeda3de7686d154c4b0fe58826aa14a71b90ebb7.
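For anyone following along, the idea of the check can be sketched on a toy problem. This is only an illustration, assuming a Newton iteration for a square root as a stand-in for the real minimiser; newton_sqrt_step and is_converged are hypothetical names, not functions from the package.

```python
def newton_sqrt_step(x, a):
    """One Newton iteration for solving x*x = a (toy stand-in for one solver step)."""
    return 0.5 * (x + a / x)


def is_converged(x, a, rtol=1e-10):
    """Take a single extra step and check the relative change against rtol."""
    x_next = newton_sqrt_step(x, a)
    return abs(x_next - x) <= rtol * abs(x_next)


# Converge fully for one set of inputs...
a = 2.0
x = 1.0
for _ in range(50):
    x = newton_sqrt_step(x, a)
converged_before_change = is_converged(x, a)

# ...then change an input without re-solving: the stored solution is now stale.
a = 3.0
converged_after_change = is_converged(x, a)
```

The cost of the check is one solver step, which is why it is cheap compared with a full re-solve.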
Is there a reason why we're not just marking the mixture as unconverged with a flag, then setting that flag when the user changes properties?
Wait, I see that is option 2. The main issue is therefore the mutability of the properties? Perhaps we can fix that by using immutable versions of those properties?
There's another option - perhaps the things that affect the optimality could be bundled into another object like "State" whose job it is to maintain the thermodynamic state of mixtures.
Pride cometh before a fall as always; I've since found an edge case (one of the notebooks with more complex mixtures) which fails my current convergence-check test. So back to the drawing board on that one.
Thanks @alchemyst - I agree that some sort of flagging when the user changes object attributes would be much more desirable. The problem, however, is that some attributes (Mixture.species and Mixture.x0) are themselves objects; my Python-fu isn't strong enough to figure out how to make those change a state flag when things inside them are changed.
Example 1. This I know how to handle with property getters and setters to change a flag:
mymixture.x0 = my_array
Example 2. This I have no idea how to flag as a change:
mymixture.x0[i] = my_value_for_i
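A minimal sketch of the difference between the two examples, assuming a hypothetical stand-in class (the attribute names _converged and solve() are made up for illustration and are not the package's real internals):

```python
class Mixture:
    """Hypothetical stand-in, not the package's real Mixture class."""

    def __init__(self, x0):
        self._x0 = list(x0)
        self._converged = False

    @property
    def x0(self):
        return self._x0

    @x0.setter
    def x0(self, value):
        self._x0 = list(value)
        self._converged = False  # rebinding the attribute goes through here

    @property
    def converged(self):
        return self._converged

    def solve(self):
        # Placeholder for the real LTE calculation.
        self._converged = True


mymixture = Mixture([0.5, 0.5])
mymixture.solve()

# Example 1: assignment to the attribute calls the setter, so the flag is cleared.
mymixture.x0 = [0.1, 0.9]
flag_after_rebind = mymixture.converged

# Example 2: item assignment only calls the getter and then mutates the list
# it returns, so the setter never runs and the flag is left stale.
mymixture.solve()
mymixture.x0[0] = 0.2
flag_after_item_write = mymixture.converged
```

In the second case Python evaluates mymixture.x0 first (a plain read) and then calls __setitem__ on the returned list, which is why the property setter is never involved.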
@alchemyst - again my fu fails me. How does one make an attribute immutable (especially if that attribute is itself an instance of another object like a list or array, whose elements could be accessed directly by a suitably persistent user)?
I'm also not sure this completely solves the problem. You do actually want the user to be able to change attributes (T, P, x0 particularly) in the course of their investigations. We could get around that by making the attributes completely immutable except for getter and setter methods (which can then flag changes away from LTE etc), but I'm not sure how Pythonic that would be.
The scalar attributes are easy to handle, since floats and ints and so on are immutable in Python. The problem, as you correctly mention, is if the attributes are mutable. The solution is to use immutable versions of those things. So instead of a list use a tuple, or if the attribute is an array set array.flags.writeable = False.
If a user then wants to experiment with different versions they can do something like
m = Mixture(...)
m.x0 = [0.1, 0.9] # this converts to a tuple or non-writable array and sets the dirty flag
m.x0[0] = 0.2 # exception - immutable object
m.x0 = [0.2, 0.8] # this is OK
# more changes
m.method_which_requires_recalc() # causes exception _or_ transparently recalcs
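The pattern above can be sketched end to end as follows. This is an illustration under assumptions: the class is a hypothetical stand-in, placeholder_property() and recalc_count are invented names, and the "calculation" is an arbitrary product rather than anything physical. It takes the "transparently recalcs" branch.

```python
class Mixture:
    """Hypothetical stand-in for illustration; not the package's Mixture."""

    def __init__(self, x0, T):
        self.recalc_count = 0
        self.x0 = x0  # both assignments go through the setters below
        self.T = T

    @property
    def x0(self):
        return self._x0

    @x0.setter
    def x0(self, value):
        self._x0 = tuple(value)  # store an immutable copy
        self._dirty = True

    @property
    def T(self):
        return self._T

    @T.setter
    def T(self, value):
        self._T = float(value)
        self._dirty = True

    def placeholder_property(self):
        """Stand-in for any method which requires an up-to-date state."""
        if self._dirty:
            self._recalculate()
        return self._result

    def _recalculate(self):
        # Arbitrary placeholder arithmetic, not a real property calculation.
        self._result = self._T * self._x0[0]
        self.recalc_count += 1
        self._dirty = False


m = Mixture([0.1, 0.9], T=1000.0)
try:
    m.x0[0] = 0.2  # tuples do not support item assignment
    item_write_blocked = False
except TypeError:
    item_write_blocked = True

m.x0 = [0.2, 0.8]                   # allowed, sets the dirty flag
first = m.placeholder_property()    # triggers one recalculation
second = m.placeholder_property()   # reuses the cached result
```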
If it's important to you to allow users to change, for instance, x0 in place, we could create an object which notifies its parent of calls to __setitem__, but that seems a bit of a hack.
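Roughly what that would look like, as a sketch only: NotifyingList and _mark_changed are hypothetical names, and the Mixture here is a bare stand-in.

```python
class NotifyingList(list):
    """Hypothetical list subclass that reports item writes to a callback."""

    def __init__(self, iterable, on_change):
        super().__init__(iterable)
        self._on_change = on_change

    def __setitem__(self, index, value):
        super().__setitem__(index, value)
        self._on_change()


class Mixture:
    """Hypothetical stand-in for illustration only."""

    def __init__(self, x0):
        self.converged = False
        self.x0 = x0

    def _mark_changed(self):
        self.converged = False

    @property
    def x0(self):
        return self._x0

    @x0.setter
    def x0(self, value):
        self._x0 = NotifyingList(value, self._mark_changed)
        self._mark_changed()


m = Mixture([0.1, 0.9])
m.converged = True          # pretend the solver has just run
m.x0[0] = 0.2               # __setitem__ notifies the parent
flag_after_item_write = m.converged

m.converged = True
m.x0.append(0.0)            # append is not intercepted in this sketch
flag_after_append = m.converged
```

The last two lines show part of why it feels like a hack: every mutating method (append, extend, pop, sort and so on) would need the same treatment before the flag could be trusted.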
We could also stop write access to all of these attributes completely and just have a change_state() method which handles all state changes and has default None arguments for all the ones you don't want to change.
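A sketch of that last option, assuming read-only properties plus a single change_state() entry point. The constructor signature, the length check and the string species names are assumptions for illustration, not the package's API.

```python
class Mixture:
    """Hypothetical stand-in for illustration only."""

    def __init__(self, species, x0, T, P):
        self._species = tuple(species)
        self._x0 = tuple(x0)
        self._T = T
        self._P = P
        self._converged = False

    # Read-only views: no setters are defined, so assignment raises AttributeError.
    species = property(lambda self: self._species)
    x0 = property(lambda self: self._x0)
    T = property(lambda self: self._T)
    P = property(lambda self: self._P)
    converged = property(lambda self: self._converged)

    def change_state(self, x0=None, T=None, P=None):
        """Update only the arguments that are given; None means 'leave as is'."""
        if x0 is not None:
            if len(x0) != len(self._species):
                raise ValueError("x0 must have one entry per species")
            self._x0 = tuple(x0)
        if T is not None:
            self._T = T
        if P is not None:
            self._P = P
        self._converged = False  # any call marks the stored solution as stale


m = Mixture(["O2", "O"], [0.5, 0.5], T=1000.0, P=101325.0)
m.change_state(T=2000.0)                 # x0 and P are untouched
try:
    m.T = 3000.0
    direct_write_blocked = False
except AttributeError:
    direct_write_blocked = True
```

Having one entry point also gives a natural place to validate inputs together, at the cost of a less attribute-like interface.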
Thank you Carl; this is why you are the Chief Python Officer ;). Your first option above is pretty much exactly what I was after, I'll try an implementation first thing next week. Tuples are probably the easiest way, will start with that approach.
Of the four attributes which define what values will be obtained from the LTE composition and physical property calculations - species list, x0 list, T, and P - the temptation/need for users to change things inside a particular object instance is probably in the following order: T & P > x0 > species. In particular:
Implemented in 4c505dabb36d9bb7892f152a1daaa1a0acc12663. Nice and tidy, raises exceptions in all the typical misuse cases.
A dogged user is still able to mess things up by changing attributes inside the species objects themselves (e.g. mymixture.species[i].ionisationenergy = some_value) or assigning double-underscored variables directly by their mangled names, but these would be pretty malignant events and are highly unlikely to happen by accident.
Beautiful. Now we just need some tests which ensure these properties stay immutable (so check that attempts to change them raise errors) and we're good to go!
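Something along these lines could serve as a starting point for those tests. It is a sketch using unittest against a small stand-in class, since the real constructor arguments aren't shown in this thread; treating species as read-only and using strings for the species are assumptions.

```python
import unittest


class Mixture:
    """Hypothetical stand-in; the real tests would import the package's Mixture."""

    def __init__(self, species, x0):
        self._species = tuple(species)
        self._x0 = tuple(x0)

    @property
    def species(self):
        return self._species

    @property
    def x0(self):
        return self._x0

    @x0.setter
    def x0(self, value):
        self._x0 = tuple(value)


class TestImmutability(unittest.TestCase):
    def setUp(self):
        self.m = Mixture(["O2", "O"], [0.5, 0.5])

    def test_x0_item_assignment_raises(self):
        with self.assertRaises(TypeError):
            self.m.x0[0] = 0.2

    def test_species_item_assignment_raises(self):
        with self.assertRaises(TypeError):
            self.m.species[0] = "N2"

    def test_species_rebinding_raises(self):
        with self.assertRaises(AttributeError):
            self.m.species = ["N2"]

    def test_x0_rebinding_allowed(self):
        self.m.x0 = [0.2, 0.8]
        self.assertEqual(self.m.x0, (0.2, 0.8))
```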
Draft changes complete and pulled to branch cleanup. See e3ecdf1bc45a789fcbffdb9a6fd8525519b3317c.
Original report by Quinn Reynolds (Bitbucket: kittychunk, GitHub: kittychunk).
Per suggestions by @alchemyst, the use of the Mixture class as a grab-bag of various loosely associated pieces is poor OO and should be revisited. In particular:
The comments made in #25 re: the (ab)use of JSON files, and changing to a class-focused rather than file-focused code are germane to Mixture also.
Exactly what a Mixture is (and isn't) should be more clearly defined in terms of how it is used by both the code and the end user. The use-case document I'm preparing will hopefully help us to understand this better. Relatedly...
...Mixture probably works better as a relatively simple data structure. It can retain some of the preparation code that's currently in Mixture.__init__, but most of the methods (composition solver, property calculators etc) could be moved out to top-level package functions which take a Mixture of Species as an argument.
It's not entirely clear to me yet whether the Mixture class should also contain composition information (species numbers & number densities), or whether this should just be something that is returned by the composition solver function and then passed as an argument to the property calculators. I'm leaning toward the latter but let's look at the use-case first.
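To make the last two points concrete, here is a rough sketch of the "simple data structure plus top-level functions" layout, with the composition returned by the solver and passed on to a property calculator. Everything in it is an assumption for discussion: the field names, the Composition type, and especially solve_composition(), which just splits the ideal-gas number density by x0 and is in no way an LTE calculation.

```python
from dataclasses import dataclass
from typing import Tuple

K_B = 1.380649e-23   # Boltzmann constant, J/K
N_A = 6.02214076e23  # Avogadro constant, 1/mol


@dataclass(frozen=True)
class Species:
    name: str
    molar_mass: float  # kg/mol


@dataclass(frozen=True)
class Mixture:
    species: Tuple[Species, ...]
    x0: Tuple[float, ...]  # initial mole fractions
    T: float               # K
    P: float               # Pa


@dataclass(frozen=True)
class Composition:
    number_densities: Tuple[float, ...]  # 1/m^3, same order as Mixture.species


def solve_composition(mixture):
    """Placeholder solver: ideal-gas total number density split by x0."""
    n_total = mixture.P / (K_B * mixture.T)
    return Composition(tuple(x * n_total for x in mixture.x0))


def mass_density(mixture, composition):
    """Example property calculator taking the solver's result as an argument."""
    return sum(
        n * s.molar_mass / N_A
        for n, s in zip(composition.number_densities, mixture.species)
    )


air = Mixture(
    species=(Species("O2", 0.032), Species("N2", 0.028)),
    x0=(0.21, 0.79),
    T=300.0,
    P=101325.0,
)
composition = solve_composition(air)
rho = mass_density(air, composition)
```

With this layout nothing is modified in place, so there is no stale state to track; the trade-off is that the user has to carry the composition object around between calls.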
As with #25 this change will require a not-insignificant refactoring of the code, and I suggest we prioritise it after we have all had a chance to go through the typical use case document that I'm busy preparing.