Closed Andrew-S-Rosen closed 4 months ago
Tagging @munrojm. Fixed it, I think! If you wouldn't mind minting a new version after this is merged, that would be a huge help.
Thank you @jmmshn for your suggestion. It worked like a charm! Not sure what I was doing wrong before.
I will never end the crusade against custom as_dict
/ from_dict
Hey I think this broke atomate2, see this atomate2 issue which I've reproduced with emmet>=0.78.0, and couldn't reproduce with emmet<=0.77.1
It's probably better to solve the root cause of the problem in atomate2 (or monty, if relevant) rather than patch back in a custom .as_dict()
method that also doesn't serialize to a dictionary but rather a string.
Maybe I'm missing something, but this PR changed the behaviour of these objects. That doesn't seem like an atomate2 issue as we were using these objects as intended.
The docstring specifies that ValueEnum should serialise to a string.
There's a jsanitize
call in jobflow with strict=True
passed that would require VaspObject
to have an as_dict
method, which was removed here. This appears as a bug in atomate2, so I agree with @Andrew-S-Rosen and @utf that the placement of a fix should change. I'll try to work out where that is best suited
The point is that we want to serialise these objects as strings. That's why we were using ValueEnum.
I'd ask if you want to have enums that serialise as dicts then please just use regular Enums (now supported in monty).
I guess there are a few things to unpack here.
In emmet, and therefore in atomate2 and other packages, there are a few objects inheriting from ValueEnum
s. The example I mentioned in #944 is one example. When a dictionary or other object containing a CrystalSystem
is dumped with dumpfn
, an error is raised. That's because the custom .as_dict()
method is returning a str
. @jmmshn suggested removing the method, which was what was done in this PR.
The ValueEnum
in emmet is, admittedly, advertised as a way to serialize the object to str
. So, I can see the argument there for retaining the .as_dict()
method to be a str
return. However, the method is called .as_dict()
. In my opinion, we should not have a method like .as_dict()
returning something that is not a dictionary. This has downstream negative effects, as mentioned in the example above, and is a hack.
I suppose the question to ask is there a better way to achieve the desired str
serialization without relying on such a hacky/fake .as_dict()
method? I would argue the answer is yes --- inherit from Enum
and take advantage of monty's enum_values=True
flag.
Now that the situation is a bit clearer to me and we also see some of the downstream effects, perhaps the better approach would simply be to re-add the .as_dict()
method but deprecate the entire ValueEnum
class, although maybe that is considered to be even more drastic.
I think this PR and issue #914 are misunderstanding what ValueEnum
is for. There many objects in Python that cannot be serialised as json by themselves. For example, str
, int
, float
, list
etc. Instead, they have to be serialised as part of a dict. The fact that this does not work
from monty.serialization import dumpfn
from emmet.core.symmetry import CrystalSystem
sym = CrystalSystem("Triclinic")
dumpfn(sym, "test.json")
is not an issue in itself. E.g., the following code does not mean that string serialisation is broken:
dumpfn("Triclinic", "test.json") # will fail
The goal of ValueEnum
is to define a set of string literals in the form of an enum. When we serialise these in a task document, we want to only store the enum value. E.g.
{
"crystal_system": "Triclinic"
}
rather than
{
"crystal_system": {
"@module": "emmet.core.utils",
"@class": "ValueEnum",
"value": "Triclinic",
}
}
The latter makes it convoluted to query the task document easily.
Perhaps there is a better way to achieve this and I agree that the as_dict
labelling could be confusing. However, the point is this code has been in production for several years and works absolutely fine. The suggestion of enabling enum_values=True
is not a good fix in this case, since this will mean it isn't possible to store ANY enums in their dict form. ValueEnum
combined with enum_values=False
provides a mechanism to have two types of Enum serialisation behaviour.
To summarise, the issue raised in #914 is not a bug it is the intended behaviour as per the docstring and existing usage. This change has broken atomate2 and other downstream packages. Please can we revert it.
Closes #914.
This now works.