aiidateam / aiida-atomistic

AiiDA plugin which contains data and methods for atomistic simulations.
https://aiidateam.github.io/aiida-atomistic/
MIT License

Thoughts on JSON serialisation and `pydantic`? #11

Open mbercx opened 4 months ago

mbercx commented 4 months ago

Looking at the example on how to create a StructureData instance:

from aiida_atomistic.data.structure import StructureData

properties_dict = {
    "cell": {"value": [[3.5, 0.0, 0.0], [0.0, 3.5, 0.0], [0.0, 0.0, 3.5]]},
    "pbc": {"value": [True, True, True]},
    "positions": {"value": [[0.0, 0.0, 0.0], [1.5, 1.5, 1.5]]},
    "symbols": {"value": ["Li", "Li"]},
}

structure = StructureData(properties=properties_dict)

Given that the entire structure data is now captured by the properties, after https://github.com/aiidateam/aiida-atomistic/issues/10 the constructor would effectively have properties as its only input. This immediately made me think of simply adding JSON serialization methods (to_json, from_json, or similar). But then I thought about @sphuber's AEP (https://github.com/aiidateam/AEP/pull/40), which I still have to read 🙈, and wondered whether we should be thinking about how to integrate the new StructureData with the concepts there.

I'll have to read up some more before having a solid opinion on what to do, but I already wanted to raise the issue.
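As a rough illustration of the to_json / from_json idea raised above, here is a minimal sketch that uses only the standard library. `PropertiesContainer` is a hypothetical stand-in and not the actual StructureData class, and the method names are assumptions taken from the suggestion in this comment rather than an existing aiida-atomistic API.

```python
import json


class PropertiesContainer:
    """Hypothetical stand-in for a class whose only input is a properties dict."""

    def __init__(self, properties):
        # A JSON round trip gives a deep copy and fails early on
        # values that cannot be represented in JSON.
        self._properties = json.loads(json.dumps(properties))

    @property
    def properties(self):
        return self._properties

    def to_json(self):
        """Serialize the properties to a JSON string (assumed method name)."""
        return json.dumps(self._properties, sort_keys=True)

    @classmethod
    def from_json(cls, payload):
        """Rebuild an instance from a JSON string (assumed method name)."""
        return cls(properties=json.loads(payload))


properties_dict = {
    "cell": {"value": [[3.5, 0.0, 0.0], [0.0, 3.5, 0.0], [0.0, 0.0, 3.5]]},
    "pbc": {"value": [True, True, True]},
    "positions": {"value": [[0.0, 0.0, 0.0], [1.5, 1.5, 1.5]]},
    "symbols": {"value": ["Li", "Li"]},
}

original = PropertiesContainer(properties=properties_dict)
restored = PropertiesContainer.from_json(original.to_json())
```

Because the constructor takes nothing but the properties, the round trip needs no extra bookkeeping: whatever survives `json.dumps` is the full state of the object.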

sphuber commented 4 months ago

I think what you are proposing would be automatically supported with https://github.com/aiidateam/aiida-core/pull/6255. It would allow defining a schema for the StructureData class through a pydantic model:

from aiida.common.pydantic import MetadataField  # module as provided by the PR above
from aiida.orm import Data


class StructureData(Data):

    class Model(Data.Model):
        pbc: list[bool] = MetadataField(description='Periodicity along each cell axis')
        cell: list[list[float]] = MetadataField(description='The cell parameters')
        positions: list[list[float | int]] = MetadataField(description='The atomic positions')
        symbols: list[str] = MetadataField(description='The atomic symbol labels')

You would then be able to create an instance from JSON serialized data as follows:

data = {
  'cell': [[3.5, 0.0, 0.0], [0.0, 3.5, 0.0], [0.0, 0.0, 3.5]],
  'pbc': [True, True, True],
  'positions': [[0.0, 0.0, 0.0], [1.5, 1.5, 1.5]],
  'symbols': ['Li', 'Li'],
}
structure = StructureData.from_serialized(data)
serialized = structure.serialize()

Note that your class wouldn't have to implement from_serialized and serialize, as these are implemented on the aiida.orm.entities:Entity base class.
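To make that division of labour concrete, the following is a minimal stdlib-only sketch of the pattern: a base class implements serialize and from_serialized once, and a subclass only declares its schema. It uses `dataclasses` in place of pydantic and is not how aiida-core implements this; `SketchEntity` and `SketchStructure` are hypothetical names, and no validation of the field types is performed here.

```python
from dataclasses import asdict, dataclass


class SketchEntity:
    """Hypothetical base class; subclasses only declare a nested ``Model``."""

    @dataclass
    class Model:
        pass

    def __init__(self, **kwargs):
        # Constructing the Model raises TypeError on unknown or missing fields.
        self._model = self.Model(**kwargs)

    def serialize(self):
        """Return the declared fields as a plain dict (deep-copied)."""
        return asdict(self._model)

    @classmethod
    def from_serialized(cls, data):
        """Create an instance of the calling class from a plain dict."""
        return cls(**data)


class SketchStructure(SketchEntity):
    """Declares a schema only; no serialization code is written here."""

    @dataclass
    class Model(SketchEntity.Model):
        pbc: list[bool]
        cell: list[list[float]]
        positions: list[list[float]]
        symbols: list[str]


data = {
    'cell': [[3.5, 0.0, 0.0], [0.0, 3.5, 0.0], [0.0, 0.0, 3.5]],
    'pbc': [True, True, True],
    'positions': [[0.0, 0.0, 0.0], [1.5, 1.5, 1.5]],
    'symbols': ['Li', 'Li'],
}
structure = SketchStructure.from_serialized(data)
serialized = structure.serialize()
```

Unlike a pydantic model, a dataclass does not check that the values match the annotations, so this sketch only shows where the methods live, not the validation that the schema would provide.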

I haven't tried this, but it may be worth doing, also as a way of testing the pydantic PR in aiida-core.