monarch-initiative / pyphetools

Python Phenopacket Tools
https://monarch-initiative.github.io/pyphetools/
MIT License
9 stars 1 forks source link

Consider using pydantic #113

Closed cmungall closed 1 month ago

cmungall commented 1 month ago

Pydantic is a modern way of writing python object models and validators

An example of phenopackets encoded in Pydantic is here: https://github.com/bento-platform/katsu/blob/develop/chord_metadata_service/phenopackets/models.py

It looks like the current classes are following quite java-esque paradigms with getters and setters. I think it's more idiomatic Python to use pydantic, this gives you a lot of benefits for free, such as declarative validation and json rendering.

Pydantic can easily be authored directly, it can also be generated from linkml

ielis commented 1 month ago

Hi @cmungall

It looks like the current classes are following quite java-esque paradigms with getters and setters. I think it's more idiomatic Python to use pydantic, ...

If pyphetools.pp package is what you are referring to, then yes, the code is a little java-esque, in terms of making the classes like OntologyClass implement "interfaces", such as Serializable, ToProtobuf, etc.:

class MessageMixin(Serializable, Deserializable, FromProtobuf, ToProtobuf,
                   metaclass=abc.ABCMeta):
  pass

class OntologyClass(MessageMixin):
  ... # some code to implement the abstract methods of `Serializable`, `Deserializable`, `FromProtobuf`, ...

Evaluating pydantic is among my medium to long-term goals. Support for JSON sounds great and if they support protobuf, that would be exceptional!

... a lot of benefits for free, such as declarative validation and json rendering ...

Declarative validation is nice since it is the first round of validation that must be done. However, AFAIK, we also have some more complex constraints in the Phenopacket Schema, such:

Therefore, I think the validation will always be a multi-level process. In other words, a declarative validation for free is very much appreciated, but it is a part of the system.

Thank you for pointing this out!

cmungall commented 1 month ago

Agreed 100% on the multi-level validation.

Evaluating pydantic is among my medium to long-term goals. Support for JSON sounds great and if they support protobuf, that would be exceptional!

Glad this is on the radar, feel free to meld or close this issue into any other. Be great to have a more detailed discussion about some of these plans sometime soon!

I don't think pydantic natively binds protobuf but tbh I haven't seen protobuf used much in the python universe; and it seems that json is the de-facto exchange serialization for phenopackets.

If pyphetools.pp package is what you are referring to

tbh I'm not totally sure. I see there are subpackages .visualization and .creation, with classes like Individual and explicit getters and setters like

https://github.com/monarch-initiative/pyphetools/blob/f097e6a1032e29a49791fa87043ae13c2d9bd4e7/src/pyphetools/creation/individual.py#L82-L88

I wasn't totally sure about the relationship between these (hand-authored I assume) classes and the autogenerated stuff in .pp. I do know my IDE doesn't like the autogenerated stuff but that may be a me problem, I intend to delve deeper....

pnrobinson commented 1 month ago

@cmungall -- The Individual class is intended to organize information from papers. Honestly it is like the appendix, the code has evolved in ways that keep it alive but may not be optimal, but they do the job and there are other priorities. I think that if we move forward with an online server, then we understand the issues well enough to implement in a cleaner way. AFAICS the phenopackets are being generated in a correct fashion so that is the interface to everything else!