Open ESapenaVentura opened 1 day ago
Let's set up a branch with this refactoring and decide.
I like what I see so far: I can replace the validation functions specific to each metadata entity for future expansions (e.g. for EnaSubmissions) by creating the models. The validation functions also allow for value refactoring, which I find pretty useful for e.g. dates.
Now I can also provide a notebook on how to validate your own samples against checklists not in BSD! Pretty cool.
Currently, each subclass of `metadata_entity` defines a `validate` method.
This could be simplified by using pydantic and passing a subclass of `BaseModel`, but it would mean defining a data model for each new metadata entity.
Is it worth it?
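To make the trade-off concrete, here is a rough sketch of what one such per-entity model could look like. The field names (`name`, `taxon_id`, `collection_date`) are assumptions made up for illustration, not the real Biosample schema; only `BaseModel` and `ValidationError` come from pydantic.

```python
from datetime import date
from typing import Optional

from pydantic import BaseModel, ValidationError


class BiosampleModel(BaseModel):
    # Hypothetical fields, for illustration only; the real schema may differ.
    name: str
    taxon_id: int
    collection_date: Optional[date] = None


# Valid content: pydantic checks the types and parses the ISO date string
# into a datetime.date object.
valid = BiosampleModel(name="sample-1", taxon_id=9606, collection_date="2021-03-04")

# Invalid content: a non-numeric taxon_id is rejected with a ValidationError.
try:
    BiosampleModel(name="sample-2", taxon_id="not-a-number")
    rejected = False
except ValidationError:
    rejected = True
```

The cost is the one mentioned above: every new metadata entity would need its own class like this one.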
So far I can think of pros and cons:
Pros:
`InputDataError` that would arise from the validation done by pydantic
Cons:
`.entity` via a very thorough `__setitem__`). What I want to say with this is: refactoring to set the entity as e.g. `.entity = BiosampleModel(**metadata_content)` does not seem feasible or even valuable. It only seems valuable for validation.
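A minimal sketch of that validation-only use, assuming a simplified `Biosample` class: `.entity` stays a plain dict, and the model is only instantiated inside `validate`. The `InputDataError` defined here is a stand-in for the project's own exception, and the class layout and field names are hypothetical.

```python
from pydantic import BaseModel, ValidationError


class InputDataError(Exception):
    """Stand-in for the project's own exception; the real one may differ."""


class BiosampleModel(BaseModel):
    # Hypothetical fields, for illustration only.
    name: str
    taxon_id: int


class Biosample:
    # Hypothetical, simplified metadata entity: each subclass would only
    # declare which model describes its content.
    model = BiosampleModel

    def __init__(self, metadata_content: dict):
        # .entity is kept as a plain dict, not replaced by the model.
        self.entity = dict(metadata_content)

    def validate(self) -> None:
        # Build the model only to check the content, then discard it.
        try:
            self.model(**self.entity)
        except ValidationError as err:
            raise InputDataError(str(err)) from err
```

With this layout, `Biosample({"name": "s1", "taxon_id": 9606}).validate()` passes silently, while content missing `taxon_id` raises `InputDataError`.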