teemtee / fmf

Flexible Metadata Format
GNU General Public License v2.0

Migrate `tmt.utils.field` and `tmt.utils.DataContainer` to `fmf` #208

Open LecrisUT opened 1 year ago

LecrisUT commented 1 year ago

It would be nice to have the normalization of the data types in this project.

happz commented 1 year ago

If you mean the content of fmf files, then such functionality would require the fmf library to be aware of the schema the fmf data must follow, and that schema is usually owned by the user of the fmf library, in this case tmt. tmt "knows" whether `require` is a boolean, a string or a list of complex structures, whether it's required, and what its default value is, and it owns the normalization code. fmf would have to be open to accepting this kind of information, e.g. a schema for validation, data classes describing parts of the whole fmf tree, their normalization callbacks, etc. Without this kind of information, merely moving the classes into fmf would provide no benefit to tmt, and the move itself would not be an easy task.

What benefits do you expect from this migration? Something fmf itself can use for its own structures?

LecrisUT commented 1 year ago

Well, this is to help with using fmf outside the tmt context, e.g. #207. I think it would be helpful for converting the dictionary data from the YAML fmf file into typed data, e.g. from `str`/`list[str]` to `list[Path]`. As far as I understand, that's the purpose of `field` and `DataContainer`.
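For illustration, a minimal sketch of the `str`/`list[str]` to `list[Path]` conversion mentioned above. The function name `normalize_path_list` and the sample keys are hypothetical; this is not existing fmf or tmt code.

```python
from pathlib import Path
from typing import List, Union


def normalize_path_list(value: Union[None, str, List[str]]) -> List[Path]:
    """Turn a raw YAML value (missing, one string, or a list of strings) into list[Path]."""
    if value is None:
        return []
    if isinstance(value, str):
        return [Path(value)]
    if isinstance(value, list) and all(isinstance(item, str) for item in value):
        return [Path(item) for item in value]
    raise TypeError(f"expected a string or a list of strings, got {value!r}")


# Raw data as it might come out of an fmf file (the keys are made up).
raw = {"single": "tests/basic", "many": ["tests/a", "tests/b"]}

single = normalize_path_list(raw.get("single"))
many = normalize_path_list(raw.get("many"))
missing = normalize_path_list(raw.get("absent"))
```

Rejecting anything that is not a string or a list of strings keeps the result type predictable, which is the point of normalizing in the first place.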

Ideally it should be linked to a JSON Schema, but as a first step, getting fmf to load the main data, or preferably sections of it, as a given dataclass automagically would be a good use case for fmf.
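A rough sketch of what "load a section as a given dataclass" could look like, using only the standard library. Everything here is an assumption made for illustration: `load_section`, the `Discover` class, its keys, and the `"normalize"` metadata convention are hypothetical and are not part of fmf or tmt.

```python
import dataclasses
from pathlib import Path
from typing import Any, Dict, List, Type, TypeVar

T = TypeVar("T")


def as_path_list(value: Any) -> List[Path]:
    """Accept a single string or a list of strings and return list[Path]."""
    items = [value] if isinstance(value, str) else list(value)
    return [Path(item) for item in items]


@dataclasses.dataclass
class Discover:
    # Hypothetical section of an fmf node; the key names are made up.
    name: str
    paths: List[Path] = dataclasses.field(
        default_factory=list, metadata={"normalize": as_path_list}
    )


def load_section(cls: Type[T], data: Dict[str, Any]) -> T:
    """Build ``cls`` from raw data, applying per-field "normalize" callbacks."""
    known = {f.name: f for f in dataclasses.fields(cls)}
    unknown = set(data) - set(known)
    if unknown:
        raise ValueError(f"unknown keys: {sorted(unknown)}")
    kwargs = {}
    for key, value in data.items():
        normalize = known[key].metadata.get("normalize")
        kwargs[key] = normalize(value) if normalize else value
    # Missing required keys surface as a TypeError from the dataclass itself.
    return cls(**kwargs)


section = load_section(Discover, {"name": "smoke", "paths": "tests/smoke"})
```

The normalization callback travels with the field definition, so the loader stays generic and the schema knowledge stays with whoever defines the dataclass.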

happz commented 1 year ago

I see. Well, as I said, it will not be a trivial effort; just moving those two pieces of code into the fmf package will not be enough. The workflow I can imagine would be an fmf user defining its data classes and their fields, plus the relationships between them, e.g. in tmt this is often dynamic: the value of the `how` key determines which plugin and which data class are used. Then the fmf "load a tree" entry point would need to accept these and apply them correctly. Not an easy feat :)
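To make the dynamic part concrete, here is a small sketch of selecting a data class from the value of the `how` key. The registry, the two data classes and their fields are invented for illustration; tmt's real plugin machinery is more involved and is not reproduced here.

```python
import dataclasses
from typing import Any, Dict, Type


@dataclasses.dataclass
class ShellData:
    how: str
    script: str


@dataclasses.dataclass
class ContainerData:
    how: str
    image: str


# Hypothetical registry: value of the "how" key -> data class of that plugin.
REGISTRY: Dict[str, Type[Any]] = {"shell": ShellData, "container": ContainerData}


def load_step(data: Dict[str, Any]) -> Any:
    """Pick the data class by the "how" key and instantiate it from the raw data."""
    try:
        cls = REGISTRY[data["how"]]
    except KeyError:
        raise ValueError(f"missing or unknown 'how' in {data!r}") from None
    return cls(**data)


step = load_step({"how": "container", "image": "fedora"})
```

A "load a tree" entry point would need to be handed something like this registry by the fmf user, since only the user knows which values of `how` exist.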

I don't want to discourage you. I do see the benefit: instead of an amorphous `node.data`, fmf users could get nicely typed instances of their favourite data classes. I just wanted to better understand your use case, and make sure you are aware of the complexity of the task :)

BTW, now that Python 3.6 is out of the way, I was actually thinking of replacing the custom data classes with https://www.attrs.org/ or a similar, more "standard" solution, and re-building our current validation/normalization on top of this (or a similar) library. I did not give it much deeper thought though, because, well, what we have just works, it's no longer a blocker preventing tmt development, and what tmt lacks more is the functionality needed for proper Testing Farm integration and reliable testing; I sort of resorted to focusing on that development rather than improving the basic classes we have. But, in the grand scheme of things, if we as a community recognize normalization as a feature useful enough to be in fmf, it would make sense to me to resurrect this effort and invest some time into investigating whether we can re-implement my code in tmt with attrs, and if so, do it when transitioning it into fmf.

LecrisUT commented 1 year ago

I can take a quick look into attrs before thinking of how to get the minimum framework there. But are there any other libraries to check other than that?

Currently I think a higher priority would be to get the PRs that I've opened merged. With the click CLI, I think it will be easier to build plugins and migrate some code here.

But after that, it would be nice to have a checklist of what needs to be implemented and a rough order for that so that we can see how to approach it. My idea here is to: