ACEsuit / Polynomials4ML.jl

Polynomials for ML: fast evaluation, batching, differentiation
MIT License

File format for P4ML models #6

Open cortner opened 2 years ago

cortner commented 2 years ago

We either need to adopt the ACEbase.FIO JSON interface, or decide on an alternative.

cortner commented 1 year ago

This is maybe a question for @tjjarvinen and @CheukHinHoJerry and needs some context:

When we started solidifying ACE, JSON3 did not yet exist, and JLD and JLD2 were both incredibly unreliable. So we decided to create our own JSON wrappers and require that every model component implement them, converting manually between struct and Dict. The interface is in ACEbase.FIO and requires overloading read_dict and write_dict. This has worked very well.
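As a rough, language-agnostic illustration of that manual struct-to-Dict pattern, here is a minimal sketch in Python. It is not the ACEbase.FIO API: the class `ChebBasis`, the `type_tag` key, and the `READERS` registry are hypothetical names chosen only for this example, and the method names merely mirror read_dict and write_dict.

```python
import json
from dataclasses import dataclass


@dataclass
class ChebBasis:
    """Hypothetical model component, used only for illustration."""
    degree: int
    coeffs: list

    def write_dict(self) -> dict:
        # Store only plain JSON types, plus a tag naming the component type.
        return {
            "type_tag": "ChebBasis",  # hypothetical key name
            "degree": self.degree,
            "coeffs": list(self.coeffs),
        }

    @classmethod
    def read_dict(cls, d: dict) -> "ChebBasis":
        # Rebuild the object from the plain dictionary.
        return cls(degree=d["degree"], coeffs=list(d["coeffs"]))


# Registry mapping each type tag back to the function that rebuilds it.
READERS = {"ChebBasis": ChebBasis.read_dict}


def save_json(obj) -> str:
    """Serialize any component that implements write_dict."""
    return json.dumps(obj.write_dict())


def load_json(text: str):
    """Parse JSON and dispatch on the type tag to rebuild the component."""
    d = json.loads(text)
    return READERS[d["type_tag"]](d)


basis = ChebBasis(degree=3, coeffs=[1.0, 0.5, 0.25, 0.125])
restored = load_json(save_json(basis))
```

Because the file contains only a dictionary of plain values, the in-memory type can be reorganized later as long as the read function still understands the stored keys.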

But it's now many years later, and maybe the rewrite of the kernels is a chance to revisit this decision. Any thoughts?

tjjarvinen commented 1 year ago

The issue with JLD2 is that changing anything in a type definition breaks existing files. So, if you use JLD2 (don't even consider the original JLD), you need to write the save routine so that it only stores general Julia structures such as arrays, dictionaries, and strings.

The good part of JLD2 is that it is a binary format with an option to compress, which saves space. The downside is that it is mainly Julia-only; if you want the option to move to another language, JSON3 is probably better. (I have not tested reading JLD2 files from another HDF5 reader.)

CheukHinHoJerry commented 1 year ago

I think JSON3 is better too. Given that we don't usually have huge model sizes, I think JSON3 works well. It is also friendlier to users who want to run ASE in Python with ACE.
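To illustrate the Python side, a minimal sketch of reading such a file with only the standard library could look like this. The file contents, the `type_tag` key, and the helper `load_model_dict` are assumptions made for the example, not an existing P4ML or ACE file format.

```python
import json
import os
import tempfile

# Hypothetical file contents, standing in for what a Julia-side JSON writer
# might produce; the keys are assumptions, not a real P4ML schema.
model_json = '{"type_tag": "ChebBasis", "degree": 3, "coeffs": [1.0, 0.5, 0.25, 0.125]}'


def load_model_dict(path: str) -> dict:
    """Read a JSON model file and check that it carries a type tag."""
    with open(path, "r", encoding="utf-8") as f:
        d = json.load(f)
    if "type_tag" not in d:
        raise ValueError("model file has no type tag")
    return d


# Write the example to a temporary file, then read it back as a user would.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.json")
    with open(path, "w", encoding="utf-8") as f:
        f.write(model_json)
    params = load_model_dict(path)

n_coeffs = len(params["coeffs"])
```

No Julia installation is needed for this step, which is the portability argument for a JSON-based format.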

cortner commented 1 year ago

So let’s look into that. Two questions I have