multiply-org / gp_emulator

Python code for Gaussian Process emulators

Emulator library design #2

Open jgomezdans opened 6 years ago

jgomezdans commented 6 years ago

We need a format for an emulator library. See this

jgomezdans commented 6 years ago

Notes are here

jgomezdans commented 6 years ago

@TonioF I've updated the emulators to use the npz file format, as it's easy and mostly portable. We still need a metadata container that describes the npz files. Any suggestions?
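For reference, a minimal sketch of what an npz round trip looks like with NumPy. The array names (`X_train`, `y_train`, `hyperparams`) and the helper functions are hypothetical placeholders, not the actual contents or API of the gp_emulator files:

```python
import io

import numpy as np

# Hypothetical emulator contents: the real arrays stored by gp_emulator
# may be named and shaped differently.
rng = np.random.default_rng(0)
emulator_arrays = {
    "X_train": rng.random((10, 3)),
    "y_train": rng.random(10),
    "hyperparams": np.array([0.5, 1.0, 2.0, 0.01]),
}


def save_emulator(target, arrays):
    """Write named arrays to a compressed npz file (path or file object)."""
    np.savez_compressed(target, **arrays)


def load_emulator(source):
    """Read every array from an npz file back into a plain dict."""
    with np.load(source) as npz:
        return {name: npz[name] for name in npz.files}


# Round trip through an in-memory buffer so the sketch leaves no files behind.
buffer = io.BytesIO()
save_emulator(buffer, emulator_arrays)
buffer.seek(0)
restored = load_emulator(buffer)
print(sorted(restored))
```

Note that an npz file only carries named arrays, which is why a separate metadata description is still needed.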

TonioF commented 6 years ago

To stay consistent, I propose using either .json or .yaml to describe the metadata. The two files would then need to stay together; maybe we can zip them. In any case, I suggest giving them the same name (except for the file extension, of course).

What metadata will you be putting into these files?

jgomezdans commented 6 years ago

It's all detailed in the wiki entry I mentioned above. Basically, we shouldn't put them together: the json/yaml file should just hold a pointer (a URL or a local file path) to the relevant emulator file. The emulator files are big, so it doesn't make sense to ship the whole library; just get what you need when you need it.
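A rough sketch of how such a pointer could be resolved, assuming a JSON metadata file. The key name `emulator_location`, the function `resolve_emulator`, and the file names are all hypothetical; the actual layout is whatever the wiki entry specifies:

```python
import json
import shutil
import tempfile
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def resolve_emulator(metadata_path, cache_dir):
    """Return a local path to the emulator file named in the metadata.

    Remote files (http/https) are downloaded into cache_dir only when
    they are first requested; local pointers are resolved relative to
    the metadata file. The "emulator_location" key is an assumption.
    """
    metadata_path = Path(metadata_path)
    metadata = json.loads(metadata_path.read_text())
    location = metadata["emulator_location"]
    parsed = urlparse(location)

    if parsed.scheme in ("http", "https"):
        cache_dir = Path(cache_dir)
        cache_dir.mkdir(parents=True, exist_ok=True)
        local_copy = cache_dir / Path(parsed.path).name
        if not local_copy.exists():
            # Fetch on demand, and only this one emulator file.
            with urllib.request.urlopen(location) as response:
                with open(local_copy, "wb") as out:
                    shutil.copyfileobj(response, out)
        return local_copy

    # Otherwise treat the pointer as a local file path.
    return (metadata_path.parent / location).resolve()


# Usage with a local pointer, so the sketch needs no network access.
with tempfile.TemporaryDirectory() as tmp:
    meta_file = Path(tmp) / "example_emulator.json"
    meta_file.write_text(json.dumps({
        "description": "hypothetical emulator entry",
        "emulator_location": "example_emulator.npz",
    }))
    resolved = resolve_emulator(meta_file, Path(tmp) / "cache")
    print(resolved.name)
```

With this kind of layout the metadata files stay small enough to ship together, while each large emulator file is only retrieved the first time it is asked for.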

TonioF commented 6 years ago

So you're saying that, if the emulator file is stored remotely, we don't need to download it completely but could simply extract the part we need at the moment? If that's the case, just using the metadata file and making it point to the emulator file could be a good way to go. Actually, this topic raises some questions for me about how to integrate the emulator. I have been wondering lately how best to integrate the emulator into the data access component, and this discussion directly touches on that. How and where should the emulators be accessed? Maybe we should have a TC about that.