ModellingWebLab / weblab-fc

Functional Curation backend for the Modelling Web Lab

Trac: Advanced support for reading external data files #191

Open MichaelClerx opened 4 years ago

MichaelClerx commented 4 years ago

https://chaste.cs.ox.ac.uk/trac/ticket/2528

While you can write code to pass data as a protocol input, it would be nice if data files could be referenced directly in a protocol. Both Python and C++ can easily support reading CSV as 1-d or 2-d arrays; the Python implementation could also support HDF5. For COMBINE friendliness, we might consider NuML too.

What would we need to specify beyond the file name? For something like HDF5, which can store multiple n-d arrays, we might need to specify a path within the file. Would we need to allow describing the dimensions with units too? Does there need to be a separate data set section of the protocol, or could files just be referenced inline (e.g. my_data = data("file_path.h5", "/group/dataset"))? If the former, where do these fit in the name resolution graph?

Possible examples:

    data sets {
        csv_data = "file.csv"
        hdf5_data = "file.h5" "/group/dataset"
    }

In writing this, I'm leaning towards the inline format with a new reserved function data, like we have for map, but welcome feedback.
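
To make the inline option a bit more concrete, here is a rough Python sketch of what a backing loader for such a function might look like. Nothing here is existing weblab-fc code: the function name, its signature, the extension-based dispatch, and the assumption that CSV files are purely numeric with no header row are all hypothetical. The HDF5 branch relies on the third-party h5py package, imported lazily so that CSV reading needs only the standard library.

```python
import csv
from pathlib import Path


def data(file_path, dataset_path=None):
    """Hypothetical loader behind an inline data("file", "/path") call.

    CSV: returns a 1-d list for a single row or single column, otherwise
    a 2-d list of rows. Assumes numeric cells and no header row.
    HDF5: returns the dataset at dataset_path as an n-d NumPy array.
    """
    suffix = Path(file_path).suffix.lower()

    if suffix == ".csv":
        if dataset_path is not None:
            raise ValueError("a CSV file holds one array; no dataset path expected")
        with open(file_path, newline="") as handle:
            rows = [[float(cell) for cell in row]
                    for row in csv.reader(handle) if row]
        if len(rows) == 1:
            return rows[0]                      # single row -> 1-d
        if rows and all(len(row) == 1 for row in rows):
            return [row[0] for row in rows]     # single column -> 1-d
        return rows                             # general case -> 2-d

    if suffix in (".h5", ".hdf5"):
        if dataset_path is None:
            raise ValueError("HDF5 needs a path within the file, e.g. '/group/dataset'")
        import h5py  # third-party; only needed for HDF5 files
        with h5py.File(file_path, "r") as handle:
            return handle[dataset_path][()]     # read the whole dataset into memory

    raise ValueError(f"unsupported data file type: {suffix!r}")
```

With this shape, data("file.csv") and data("file.h5", "/group/dataset") cover both entries of the example above. The sketch leaves the units and dimension-description questions open.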