SED-ML / sed-ml

Simulation Experiment Description Markup Language (SED-ML)
http://sed-ml.org

Define Data Structures for simulations, tasks & dataGenerators #59

Open matthiaskoenig opened 6 years ago

matthiaskoenig commented 6 years ago

Issue

One of the main problems I have with simulations, tasks and dataGenerators is that it is not defined what kind of data structure they are, which dimensions they have, and what data type they hold (together these determine the allowed operations and math on them). All of these points are currently only implicit assumptions. They work for very simple time course and steady state simulations, but break down as soon as one wants to encode simulation experiments more complicated than a simple time course with an ODE model. This also creates a lot of problems in the outputs, which have to work with these data structures. In the current form, many problems result from this.

This is basically the core issue behind most of the other issues, e.g. dealing with multi-dimensional data (#21), the new more complex plots (#20), simulation on logical models (#8), how to plot repeatedTasks (#58), calculating math over repeated tasks (#53), and new tasks like the Jacobian (#27).

Proposal

This will also make implementation much easier and less error-prone. Exchange formats and outputs can easily be based on the data structures. For instance, Boolean transition graphs are directed graphs, so possible output formats are graph exchange formats like GML or GraphML.
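To illustrate the graph case, here is a minimal sketch of writing a directed graph to GML using only the Python standard library. The function name `to_gml` and the four-state transition graph are hypothetical examples made up for this sketch; nothing here is part of SED-ML or of any existing tool.

```python
def to_gml(nodes, edges):
    """Serialize a directed graph to a minimal GML string.

    nodes: list of node labels (hypothetical state names)
    edges: list of (source_label, target_label) pairs
    """
    # GML refers to nodes by integer id, so map each label to an id.
    index = {name: i for i, name in enumerate(nodes)}
    lines = ["graph [", "  directed 1"]
    for name, i in index.items():
        lines.append(f'  node [ id {i} label "{name}" ]')
    for src, dst in edges:
        lines.append(f"  edge [ source {index[src]} target {index[dst]} ]")
    lines.append("]")
    return "\n".join(lines)


# Assumed example: state transition graph of a two-variable Boolean model.
states = ["00", "01", "10", "11"]
transitions = [("00", "01"), ("01", "11"), ("11", "10"), ("10", "00")]
gml = to_gml(states, transitions)
print(gml)
```

Once a result is declared to be a directed graph, an output like this needs no knowledge of the simulation that produced it, only of the data structure.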

edit: typos and clarifications

fbergmann commented 6 years ago

Ideally the data description element of L1V3 can be used for exactly that purpose. Just like higher dimensional external data can be described, the idea would be that the individual dimensions are described for data generators as well. Then slices could be defined to get at the individual elements.
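A rough sketch of that idea in plain Python, assuming a data generator holds a nested list plus one named description per dimension. The names `Dimension`, `shape_of` and `take` are hypothetical and do not mirror the actual SED-ML XML elements; the sketch only shows how a slice on a named dimension reduces the data and its description together.

```python
from dataclasses import dataclass


@dataclass
class Dimension:
    """Hypothetical description of one dimension of a data generator."""
    name: str
    size: int


def shape_of(data):
    """Return the shape of a rectangular nested list."""
    shape = []
    while isinstance(data, list):
        shape.append(len(data))
        if not data:
            break
        data = data[0]
    return shape


def take(data, dims, name, index):
    """Fix the dimension called `name` at `index`.

    Returns the sliced data and the descriptions of the remaining dimensions.
    """
    axis = [d.name for d in dims].index(name)

    def rec(block, depth):
        if depth == axis:
            return block[index]
        return [rec(sub, depth + 1) for sub in block]

    return rec(data, 0), [d for d in dims if d.name != name]


# Assumed example: a repeated task with 2 repeats and 3 time points each.
data = [[0.0, 0.1, 0.2], [1.0, 1.1, 1.2]]
dims = [Dimension("repeat", 2), Dimension("time", 3)]

first_repeat, rest = take(data, dims, "repeat", 0)
at_time_1, rest_t = take(data, dims, "time", 1)
```

Here `first_repeat` is a 1D time course and `at_time_1` holds one value per repeat, and in both cases the remaining dimension descriptions say what is left.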

matthiaskoenig commented 6 years ago

Yes, this should work for all multidimensional data arrays. Especially the new more complex plots will expect (only work with) dataGenerators with certain dimensions and certain content. For instance, a heatmap would expect one 2D dataGenerator for the actual data (double) and two corresponding 1D dataGenerators for the axes (string or double) to describe what is plotted in each cell of the heatmap.
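As a sketch of what such an expectation could look like in an implementation, the check below validates the three heatmap inputs before plotting. The function `check_heatmap` and the convention that rows correspond to the y axis are assumptions made for this example, not anything defined by SED-ML.

```python
def check_heatmap(values, x_labels, y_labels):
    """Raise ValueError unless the inputs fit together as a heatmap.

    values:   2D data as a list of rows, each cell a float (double)
    x_labels: 1D axis entries, one per column (str or float)
    y_labels: 1D axis entries, one per row (str or float)
    """
    if len(values) != len(y_labels):
        raise ValueError("need exactly one y axis entry per row")
    for row in values:
        if len(row) != len(x_labels):
            raise ValueError("need exactly one x axis entry per column")
        if not all(isinstance(v, float) for v in row):
            raise ValueError("cell values must be doubles")
    for labels in (x_labels, y_labels):
        if not all(isinstance(entry, (str, float)) for entry in labels):
            raise ValueError("axis entries must be strings or doubles")
    return True


# Assumed example: 2 parameter values (rows) by 3 species (columns).
ok = check_heatmap(
    [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
    ["A", "B", "C"],
    [1.0, 2.0],
)
```

With explicit dimensions and data types on each dataGenerator, a check like this can run before any plotting code, so a mismatch is reported as a clear error instead of a broken plot.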