spine-tools / Spine-Toolbox

Spine Toolbox is an open source Python package to manage data, scenarios and workflows for modelling and simulation. You can have your local workflow, but work as a team through version control and SQL databases.
https://www.tools-for-energy-system-modelling.org/
GNU Lesser General Public License v3.0

New parameter type: dataframe like data format #2497

Open jkiviluo opened 9 months ago

jkiviluo commented 9 months ago

[Updated based on comment and discussion with @soininen --> separating two issues]

The title says most of it: we need a compact and fast dataframe-like data type. It should be able to hold array (1-d) or matrix (n-d) data efficiently, and it should also be able to hold the index headers (including time series). Moving around in the array/matrix needs to be fast, which basically means a fixed data element size (int, float, double, etc.). It should be indexable. Arrow would enable this; there is a separate but related issue: https://github.com/spine-tools/Spine-Toolbox/issues/2506. We could also implement Arrow just for the dataframe-like data, but then we would already be halfway to retiring JSON blobs altogether.
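To make the requirements concrete, here is a minimal sketch of what such a type could look like. It uses only the Python standard library (`array`, `bisect`) as a stand-in, not Arrow, and the class name `TypedTable` and its arguments are hypothetical, not part of Spine Toolbox or spinedb_api. It assumes each index header is kept sorted, with time stamps stored as ISO 8601 strings so that they sort lexicographically.

```python
from array import array
from bisect import bisect_left


class TypedTable:
    """Hypothetical n-d table: sorted index headers plus one flat, typed value buffer."""

    def __init__(self, index_names, index_values, values, typecode="d"):
        self.index_names = list(index_names)
        # One sorted list of labels per dimension (e.g. time stamps, node names).
        self.index_values = [list(labels) for labels in index_values]
        self.shape = tuple(len(labels) for labels in self.index_values)
        # Contiguous buffer with a fixed element size (typecode "d" is a C double).
        self.values = array(typecode, values)
        expected = 1
        for size in self.shape:
            expected *= size
        if len(self.values) != expected:
            raise ValueError(f"expected {expected} values, got {len(self.values)}")

    def _offset(self, labels):
        # Row-major offset: binary-search each label within its own dimension.
        offset = 0
        for dim_labels, label, size in zip(self.index_values, labels, self.shape):
            pos = bisect_left(dim_labels, label)
            if pos == size or dim_labels[pos] != label:
                raise KeyError(label)
            offset = offset * size + pos
        return offset

    def __getitem__(self, labels):
        if not isinstance(labels, tuple):
            labels = (labels,)  # allow table[label] for 1-d data
        return self.values[self._offset(labels)]


# Usage: a 2 x 3 matrix indexed by time stamp and node name.
table = TypedTable(
    index_names=["time", "node"],
    index_values=[["2024-01-01T00:00", "2024-01-01T01:00"], ["a", "b", "c"]],
    values=[1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
)
value = table["2024-01-01T01:00", "b"]  # second time step, second node
```

Because every element has the same size, a lookup is a binary search per dimension followed by a single offset computation. An Arrow-based implementation would presumably provide the same properties along with a standard memory layout.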

soininen commented 9 months ago

Are you proposing to add another binary format besides the JSON one? I am not convinced we need that. I would rather we did something like this:

jkiviluo commented 9 months ago

Forgot that we had this: https://github.com/spine-tools/Spine-Toolbox/issues/1992