Open soininen opened 4 months ago
Very good, I'm looking forward to seeing the results! I understand you plan to keep the 'public' API from `spinedb_api.parameter_value` but just change the internals, right?
> I understand you plan to keep the 'public' API from `spinedb_api.parameter_value` but just change the internals, right?
I am not planning to change `parameter_value` at all but add a new module next to it. I think we should leave `parameter_value` as-is for backwards compatibility if we ever make the full switch to Arrow.
The new module (`spinedb_api.arrow`?) should emulate the interface of `parameter_value`. I guess the most important functions would be `from_database()`, which returns an Arrow object, and `to_database()`, which converts an Arrow object to a binary blob.
Sounds good! But `ParameterValue` and its subclasses are also 'public' - do you think it's possible to implement them in arrow?
> But `ParameterValue` and its subclasses are also 'public' - do you think it's possible to implement them in arrow?
In fact, I am going to drop `ParameterValue` and just use the Arrow data types. I see no benefit in wrapping working data types in interfaces that do not offer any real improvements and can be considered niche. Client code can then work directly with standard Arrow API without the need to convert to/from `ParameterValue`.
(This issue is not about the binary blobs we have in the 'value' column of 'parameter_value' table in Spine database schema.)
We have been discussing using Apache Arrow as an alternative for the data structures in the `parameter_value` module. To get things rolling, I thought I could get my hands dirty with Arrow by implementing an equivalent to the `parameter_value` module which deals with Arrow tables instead of `TimeSeries`, `Maps` and whatnot. Initially, this will be more like a technological demo or proof-of-concept. Also, I am not planning to replace `parameter_value`, but rather to provide an alternative interface for parsing parameter values.