spine-tools / Spine-Database-API

Database interface to Spine generic data model
https://www.tools-for-energy-system-modelling.org/
GNU Lesser General Public License v3.0

Improve parsing performance by accepting only ISO 8601 time stamps #385

Closed soininen closed 2 months ago

soininen commented 2 months ago

We could gain a lot in performance if we replaced dateutil.parser.parse() with datetime.fromisoformat() in the parameter_value module.

The parameter_value module currently uses dateutil.parser to parse any string input. Its strength is that it can parse a wide variety of datetime formats. However, that flexibility also seems to make it extremely slow.

If I remember correctly, dateutil.parser was chosen for the job because, at that time, we were still supporting Python 3.7 which lacked datetime.fromisoformat(). It is available in Python 3.8, though, which is our current minimum Python version. fromisoformat() is lightning fast compared to dateutil.parser, though it supports only ISO 8601. Still, the potential performance gains from supporting a single time format should outweigh the loss of support for other formats.

soininen commented 2 months ago

I just realized we can always fall back to dateutil.parser.parse() if fromisoformat() fails, so we can stay backwards compatible.
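
A minimal sketch of that fallback could look like the following. The helper name `parse_timestamp` is hypothetical and only for illustration; it is not the actual code in the parameter_value module. The sketch assumes the input is always a string and that dateutil is installed for the slow path.

```python
from datetime import datetime


def parse_timestamp(text):
    """Parse a time stamp string, trying the fast ISO 8601 path first.

    Hypothetical helper for illustration; not the parameter_value API.
    """
    try:
        # Fast path: handles ISO 8601 strings such as "2023-01-15T12:30:00".
        return datetime.fromisoformat(text)
    except ValueError:
        # Slow path: keeps backwards compatibility with other formats.
        # Imported lazily so ISO-only input never touches dateutil.
        from dateutil import parser

        return parser.parse(text)
```

Note that before Python 3.11, fromisoformat() accepts only a subset of ISO 8601 (for example, it rejects a trailing "Z"), so such strings would take the slower dateutil path rather than fail.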

soininen commented 2 months ago

By the way, this also considerably speeds up parsing of Maps that have time stamps as indices.