closeio / ciso8601

Fast ISO8601 date time parser for Python written in C
MIT License

Add benchmarking for a performance lower bound #129

Closed: movermeyer closed 1 year ago

movermeyer commented 1 year ago

What are you trying to accomplish?

How fast can ciso8601 go? How close is it already to "optimal" performance?

We can see in the benchmarks that on the machine I test with, it takes ~130 nanoseconds (until #130 is merged) to parse a timestamp without time zone information.

But how fast can you go? 100 nanoseconds? 10 nanoseconds? 1 nanosecond?

What approach did you choose and why?

In order to get a sense for how fast ciso8601 could be, I've added a method that simply asks Python to initialize a datetime object using hardcoded values. This removes all parsing from the equation, and gives us an approximate lower bound for how fast ciso8601 can go. So long as we're using Python's constructors to initialize the datetime object, we will never be faster than this.
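To illustrate the idea, here is a rough pure-Python sketch of the same comparison. It is not the C method added in this PR: `hard_coded_timestamp`, `parse_timestamp`, and `ns_per_call` are hypothetical names, and the standard library's `datetime.fromisoformat` stands in for the parser so the snippet runs without ciso8601 installed. Calling the constructor from Python also includes interpreter call overhead, so the absolute numbers will differ from the C-level figures reported here.

```python
import timeit
from datetime import datetime


def hard_coded_timestamp():
    # Hypothetical analogue of the lower-bound method: no parsing at all,
    # just construct a datetime from hardcoded values.
    return datetime(2014, 1, 9, 21, 48, 0)


def parse_timestamp():
    # Stand-in parser (assumption): the stdlib parser, not ciso8601.
    return datetime.fromisoformat("2014-01-09T21:48:00")


def ns_per_call(func, number=200_000, repeat=5):
    # Take the best of several runs to reduce noise from other processes,
    # then convert seconds per batch into nanoseconds per call.
    best = min(timeit.repeat(func, number=number, repeat=repeat))
    return best / number * 1e9


if __name__ == "__main__":
    floor = ns_per_call(hard_coded_timestamp)
    parsed = ns_per_call(parse_timestamp)
    print(f"hard-coded construction: {floor:.1f} ns")
    print(f"parse + construction:    {parsed:.1f} ns")
    print(f"ratio to lower bound:    {parsed / floor:.2f}x")
```

The ratio printed at the end is the quantity of interest: how much of the total time goes to parsing rather than to building the object.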

I'm not yet exposing the results in the README (and perhaps never will), but would like to eventually include them in a new profiling document I'm drafting.

At any rate, here are the results on my machine:

| Method | Time taken |
| --- | --- |
| `_hard_coded_benchmark_timestamp` | 61.5 ns |
| `parse_datetime('2014-01-09T21:48:00')` (v2.2.0) | 130 ns |

So we can see that ciso8601 v2.2.0 is within a factor of about 2.1 of the lower bound (130 ns / 61.5 ns ≈ 2.1).

(With #130, this factor is much tighter 😏)

What should reviewers focus on?

Any complaints about the name? I started it with an underscore to further indicate that one should not be using this / it's for internal use.

The impact of these changes

We can now collect benchmark data for this lower bound. Nothing is user-facing, unless users go looking at undocumented internal methods.