jannikmi / timezonefinder

python package for finding the timezone of any point on earth (coordinates) offline
https://timezonefinder.michelfe.it/
MIT License

improve initialisation time #189

Open jannikmi opened 1 year ago

jannikmi commented 1 year ago

Currently, the initialisation time of a TimezoneFinder instance is around 0.7 seconds.

This is probably mostly due to reading the polygon index file into memory. The goal is to find ways of reducing the startup time without affecting the query time.
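To make the startup cost measurable, a small timing helper along these lines could be used. This is only a sketch: `time_init` is a hypothetical name, not part of timezonefinder, and it accepts any zero-argument callable so that it does not depend on the package being installed.

```python
import time


def time_init(factory, repeats=3):
    """Call factory() `repeats` times and return each wall-clock duration in seconds.

    The first entry is usually the closest to a cold start; later runs may
    benefit from the operating system's file cache.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        factory()  # e.g. TimezoneFinder, which loads its data files on construction
        timings.append(time.perf_counter() - start)
    return timings
```

With the package installed, `time_init(TimezoneFinder)` after `from timezonefinder import TimezoneFinder` should report the figure discussed here, though the numbers will vary with disk throughput.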

rosenbergj commented 11 months ago

Data point: I upgraded my script from timezonefinder 5.2.0 to 6.2.0, and execution time in AWS Lambda went from about 350ms to between 8 and 8.5 seconds. I can't say for sure that it was initialization time at fault, but the problem went away when I downgraded back to 5.2.0.

jannikmi commented 11 months ago

That sounds like a lot, but it could be due to low disk read throughput.

Have you checked that you are reusing your TimezoneFinder instance? (Cf. https://github.com/jannikmi/timezonefinder/issues/176)

rosenbergj commented 11 months ago

Yeah, I don't know enough about Lambda to know why it's so much worse, but it makes sense that it might be more IO-bound than other environments when doing big reads like this. I am reusing my TimezoneFinder instance (or, more accurately, I'm only using it once).

jannikmi commented 11 months ago

At the very least, you could make sure to reuse your instance between consecutive Lambda invocations (e.g. by keeping it in a global variable).
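A minimal sketch of that reuse pattern, under some assumptions: the handler name `handler`, the helper `get_finder`, and the `lat`/`lng` keys in the event are all hypothetical and chosen for illustration. The instance is created on first use and cached at module level, so warm invocations in the same execution environment skip the initialisation.

```python
import functools


@functools.lru_cache(maxsize=1)
def get_finder():
    """Create the TimezoneFinder once and reuse it on later calls."""
    # Imported lazily so the import and data loading happen on first use only.
    from timezonefinder import TimezoneFinder

    return TimezoneFinder()


def handler(event, context):
    # Assumed event shape: {"lat": <float>, "lng": <float>}
    finder = get_finder()
    timezone = finder.timezone_at(lng=event["lng"], lat=event["lat"])
    return {"timezone": timezone}
```

This does not help on a cold start, where the instance still has to be built from scratch.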

Would you be willing to look into the improvement of the initialisation time?

rosenbergj commented 11 months ago

This is a very low traffic personal project, so using a global variable wouldn't help much (most executions are cold starts), and I don't want to use provisioned concurrency (even though it would only be a few dollars a month, the project currently costs about a penny per month). I'm not a skilled enough library developer to help with this issue, but I'd happily test any new versions.