stub42 / pytz

pytz Python historical timezone library and database
MIT License

Inefficient use of LazyList with pytz.timezone #88

Closed by eendebakpt 1 year ago

eendebakpt commented 2 years ago

Importing pytz on my system is quite fast thanks to the use of lazy lists, but as soon as a single timezone is looked up, the entire list is processed anyway. A minimal example:

import time

t0 = time.perf_counter()
import pytz
dt = time.perf_counter() - t0; print(f'import {1e3*dt:.2f} ms')

zone = 'Europe/Berlin'
v = pytz.timezone(zone)
# dt is still measured from t0, so this figure includes the import time
dt = time.perf_counter() - t0; print(f'timezone {1e3*dt:.2f} ms')

This produces the following output:

import 14.33 ms
timezone 189.51 ms

The reason is the following line from __init__.py:

all_timezones = LazyList((tz for tz in all_timezones if resource_exists(tz)))

all_timezones itself is lazy, but the call to pytz.timezone ends up executing this piece of code:

def _case_insensitive_zone_lookup(zone):
    """case-insensitively matching timezone, else return zone unchanged"""
    global _all_timezones_lower_to_standard
    if _all_timezones_lower_to_standard is None:
        _all_timezones_lower_to_standard = dict((tz.lower(), tz) for tz in all_timezones)  # noqa
    return _all_timezones_lower_to_standard.get(zone.lower()) or zone  # noqa

This results in the entire lazy list being processed.
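To illustrate the effect in isolation, here is a small sketch that does not use pytz at all. TinyLazyList and slow_exists are hypothetical stand-ins (for LazyList and resource_exists respectively), and the three zone names are just sample data. It only shows that building a dict from a lazily filtered sequence runs the filter for every element:

```python
from typing import Iterable, Iterator, List, Optional


class TinyLazyList:
    """Hypothetical stand-in for a lazy list: nothing is evaluated until first iteration."""

    def __init__(self, source: Iterable[str]) -> None:
        self._source = source
        self._cache: Optional[List[str]] = None

    def __iter__(self) -> Iterator[str]:
        if self._cache is None:
            # The whole underlying generator is consumed on first use.
            self._cache = list(self._source)
        return iter(self._cache)


checked: List[str] = []  # records every name passed to the (pretend) expensive check


def slow_exists(name: str) -> bool:
    """Hypothetical stand-in for an expensive per-zone existence check."""
    checked.append(name)
    return True


names = ["Europe/Berlin", "Europe/Paris", "UTC"]
zones = TinyLazyList(tz for tz in names if slow_exists(tz))
checks_after_construction = len(checked)  # nothing has been checked yet

# Building the lower-case lookup table iterates the lazy list in full.
lower_map = {tz.lower(): tz for tz in zones}
checks_after_lookup_table = len(checked)  # one check per zone name
```

In this sketch the cost is deferred from construction to the first lookup, but it is not avoided.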

Can the pytz code be refactored so that the call to pytz.timezone is more efficient?
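One possible direction, as a rough sketch only: try the zone name exactly as given first, and build the case-insensitive table only when that fails. The ZoneResolver class below is hypothetical and not part of pytz; the exists callback stands in for whatever check pytz would use, and I have not verified how this would fit into the actual pytz code:

```python
from typing import Callable, Dict, Iterable, List, Optional


class ZoneResolver:
    """Hypothetical helper (not pytz API): exact match first, case-insensitive fallback."""

    def __init__(self, candidates: Iterable[str], exists: Callable[[str], bool]) -> None:
        self._candidates: List[str] = list(candidates)  # names only, assumed cheap
        self._exists = exists                           # assumed expensive per name
        self._lower_map: Optional[Dict[str, str]] = None
        self.checks = 0                                 # number of expensive checks made

    def _check(self, name: str) -> bool:
        self.checks += 1
        return self._exists(name)

    def resolve(self, zone: str) -> str:
        # Fast path: a correctly cased name costs a single check.
        if self._check(zone):
            return zone
        # Slow path: build the case-insensitive table once, on the first miss.
        if self._lower_map is None:
            self._lower_map = {
                tz.lower(): tz for tz in self._candidates if self._check(tz)
            }
        # Same fallback as the original: return the input unchanged if unknown.
        return self._lower_map.get(zone.lower()) or zone


available = {"Europe/Berlin", "Europe/Paris", "UTC"}  # sample data
resolver = ZoneResolver(sorted(available), available.__contains__)

exact = resolver.resolve("Europe/Berlin")
checks_after_exact = resolver.checks      # only the fast path ran

folded = resolver.resolve("europe/paris")
checks_after_folded = resolver.checks     # fast-path miss plus one check per candidate

unknown = resolver.resolve("Nowhere/Land")
```

With this shape, the common case of a correctly spelled zone would never touch the full list, while misspelled-case lookups would still pay the full cost once.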

@stub42