daviddrysdale / python-phonenumbers

Python port of Google's libphonenumber
Apache License 2.0

Dramatic import performance regression in Python 3.12+ when debugging #294

Open matthew-mcallister opened 5 months ago

matthew-mcallister commented 5 months ago

Python 3.12+ has a major performance regression when debugging, which affects this library (latest version, 8.13.34) particularly badly: https://github.com/python/cpython/issues/107674

Repro of issue:

# Run with python -m pdb repro.py
# Type n to execute the import with an active breakpoint.
from phonenumbers import geocoder
print('hi')

When debugging under Python 3.12 or 3.13, importing the geocoder module takes multiple minutes, because the interpreter has to construct many very large dictionaries. This makes it impractical to run the debugger on projects that use this library under newer Python versions.
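For anyone who wants a rough feel for the overhead without installing the library, here is a minimal sketch that times the execution of one large synthetic dict literal with and without a trace function installed. It is not the repro from the linked CPython issue: `build_source` and `time_exec` are hypothetical helpers, the dict contents are made up, and a no-op `sys.settrace` callback only approximates what pdb does while single stepping.

```python
import sys
import time


def build_source(n_entries):
    """Return source for one large dict literal (synthetic stand-in for a
    generated data module; not the library's real data)."""
    lines = ["DATA = {"]
    for i in range(n_entries):
        lines.append(f"    '{i}': 'Region {i}',")
    lines.append("}")
    return "\n".join(lines)


def time_exec(code, trace=False):
    """Execute a compiled module body and return (seconds, DATA dict).

    With trace=True a no-op trace function is installed first, so the
    interpreter emits call/line events for the executed code.
    """
    def tracer(frame, event, arg):
        # Returning the tracer keeps line-level tracing on for this frame.
        return tracer

    namespace = {}
    previous = sys.gettrace()  # restore whatever was installed before
    if trace:
        sys.settrace(tracer)
    try:
        start = time.perf_counter()
        exec(code, namespace)
        elapsed = time.perf_counter() - start
    finally:
        sys.settrace(previous)
    return elapsed, namespace["DATA"]


if __name__ == "__main__":
    code = compile(build_source(200_000), "<synthetic>", "exec")
    plain, _ = time_exec(code)
    traced, _ = time_exec(code, trace=True)
    print(f"untraced: {plain:.3f}s  traced: {traced:.3f}s")
```

The ratio between the two timings will vary by Python version and machine, so comparing the same script across 3.11 and 3.12+ is more informative than any single number.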

It's debatable whether this is an interpreter issue or a library issue, but the problem could be worked around by loading these large dicts from JSON files instead of importing them as Python files, as this would utilize the fast C-based JSON parser. Or these large dictionaries could be loaded as needed at runtime instead of all at once at import time.
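To make the two suggestions concrete, here is a minimal sketch that combines them: a read-only mapping that parses a JSON file on first access rather than at import time. `LazyJsonMapping` and the file layout are assumptions for illustration only; this is not how the library currently stores or loads its data.

```python
import json
from collections.abc import Mapping


class LazyJsonMapping(Mapping):
    """Read-only mapping backed by a JSON file, parsed on first access.

    Hypothetical helper: the path and the JSON layout (one object per file)
    are assumptions, not part of python-phonenumbers.
    """

    def __init__(self, path):
        self._path = path
        self._data = None  # nothing is read until a lookup happens

    def _load(self):
        if self._data is None:
            with open(self._path, encoding="utf-8") as fh:
                # json.load uses the C accelerator when it is available.
                self._data = json.load(fh)
        return self._data

    @property
    def loaded(self):
        """True once the file has been parsed."""
        return self._data is not None

    def __getitem__(self, key):
        return self._load()[key]

    def __iter__(self):
        return iter(self._load())

    def __len__(self):
        return len(self._load())
```

With this shape, constructing the mapping at import time costs almost nothing, and the parse happens once, on the first lookup. One caveat: JSON object keys are always strings, so data keyed by integers or tuples would need a conversion step.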

daviddrysdale commented 5 months ago

Can you give a minimal repro scenario?

> It's debatable whether this is an interpreter issue or a library issue

If this is only happening in Python 3.12+, and the relevant code in the library hasn't changed for over 10 years, it's probably not that debatable.

matthew-mcallister commented 5 months ago

> Can you give a minimal repro scenario?

My bad, the repro was in the linked issue. I've updated the issue description.

> If this is only happening in Python 3.12+, and the relevant code in the library hasn't changed for over 10 years, it's probably not that debatable.

Obviously this is a huge performance regression in CPython that is actively being worked on. Hopefully there is a substantial improvement in 3.13 or even 3.14. That said, given the need to support the new sys.monitoring API, it is not entirely obvious (to me, a non-expert) when or by how much it will improve.

At the very least, having an issue open benefits libphonenumber users who discover this problem that is affecting their import speed and want to track its status.

daviddrysdale commented 5 months ago

> My bad, the repro was in the linked issue. I've updated the issue description.

Thanks. It seems like the slow-down only occurs when single stepping with n, but doesn't occur with c?

> At the very least, having an issue open benefits libphonenumber users who discover this problem that is affecting their import speed and want to track its status.

Makes sense, I'll leave this open.