arthurdejong / python-stdnum

A Python library to provide functions to handle, parse and validate standard numbers.
https://arthurdejong.org/python-stdnum/
GNU Lesser General Public License v2.1
499 stars 208 forks source link

Resolve `get_soap_client` (and its callers?) more eagerly? #447

Closed xmo-odoo closed 2 weeks ago

xmo-odoo commented 2 months ago

I can take a crack at it if youre interested but I figured I'd ask first: while python caches modules it does not cache import errors (nor can it optimise away raise/except), so if the user has something other than a zeep with support for Transport and CachingClient then on every new (wsdlurl, timeout) get_soap_client has to go through the entire thing trying to import modules it doesn't have until it finally gets to the one it does have (or errors but I assume that only happens once).

As such I think it would be more efficient, and possibly more readable, to resolve the imports at import time and create different get_soap_client functions depending on which modules are available (possibly delegating that to a helper so the caching doesn't have to be reimplemented every time).

Also does stdnum still intend to preserve 2.7 compatibility? Python 2.7 was EOL'd in January 2020. That would allow removing the Python2-style getproxies case, and replacing the ad-hoc cache by a simpler functools.lru_cache (maybe optionally functools.cache).

arthurdejong commented 2 weeks ago

Hi @xmo-odoo,

Thanks for the offer. I would expect that there would not be that many combinations of (wsdlurl, timeout) where this would really be an issue.

I would like to avoid trying to import zeep at import time of the stdnum.util module because that module is loaded by almost all number format modules which means zeep (or whatever library) is imported even if you only want to do some simple validation which doesn't require SOAP. That is also why, in general, only system Python modules are imported at the top of the file and third-party imports are imported within functions.

Btw, the cache is not used if neither of Zeep, Suds or PySimpleSOAP is found but I highly doubt someone would want to check that repeatedly.

I plan to drop Python 2.7 support in the beginning of 2025 which should have given most organisations enough time to do the transition. I plan to remove the Python 2.7 compatibility code then. I'm also considering type hints (see #433).

Btw in 3fcebb2 I did some refactoring of the get_soap_client() function to make it a little simpler. I also dropped the case where CachingClient is not available because that has been available for a long time and even my old Python 2.7 systems have it.

But if you have suggestions for improvements feel free to make a PR. I honestly have very limited time available so reviewing/merging those kind of changes are on a best-effort basis.

xmo-odoo commented 2 weeks ago

I would like to avoid trying to import zeep at import time of the stdnum.util module because that module is loaded by almost all number format modules which means zeep

That does make sense.

Btw, the cache is not used if neither of Zeep, Suds or PySimpleSOAP is found but I highly doubt someone would want to check that repeatedly.

Sorry if I was not clear, by "cache" here I meant Python's own module cache. That is if zeep is not installed for instance then every get_soap_client call with a new timeout will re-try importing it, fail, and then fall back to suds.

This is a behaviour which normally makes sense, but when doing lazy conditional / fallback imports it can be less than great.

Btw in https://github.com/arthurdejong/python-stdnum/commit/3fcebb2961b0b84b6008a0b27990ace757872e10 I did some refactoring of the get_soap_client() function to make it a little simpler. I also dropped the case where CachingClient is not available because that has been available for a long time and even my old Python 2.7 systems have it.

Oh yeah I'd missed that. If the cases are already split out then it would make caching the import error much easier, although it may not be that useful if zeep is the only supported client anyway.