Open ainamdar-ag opened 2 months ago
Thanks for the report. I agree that we need to resolve this.
Would you do this on module level or class (e.g. Works()) level?
I would implement it at the Base class level, mainly to benefit cursors and repeated filters. Module level would be nice to have but is not the most effective option, and it might even add more complexity.
I'm not sure what the most common usage pattern is, but I imagine the classes are instantiated only once or a few times.
My use case is mostly just paging through Works() with a specific filter and then using Authors() and Institutions() to fetch the entities referenced in the Works results.
In addition, I think a way to disable SSL verification for local execution would also help, since many organisations have VPNs or proxies that use self-signed certificates. Maybe just pass verify=False through to the session.
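For illustration, here is a minimal sketch of what such a pass-through could look like with plain `requests`. The helper name `make_session` and its `verify` parameter are hypothetical and not part of pyalex; only `requests.Session` and its `verify` attribute are real.

```python
import requests


def make_session(verify=True):
    """Hypothetical helper: build a session with configurable TLS verification."""
    session = requests.Session()
    # Session.verify is applied to every request sent through this session.
    # False skips certificate checks; a path to a CA bundle is also accepted.
    session.verify = verify
    return session


# Local runs behind a proxy with a self-signed certificate (assumption).
session = make_session(verify=False)
```

Pointing `verify` at the organisation's CA bundle is the safer choice where that is possible, since `verify=False` turns off certificate checking entirely.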
A question regarding the requests session: it seems that a new session is created for each new request, based on https://github.com/J535D165/pyalex/blob/main/pyalex/api.py#L96
It might be better to create this session once per entity (to keep it simple); ideally, a single shared (singleton) session would be even nicer.
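As a rough sketch of the shared-session idea, assuming a base class that all entity classes inherit from: the `BaseEntity` class and `get_session` method below are hypothetical stand-ins, not the actual pyalex code.

```python
import requests


class BaseEntity:
    """Hypothetical stand-in for the library's base class."""

    _session = None  # one session shared by every subclass

    @classmethod
    def get_session(cls):
        # Create the session lazily on first use, then reuse it.
        if BaseEntity._session is None:
            BaseEntity._session = requests.Session()
        return BaseEntity._session


class Works(BaseEntity):
    pass


class Authors(BaseEntity):
    pass


# Both entity classes end up with the same underlying session.
shared = Works.get_session() is Authors.get_session()
```

Storing the session on each subclass instead (`cls._session`) would give the simpler once-per-entity variant.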
Why am I asking? Profiling the Python code suggests that a lot of CPU time is spent on the SSL/TLS connection handshake, and one suggestion on the web is to use a connection pool or a session with the requests lib.
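To make the suggestion concrete, here is a small sketch of a pooled session using `requests` with an `HTTPAdapter`. The function name `build_pooled_session` and the pool size are assumptions for illustration; nothing here is taken from pyalex.

```python
import requests
from requests.adapters import HTTPAdapter


def build_pooled_session(pool_size=10):
    """Hypothetical helper: a session whose HTTPS connections are pooled."""
    session = requests.Session()
    adapter = HTTPAdapter(pool_connections=pool_size, pool_maxsize=pool_size)
    # All https:// requests made through this session use the adapter's pool,
    # so repeated calls to the same host can reuse an open TLS connection.
    session.mount("https://", adapter)
    return session


session = build_pooled_session(pool_size=20)
```

Reusing one such session while paging means the handshake should only be paid when a connection is first opened, rather than on every request.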