aio-libs / aiohttp

Asynchronous HTTP client/server framework for asyncio and Python
https://docs.aiohttp.org

Clearing the cookie jar is unexpectedly expensive #8575

Open bdraco opened 1 month ago

bdraco commented 1 month ago

Describe the bug

clear_cookie_jar

To Reproduce

I think the source of the trace is python-kasa, since it clears the cookie jar on every connection.

Expected behavior

clear should be fast

Logs/tracebacks

n/a

Python Version

n/a

aiohttp Version

3.10.0

multidict Version

n/a

yarl Version

n/a

OS

Linux

Related component

Client

Additional context

No response

bdraco commented 1 month ago

It's the 350000 calls to .items()

bdraco commented 1 month ago

Not a regression, but I noticed it while investigating another issue.

Dreamsorcerer commented 4 days ago

Is this because they use a predicate? It's not clear how much we could improve it if they want to use a predicate.

bdraco commented 4 days ago

Yes, they use a predicate. I usually stew on these types of issues for a bit until I come up with a better algorithm to reduce the overhead. Sometimes they never get solved, but often it just takes me a long time to reach a solution.

bdraco commented 2 days ago

heapq is probably the solution... but those types of constructs are always a bit painful to get right.
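One possible reading of the heapq idea, sketched on a toy store rather than on aiohttp's CookieJar: keep a min-heap of (expires_at, key) pairs so that expiring entries only touches the ones that are actually due, instead of scanning everything. The ExpiringStore class and its method names are hypothetical and exist only for this illustration. The painful part is probably the stale heap entries left behind when a key is overwritten, which the sketch handles by lazy deletion.

```python
import heapq


class ExpiringStore:
    """Toy key/value store with expiry times (hypothetical, not aiohttp's API)."""

    def __init__(self):
        self._items = {}  # key -> (value, expires_at)
        self._heap = []   # (expires_at, key); may hold stale entries

    def set(self, key, value, expires_at):
        # Overwriting a key leaves its old heap entry behind; expire() skips it.
        self._items[key] = (value, expires_at)
        heapq.heappush(self._heap, (expires_at, key))

    def expire(self, now):
        """Remove every entry whose expiry time is <= now; return removed keys."""
        removed = []
        while self._heap and self._heap[0][0] <= now:
            expires_at, key = heapq.heappop(self._heap)
            current = self._items.get(key)
            # Only delete if this heap entry still matches the live entry.
            if current is not None and current[1] == expires_at:
                del self._items[key]
                removed.append(key)
        return removed
```

With this layout, a call to expire() costs time proportional to the number of due (or stale) entries rather than to the total size of the store. It does not help a predicate that has to inspect every entry.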

Dreamsorcerer commented 2 days ago

The predicate would still need to run against every morsel, right?

The only way I see of reducing that is to completely change the predicate API, so that it can do something like skip an entire (domain, path) key without checking the individual morsels. It would probably also need a good idea of whether the majority of cookies will be kept or deleted. If the latter, it can just store the ones to keep, clear the cookie jar, and re-add them, instead of deleting all the other cookies one by one.
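A minimal sketch of both ideas, assuming a plain dict keyed by (domain, path) in place of the real cookie jar. The function clear_by_rebuild and its delete_key and delete_cookie parameters are hypothetical names for illustration, not part of aiohttp. A key-level predicate drops a whole group without looking at its cookies, and the survivors are collected and re-added after one bulk clear instead of deleting the rest one by one.

```python
from typing import Callable, Dict, Tuple

Key = Tuple[str, str]             # (domain, path)
Jar = Dict[Key, Dict[str, str]]   # key -> {cookie name: value}


def clear_by_rebuild(
    jar: Jar,
    delete_key: Callable[[Key], bool],
    delete_cookie: Callable[[Key, str, str], bool],
) -> None:
    """Delete matching cookies by rebuilding the jar from the survivors."""
    kept: Jar = {}
    for key, cookies in jar.items():
        # Key-level check: skip the whole group without touching its cookies.
        if delete_key(key):
            continue
        survivors = {
            name: value
            for name, value in cookies.items()
            if not delete_cookie(key, name, value)
        }
        if survivors:
            kept[key] = survivors
    # One bulk clear plus re-add, rather than many individual deletions.
    jar.clear()
    jar.update(kept)
```

For example, dropping everything for one domain and a single named cookie elsewhere:

```python
jar = {
    ("a.example", "/"): {"x": "1", "y": "2"},
    ("b.example", "/"): {"z": "3"},
}
clear_by_rebuild(
    jar,
    delete_key=lambda key: key[0] == "b.example",
    delete_cookie=lambda key, name, value: name == "x",
)
```

The per-cookie predicate still runs against every cookie in groups that are not skipped, so the gain depends on how much work the key-level check can rule out.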