beeware / gbulb

GLib implementation of PEP 3156

100% CPU usage when using httpx #117

Open haael opened 10 months ago

haael commented 10 months ago

Describe the bug

When doing (several) requests with httpx, the CPU usage goes beyond 100%, possibly proportional to the number of parallel requests. After the last request is finished, CPU usage remains at 100%. The requests are successful otherwise and the data gets delivered.

The problem does not occur when using httpx outside gbulb.
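To quantify the symptom without watching an external CPU monitor, a small stdlib-only helper along these lines can be used (`idle_cpu_fraction` is a hypothetical name, not part of the original report):

```python
import asyncio
import time

async def idle_cpu_fraction(seconds: float = 1.0) -> float:
    """Fraction of CPU time consumed while the loop is supposedly idle.

    Close to 0.0 means the event loop sleeps properly; close to 1.0
    means it is spinning, as described in this report.
    """
    cpu_start = time.process_time()
    wall_start = time.monotonic()
    await asyncio.sleep(seconds)
    cpu_used = time.process_time() - cpu_start
    wall_used = time.monotonic() - wall_start
    return cpu_used / wall_used

if __name__ == "__main__":
    # On a healthy event loop this prints a value near 0.00.
    print(f"idle CPU fraction: {asyncio.run(idle_cpu_fraction()):.2f}")
```

Running this after the requests complete (on the gbulb loop) should make the reported spin visible as a fraction near 1.0.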

Steps to reproduce

import httpx
def main(url):
    client = httpx.AsyncClient()
    result = await client.get(url)
    print(result.content)
    await client.aclose()

Expected behavior

CPU usage at 0% when idle.

Screenshots

No response

Environment

Ubuntu, Python 3.11

Logs

Additional context

No response

freakboy3742 commented 10 months ago

Thanks for the report - unfortunately, I can't reproduce this failure.

Firstly, as with #116, it looks like you've trimmed a little too much from your reproduction example - it can't work as written because main isn't declared async. I had to modify it to read:

import asyncio
import httpx

async def main(url):
    client = httpx.AsyncClient()
    result = await client.get(url)
    print(result.content)
    await client.aclose()
    await asyncio.sleep(100)

if __name__ == "__main__":
    asyncio.run(main("https://www.wikipedia.org/"))

(the sleep at the end of the method is to expose the CPU load after the HTTP run)

Even then, on Python 3.11 on Fedora 38, it works fine for me on multiple runs.

The one detail you didn't include in your bug report was the version of gbulb you are using; is it possible you're running an older gbulb version that didn't have Python 3.11 support?
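(For anyone else hitting this: one quick way to capture the installed gbulb version for a report is via the stdlib's `importlib.metadata` - a generic sketch, not something the maintainer asked for specifically:)

```python
from importlib.metadata import PackageNotFoundError, version

try:
    # Reports the installed gbulb distribution version, if any.
    print("gbulb", version("gbulb"))
except PackageNotFoundError:
    print("gbulb is not installed")
```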

freakboy3742 commented 10 months ago

Note to self: don't review tickets before the coffee kicks in. I just realised I didn't test this against gbulb - I was just running it in a virtual environment that had gbulb installed.

Apologies for the confusion; I'll take a second look.

freakboy3742 commented 10 months ago

Ok - I can now confirm I can reproduce this. I see it on both Python 3.10 and Python 3.11.

The test case is:

import asyncio
import httpx
import gbulb

async def main(url):
    client = httpx.AsyncClient()
    result = await client.get(url)
    print(result.content)
    await client.aclose()
    await asyncio.sleep(100)

if __name__ == "__main__":
    gbulb.install()
    asyncio.run(main("https://www.wikipedia.org/"))

freakboy3742 commented 9 months ago

See also #45.

freakboy3742 commented 5 months ago

I've finally found some time to investigate this; the issue appears to emerge as a result of closing the connection. If you remove the await client.aclose() call from the sample code above, the CPU doesn't thrash.

Sun-ZhenXing commented 2 months ago

I encountered the same issue on Windows, without using the asynchronous APIs. It occurs when a request is large and takes a long time, and it only appears on Python 3.11 and above.

So this may not be an issue with asyncio, it could be related to the implementation of httpx?

freakboy3742 commented 2 months ago

Interesting... I guess it's possible it could be httpx... (or httpx's usage of asyncio). However, I've been able to reproduce the problem without the request being large or slow - wikipedia.org (which is the test case here) isn't that big, and it responds fairly quickly.

The good news is that, at least as far as gbulb is concerned, a fix is on the horizon: PyGObject has just merged asyncio support, so as of the next PyGObject release, gbulb won't be required any more - and in my previous testing, the CPU usage problem doesn't exist with the native PyGObject asyncio integration.