sonic182 / aiosonic

A very fast Python asyncio http client
https://aiosonic.readthedocs.io/en/latest/
MIT License
152 stars 19 forks

Sometimes throws HttpParsingError #262

Closed Sunda001 closed 3 years ago

Sunda001 commented 3 years ago

So I make a request to an API, but it sometimes raises these errors:

2021-06-27T12:02:51.478515+00:00 app[worker.1]:     return await self.request(url,
2021-06-27T12:02:51.478515+00:00 app[worker.1]:   File "/app/.heroku/python/lib/python3.9/site-packages/aiosonic/__init__.py", line 738, in request
2021-06-27T12:02:51.478516+00:00 app[worker.1]:     response = await wait_for(
2021-06-27T12:02:51.478516+00:00 app[worker.1]:   File "/app/.heroku/python/lib/python3.9/asyncio/tasks.py", line 481, in wait_for
2021-06-27T12:02:51.478516+00:00 app[worker.1]:     return fut.result()
2021-06-27T12:02:51.478516+00:00 app[worker.1]:   File "/app/.heroku/python/lib/python3.9/site-packages/aiosonic/__init__.py", line 433, in _do_request
2021-06-27T12:02:51.478518+00:00 app[worker.1]:     async with (await connector.acquire(*args)) as connection:
2021-06-27T12:02:51.478518+00:00 app[worker.1]:   File "/app/.heroku/python/lib/python3.9/site-packages/aiosonic/connectors.py", line 59, in acquire
2021-06-27T12:02:51.478518+00:00 app[worker.1]:     raise HttpParsingError('missing hostname')
2021-06-27T12:02:51.478518+00:00 app[worker.1]: aiosonic.exceptions.HttpParsingError: missing hostname

aiosonic version: 0.10.1

sonic182 commented 3 years ago

A "missing hostname" exception, mmm... That means it is not getting a hostname when parsing the URL. It parses the URL with this function: https://github.com/sonic182/aiosonic/blob/a04f611da2a584fcbaac9fd8b7ebab2e1d1c7437/aiosonic/__init__.py#L71 Can you check the URL you're using to call the request?
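
For context, a "missing hostname" failure is what one would expect when the string passed as the URL is empty or has no scheme. The sketch below assumes the linked function behaves like the standard library's `urllib.parse.urlparse` (an assumption, not checked against aiosonic's source); `hostname_of` is a hypothetical helper used only for illustration.

```python
from typing import Optional
from urllib.parse import urlparse


def hostname_of(url: str) -> Optional[str]:
    """Return the hostname urlparse extracts, or None if there is none."""
    return urlparse(url).hostname


# A full URL yields a hostname.
print(hostname_of("https://postman-echo.com/post"))
# Without a scheme, the whole string is treated as a path, so there is no hostname.
print(hostname_of("postman-echo.com/post"))
# An empty string has no hostname either.
print(hostname_of(""))
```

If the error only shows up intermittently, checking whether the URL variable is ever empty or scheme-less at call time is a cheap first step.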

Sunda001 commented 3 years ago

Hmm, sometimes it works fine but sometimes it throws that error. This kind of error came up all the time in aiosonic 0.9.7. It is less frequent since I upgraded to the latest version, but it still happens sometimes.

Sunda001 commented 3 years ago

The URL params contain '[]', like this: params={'foo': '[this]'}. Maybe that could be the issue?

sonic182 commented 3 years ago

I don't think so. I'm gonna add some debug logs in the next version (whenever I get time) so it is easier to debug this kind of error.

Try to log the arguments you're using to call the request please :)
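
On the '[]' question, a quick standard-library check suggests brackets in the query string do not affect hostname extraction. This is only a sketch: `example.com` is a placeholder host, and it assumes parsing equivalent to `urllib.parse.urlparse`, which may differ from what aiosonic does internally.

```python
from urllib.parse import urlencode, urlparse

params = {"foo": "[this]"}

# urlencode percent-encodes the brackets in the value.
query = urlencode(params)
encoded_url = f"https://example.com/api?{query}"

# The same query with the brackets left unencoded.
raw_url = "https://example.com/api?foo=[this]"

print(query)
print(urlparse(encoded_url).hostname)
print(urlparse(raw_url).hostname)
```

In both cases the hostname is taken from the part of the URL before the query, so the brackets never reach it.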

Sunda001 commented 3 years ago

what do you mean by 'log the arguments' ?

sonic182 commented 3 years ago

# assumes `logger` and `client` (an aiosonic HTTPClient) already exist
async def something():
    url = "https://postman-echo.com/post"
    posted_data = {'foo': 'bar'}
    # log the arguments right before the request
    logger.info(f'url={url} posted_data={posted_data}')
    response = await client.post(url, data=posted_data)

Sunda001 commented 3 years ago

ah okay

Sunda001 commented 3 years ago

now i get it again... here is the full traceback and log of the url

url=https://vm.tiktok.com/ZSJVWrMAB/
Traceback (most recent call last):
  File "/app/tiktokDL.py", line 98, in _redr
    res = await tt_tools.request.get(url, headers={'User-Agent': self.user_agent})
  File "/app/.heroku/python/lib/python3.9/site-packages/aiosonic/__init__.py", line 563, in get
    return await self.request(url,
  File "/app/.heroku/python/lib/python3.9/site-packages/aiosonic/__init__.py", line 743, in request
    response = await wait_for(
  File "/app/.heroku/python/lib/python3.9/asyncio/tasks.py", line 481, in wait_for
    return fut.result()
  File "/app/.heroku/python/lib/python3.9/site-packages/aiosonic/__init__.py", line 459, in _do_request
    response._set_response_initial(
  File "/app/.heroku/python/lib/python3.9/site-packages/aiosonic/__init__.py", line 158, in _set_response_initial
    raise HttpParsingError('response line parsing error')
aiosonic.exceptions.HttpParsingError: response line parsing error

On the second try it works, but sometimes it throws that error again.

aiosonic version: 0.11.0
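
Since a second attempt usually succeeds, one possible stopgap while the root cause is investigated is to retry the call. The helper below is a hypothetical sketch, not part of aiosonic: `retry` and its parameters are made up for illustration, and it hides the symptom rather than fixing it.

```python
import asyncio
from typing import Awaitable, Callable, Tuple, Type, TypeVar

T = TypeVar("T")


async def retry(
    call: Callable[[], Awaitable[T]],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    delay: float = 0.0,
) -> T:
    """Await call() up to `attempts` times, retrying only on `retry_on`."""
    for attempt in range(1, attempts + 1):
        try:
            return await call()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            await asyncio.sleep(delay)
    # Only reached when the loop never ran.
    raise ValueError("attempts must be at least 1")
```

With aiosonic this might be called as `await retry(lambda: client.get(url), (HttpParsingError,))`, importing `HttpParsingError` from `aiosonic.exceptions` as shown in the traceback.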

sonic182 commented 3 years ago

Mmm something weird, gonna improve that exception next release

This sample code works with the URL you have in the traceback. You can try adding Sentry to your project; it may show the variable values at each step of the error.

import asyncio
from aiosonic import HTTPClient

async def main():

    async with HTTPClient() as client:
        res = await client.get('https://vm.tiktok.com/ZSJVWrMAB/', follow=True)
        print(res.status_code, await res.text())

asyncio.run(main())

Sunda001 commented 3 years ago

What if I am using it in a class? Can I do this instead of using a context manager in every function?

class Foo:
    def __init__(self):
        self._request = HTTPClient(verify_ssl=False)

    async def home(self):
        await self._request.get(...)

    async def timeline(self):
        await self._request.get(...)

    async def main(self):
        await self.home()
        await self.timeline()

    async def __aenter__(self):
        return self

    async def __aexit__(self, *args):
        await self._request.shutdown()

sonic182 commented 3 years ago

That way you don't take advantage of the keep-alive feature. It's better to use one instance of HTTPClient for all requests. You could wrap it in a singleton class or just use a global variable, whichever fits your code better.
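
One way to read the "single instance" suggestion is a lazily created module-level client. This is a minimal sketch: `get_client` and `_client` are hypothetical names, and the factory is passed in so the snippet does not depend on aiosonic being installed.

```python
from typing import Any, Callable, Optional

# Module-level slot holding the one shared client instance.
_client: Optional[Any] = None


def get_client(factory: Callable[[], Any]) -> Any:
    """Create the client on first use, then always return the same instance."""
    global _client
    if _client is None:
        _client = factory()
    return _client
```

In this thread's setting the factory would be aiosonic's `HTTPClient`, for example `get_client(HTTPClient)`, so that every request goes through the same client and its connections can be reused.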

sonic182 commented 3 years ago

@Sunda001 you can try 0.11.1 and activate the debug log https://aiosonic.readthedocs.io/en/latest/examples.html#debug-log if you don't have too much traffic in your app.

I improved aiosonic.exceptions.HttpParsingError so that it displays the bad data read from the socket when raising that exception.
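
To illustrate what including the offending bytes in the error buys you, here is a rough sketch of status-line parsing. It is not aiosonic's implementation: `parse_status_line`, the regular expression, and the local `HttpParsingError` class are stand-ins written for this example.

```python
import re
from typing import Tuple


class HttpParsingError(Exception):
    """Local stand-in for aiosonic.exceptions.HttpParsingError."""


# Matches e.g. b"HTTP/1.1 200 OK\r\n"; the reason phrase is optional.
_STATUS_LINE = re.compile(rb"^(HTTP/\d\.\d) (\d{3})(?: (.*?))?\r?\n?$")


def parse_status_line(data: bytes) -> Tuple[str, int, str]:
    """Split a raw status line into (version, status code, reason)."""
    match = _STATUS_LINE.match(data)
    if match is None:
        # Showing the raw bytes tells the reader what actually arrived.
        raise HttpParsingError(f"response line parsing error: {data!r}")
    version, code, reason = match.groups()
    return version.decode(), int(code), (reason or b"").decode()
```

With the raw bytes in the message, an empty read (`b''`) is immediately distinguishable from a malformed but non-empty response.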

Sunda001 commented 3 years ago

full traceback:

url=https://vm.tiktok.com/ZSJqYw3rv/
Traceback (most recent call last):
  File "/app/tiktokDL.py", line 98, in _redr
    res = await tt_tools.request.get(url, headers={'User-Agent': self.user_agent})
  File "/app/.heroku/python/lib/python3.9/site-packages/aiosonic/__init__.py", line 572, in get
    return await self.request(url,
  File "/app/.heroku/python/lib/python3.9/site-packages/aiosonic/__init__.py", line 752, in request
    response = await wait_for(
  File "/app/.heroku/python/lib/python3.9/asyncio/tasks.py", line 481, in wait_for
    return fut.result()
  File "/app/.heroku/python/lib/python3.9/site-packages/aiosonic/__init__.py", line 468, in _do_request
    response._set_response_initial(
  File "/app/.heroku/python/lib/python3.9/site-packages/aiosonic/__init__.py", line 143, in _set_response_initial
    raise HttpParsingError(f'response line parsing error: {data}')
aiosonic.exceptions.HttpParsingError: response line parsing error: b''

logging debug:


2021-07-01T06:45:04.442842+00:00 app[worker.1]: GET /ZSJqYw3rv/ HTTP/1.1
2021-07-01T06:45:04.442849+00:00 app[worker.1]: HOST: vm.tiktok.com
2021-07-01T06:45:04.442850+00:00 app[worker.1]: Connection: keep-alive
2021-07-01T06:45:04.442850+00:00 app[worker.1]: User-Agent: aioload/0.11.1
2021-07-01T06:45:04.442851+00:00 app[worker.1]: User-Agent: TikTok 16.0.16 rv:103005 (iPhone; iOS 11.1.4; en_EN) Cronet
2021-07-01T06:45:04.442852+00:00 app[worker.1]: ---

13:45:04,442 aiosonic DEBUG GET /ZSJqYw3rv/ HTTP/1.1
HOST: vm.tiktok.com
Connection: keep-alive
User-Agent: aioload/0.11.1
User-Agent: TikTok 16.0.16 rv:103005 (iPhone; iOS 11.1.4; en_EN) Cronet
---

sonic182 commented 3 years ago

It seems that it is not reading any data from the socket when it should get the first status line. The only weird thing is the repeated User-Agent header: it seems to be appending the header instead of replacing it. Gonna fix that whenever I get time.
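
The difference between appending and replacing a header can be sketched as follows. This is an illustration only, not the fix that went into aiosonic: `merge_headers` is a hypothetical helper that compares header names case-insensitively so caller-supplied values replace defaults.

```python
from typing import Dict, List, Tuple

Header = Tuple[str, str]


def merge_headers(defaults: List[Header], overrides: Dict[str, str]) -> List[Header]:
    """Combine headers so caller-supplied values replace defaults by name.

    Header names are compared case-insensitively.
    """
    override_names = {name.lower() for name in overrides}
    # Drop any default that the caller overrides, then append the overrides.
    kept = [(n, v) for n, v in defaults if n.lower() not in override_names]
    return kept + list(overrides.items())


defaults = [
    ("HOST", "vm.tiktok.com"),
    ("Connection", "keep-alive"),
    ("User-Agent", "aioload/0.11.1"),
]
merged = merge_headers(defaults, {"user-agent": "custom-agent/1.0"})
print(merged)
```

Here `custom-agent/1.0` is a placeholder value; the point is that only one User-Agent entry survives the merge.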

sonic182 commented 3 years ago

Please @Sunda001 can you try with 0.11.2?

Thanks!

sonic182 commented 3 years ago

The real fix should be in 0.11.3, not in 0.11.2. CC @Sunda001

sonic182 commented 3 years ago

@Sunda001 if you get any error with aiosonic>=0.11.3, please don't hesitate to comment here again. I'm gonna close the issue for now.