Closed glmnet closed 1 year ago
We have a fix for the error output (though I hadn't seen a bytearray before), but it's probably ending the line with \n, which is invalid. HTTP lines must be separated with \r\n.
We're making the client side code a little more lenient, so it may start working again in the next release, though I think we'll need to enable another option for handling dodgy line breaks.
Actually, I can't get your actual response to parse, regardless of the lenient settings I enable...
The byte string for the response you listed is:
b"HTTP/1.0 200 OK\nServer: GoAhead-Webs\r\nPragma: no-cache\nCache-control: no-cache\nContent-Type: text/html\nSet-Cookie: userid=1691448870; path=/;\n\n<script language='JavaScript'>window.location='/admin/cable-Systeminfo.asp';</script></html>\n"
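To make the problem in that byte string easier to see, here is a minimal sketch that lists where the response uses a bare \n and where it uses a proper \r\n. The helper names (bare_lf_offsets, crlf_offsets) are made up for illustration and are not part of aiohttp or llhttp.

```python
import re
from typing import List

# Response bytes copied from the comment above.
RAW = (
    b"HTTP/1.0 200 OK\nServer: GoAhead-Webs\r\nPragma: no-cache\n"
    b"Cache-control: no-cache\nContent-Type: text/html\n"
    b"Set-Cookie: userid=1691448870; path=/;\n\n"
    b"<script language='JavaScript'>window.location='/admin/cable-Systeminfo.asp';"
    b"</script></html>\n"
)


def bare_lf_offsets(data: bytes) -> List[int]:
    """Return the offset of every LF that is not preceded by CR."""
    return [m.start() for m in re.finditer(rb"(?<!\r)\n", data)]


def crlf_offsets(data: bytes) -> List[int]:
    """Return the offset of every CRLF pair (position of the CR)."""
    return [m.start() for m in re.finditer(rb"\r\n", data)]


if __name__ == "__main__":
    print("bare LF at:", bare_lf_offsets(RAW))
    print("CRLF at:   ", crlf_offsets(RAW))
```

Only the Server header ends with \r\n here; the status line, the other headers, and the blank line before the body all end with a bare \n.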
If I allow lenient CRLF handling, then I still get:
aiohttp.http_exceptions.BadHttpMessage: 400, message:
Invalid header token:
b"Pragma: no-cache\nCache-control: no-cache\nContent-Type: text/html\nSet-Cookie: userid=1691448870; path=/;\n\n<script language='JavaScript'>window.location='/admin/cable-Systeminfo.asp';</script></html>\n"
E ^
So, it still trips up after the headers. If possible, I'd suggest getting the server fixed to use \r\n for line breaks.
For the no extensions version, I think it may be similar, as the reason should be OK, but seems to end up with the entire message, suggesting that it again fails to parse it correctly without the correct linebreaks.
Probably, the server closed the connection because it has sent the full response, but the parser still thinks it's reading the HTTP status line and is waiting for the rest of the response.
Actually, llhttp is fixing it: https://github.com/nodejs/llhttp/issues/236 So, we can get this working again in a future release.
Thanks for taking care of this. IMHO making the parser stricter will result in lots of reports like this. I support the cause though. I wish I could get in the middle there to fix the response before the parser handles it.
It's been made stricter due to security vulnerabilities. But, for the client side, we can allow some of the lenient options. We will still be enabling strict parsing when using dev mode (python -X dev) though, to help developers find and fix broken HTTP responses.
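If it's unclear whether a script is running in Python dev mode, the interpreter flag can be checked directly. This is only a sketch of reading the standard sys.flags.dev_mode flag; how aiohttp decides internally when to use strict parsing is not spelled out in this thread and may involve more than this flag. The function names are hypothetical.

```python
import subprocess
import sys


def dev_mode_enabled() -> bool:
    """True when this interpreter was started with -X dev or PYTHONDEVMODE=1."""
    return bool(sys.flags.dev_mode)


def child_dev_mode(*interpreter_args: str) -> bool:
    """Start a child interpreter with the given options and report its dev_mode flag."""
    result = subprocess.run(
        [sys.executable, *interpreter_args, "-c", "import sys; print(sys.flags.dev_mode)"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"


if __name__ == "__main__":
    print("dev mode in this process:", dev_mode_enabled())
    print("dev mode with -X dev:    ", child_dev_mode("-X", "dev"))
```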
Thanks. I'll close this when I find the parsing handling the line breaks gracefully
Currently, 9.0.1 works if I use \r\n\r\n separating the body, but still fails with the exact response you posted.
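For comparing a malformed response against a well-formed one, the following sketch rewrites the status line and headers to use \r\n while leaving the body untouched. It is plain byte manipulation on an already captured response, not an aiohttp hook (no such hook is described in this thread), and normalize_head is a hypothetical name.

```python
import re


def normalize_head(raw: bytes) -> bytes:
    """Rewrite the status line and headers to end in CRLF; keep the body as-is."""
    # The head ends at the first blank line, whichever line ending the server used.
    boundary = re.search(rb"\r?\n\r?\n", raw)
    if boundary is None:
        return raw  # no header/body boundary found, so nothing is changed
    head = raw[: boundary.start()]
    body = raw[boundary.end():]
    lines = re.split(rb"\r?\n", head)
    return b"\r\n".join(lines) + b"\r\n\r\n" + body


if __name__ == "__main__":
    sample = b"HTTP/1.0 200 OK\nServer: X\r\nPragma: no-cache\n\n<html>\n"
    print(normalize_head(sample))
```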
I believe I have the same issue. I am making requests to a home-automation (domotic) device whose web interface is a CGI implemented by the manufacturer. I get the response properly with the requests library, but with aiohttp it fails with the message:
err 400, message="Invalid header value char:\n\n b'Content-type: text/html'\n ^"
I guess this is related to the message not being entirely well-formed?
Is there any workaround I can put in place?
This might give some hints, I believe it's because of a new line in the headers string?
$ curl -k -iv --raw .....
* ALPN: offers h2,http/1.1
* using HTTP/1.x
> POST /recepcion_datos_4.cgi HTTP/1.1
> User-Agent: curl/7.88.1
> Accept: */*
> accept-encoding: gzip,deflate
> Content-Length: 16
> Content-Type: application/x-www-form-urlencoded
>
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
HTTP/1.0 200 OK
< Content-type: text/html
Content-type: text/html
<
error_MODO_on_off=0
on_off=0
modo_operacion=0
modo_func=1
estado=0
consigna_potencia=3
consigna_temperatura=20.5
temperatura=24.2
temperatura_ext=---.-
* TLSv1.2 (IN), TLS alert, close notify (256):
* Closing connection 0
* TLSv1.2 (OUT), TLS alert, close notify (256):
That's not the same, and I can't tell exactly from your output, but maybe because the server is erroneously using \n instead of \r\n. Possibly related to: https://github.com/nodejs/llhttp/issues/241
Sorry, mixing it up with another issue. That is the same issue, if that is the case. But, your output doesn't show the bytes, so I can't tell if it is using \r\n or just \n.
From the error message of the aiohttp exception above, I believe it adds \n and empty strings after it. I have seen more people in the Home Assistant community having trouble with this. In our particular case we are mostly querying servers running firmware that is unlikely to be updated. :/ Is there a way to work around this, to make the parser less strict I guess?
I tried downgrading the module and testing again: it doesn't work with 3.8.4, but it works with 3.8.3.
We didn't touch the parser in 3.8.4: https://github.com/aio-libs/aiohttp/compare/v3.8.3...v3.8.4
That is odd, it doesn't work with 3.8.5...3.8.4. I am not very familiar with the Python ecosystem; what is the way to provide more information? Above I just sent the error message from aiohttp and the curl response.
Maybe if it works with AIOHTTP_NO_EXTENSIONS=1 (envvar), then you can just paste the result from resp.read(). But, curl output is going to be interpreted by the terminal, so it's not helping here. We need the raw binary string in Python, or the hexcodes as in wireshark example above or something similar.
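One way to get the raw binary string without any HTTP parser in between is to talk to the server over a plain socket and print the repr of whatever comes back. This is a rough sketch using only the standard library; fetch_raw is a hypothetical helper, the host and request in the usage note are placeholders, and a server that only speaks HTTPS (as in the curl output above) would additionally need the socket wrapped with the ssl module.

```python
import socket


def fetch_raw(host: str, port: int, request: bytes, timeout: float = 5.0) -> bytes:
    """Send `request` over TCP and return every byte received until the server closes."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        while True:
            try:
                chunk = sock.recv(4096)
            except socket.timeout:
                break  # server kept the connection open; return what arrived so far
            if not chunk:
                break  # server closed the connection
            chunks.append(chunk)
    return b"".join(chunks)
```

Usage would look something like print(repr(fetch_raw("192.168.0.1", 80, b"GET / HTTP/1.0\r\n\r\n"))), with the address and path replaced by those of the actual device. The printed bytes literal shows exactly whether each line ends in \r\n or just \n.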
Same error here on win10. Downgrade to 3.8.3 fixed the issue.
This should be fine in master now, should have a new release shortly.
These kind of messages should parse now in 3.8.6. If you use Python dev mode though, it will go back to strict parsing, to help discover bugs in servers.
Thanks for the heads up. I found this while coding a tiny scraper which runs inside Home Assistant. They have bumped to 3.8.6 already, but now they are beta-testing 3.9. Anyway, I won't be checking this until they release it, which will be the first Wednesday of November.
Tested this today on desktop, works great with 3.8.6. Thank you
Describe the bug
I had a script doing some scraping from a cable modem. The web server is dodgy but used to work; it has now started failing.
To Reproduce
You won't be able to reproduce as this is my own cablemodem, I'm providing the capture data below.
Expected behavior
Parse the response somehow
Logs/tracebacks
Python Version
aiohttp Version
multidict Version
yarl Version
OS
Windows 11
Related component
Client
Additional context
I'm not sure what the issue really is. The only odd thing to me is the cable modem answering with HTTP/1.0; I don't know if this is to be expected.
Here is the request / response captured with Wireshark
Looking for issues here, I found that this might be fixed by adding:
AIOHTTP_NO_EXTENSIONS=1
In that case the exception changes to:
This might actually be another issue? From some quick research I found that HTTP/1.0 will always close the connection after sending the response, so ServerDisconnectedError should be "expected"?