Open JoeanAmier opened 3 months ago
Due to the unique nature of the project, it is not convenient for others to reproduce the bug. I am testing the TikTok download feature of the project and reproducing the bug 100%.
Could you at least put a snippet of the code you used, along with comments explaining what happened at different points? I'm rather struggling to understand your issue from the description.
@PrivateRetry.retry
async def request_file(
        self,
        url: str,
        temp: Path,
        actual: Path,
        show: str,
        id_: str,
        count: SimpleNamespace,
        progress: Progress,
        headers: dict = None,
        tiktok=False,
        unknown_size=False,
        semaphore: Semaphore = None,
) -> bool:
    async with semaphore or self.semaphore:
        try:
            async with self.session.get(
                    url,
                    proxy=self.proxy_tiktok if tiktok else self.proxy,
                    headers=self.__adapter_headers(headers, tiktok, ),
            ) as response:
                if not (
                        content := int(
                            response.headers.get(
                                'content-length',
                                0))) and not unknown_size:  # check the response content size
                    # Log message: "response content is empty"
                    self.log.warning(f"{url} 响应内容为空")
                    return False
                if response.status > 400:  # check the response status code
                    # Log message: "abnormal response code"
                    self.log.warning(
                        f"{response.url} 响应码异常: {response.status}")
                    return False
                elif all((self.max_size, content, content > self.max_size)):  # decide whether to skip the download
                    # Log message: "file size exceeds the limit, skipping download"
                    self.log.info(f"{show} 文件大小超出限制,跳过下载")
                    return True
                return await self.download_file(
                    temp,
                    actual,
                    show,
                    id_,
                    response,
                    content,
                    count,
                    progress)
        except ClientError as e:
            # Log message: "network error"
            self.log.warning(f"网络异常: {e}")
            return False
The ClientSession in this location has encountered an exception. I used a new ClientSession object here to restore it to normal. My friend said that the parameter may not have been successfully passed. The code here is to download a file and requires a cookie. The normal response code is 206, and the incorrect cookie response code is 403. It is suspected that the headers were not successfully passed here.
I'm still not clear what the issue is.
I used a new ClientSession object here to restore it to normal.
The code you shared does not create a new ClientSession, it simply returns False after an exception.
I'm rather unclear what you want us to do. If the headers are wrong, that's not something we can help with...
Replacing aiohttp with httpx or requests can solve the problem.
Are you putting parameters in the URL? The most common difference between those libraries is that URLs are escaped by default. See (in particular, the note): https://docs.aiohttp.org/en/stable/client_quickstart.html#passing-parameters-in-urls
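To make the two options from the linked note concrete, here is a minimal sketch, not a confirmed fix for this issue. It assumes the URL already carries its query string: either pass it through unchanged with `yarl.URL(..., encoded=True)`, or split the query off and hand it to `params`. The helper names `split_query` and `fetch_status` are hypothetical, and the aiohttp/yarl imports are done lazily inside the coroutine.

```python
from urllib.parse import parse_qsl, urlsplit, urlunsplit


def split_query(url: str):
    """Split a URL into (base URL without query, list of decoded query pairs)."""
    parts = urlsplit(url)
    base = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    params = parse_qsl(parts.query, keep_blank_values=True)
    return base, params


async def fetch_status(url: str, headers: dict, proxy: str = None,
                       pre_encoded: bool = True) -> int:
    """Request the URL once and return the response status (hypothetical helper)."""
    import aiohttp          # imported lazily; only needed when a request is made
    from yarl import URL

    async with aiohttp.ClientSession() as session:
        if pre_encoded:
            # Tell yarl/aiohttp the string is already encoded, so it is sent as-is.
            target = URL(url, encoded=True)
            async with session.get(target, headers=headers, proxy=proxy) as response:
                return response.status
        # Otherwise let aiohttp encode the query pairs itself.
        base, params = split_query(url)
        async with session.get(base, params=params, headers=headers,
                               proxy=proxy) as response:
            return response.status
```

If both branches return the same status as the failing call, URL encoding is probably not the cause.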
I used aiohttp in my project, and at the location where an exception occurred, I passed in the URL and headers parameters. I copied the URL and headers parameters and tested them using aiohttp code. The test passed, but I suspect that parameter passing failed. My friend has also experienced parameter passing failures, which is not a coding issue. If the headers parameter is not passed, the running result will be the same as this exception result.
I don't think there's any difference between these libraries regarding passing headers. You're either passing them or you're not. aiohttp isn't going to lose headers.
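One way to check this rather than guess is to compare the headers passed in with the headers aiohttp reports it actually sent, which are available as `response.request_info.headers`. The following is a rough sketch; `diff_headers` and `check_sent_headers` are hypothetical helper names, and `session` is assumed to be the project's existing `ClientSession`.

```python
def diff_headers(expected: dict, sent) -> dict:
    """Return {name: (expected_value, sent_value)} for headers that differ.

    Header names are compared case-insensitively; a missing header shows
    up with a sent value of None.
    """
    sent_lower = {name.lower(): value for name, value in sent.items()}
    diff = {}
    for name, value in expected.items():
        actual = sent_lower.get(name.lower())
        if actual != value:
            diff[name] = (value, actual)
    return diff


async def check_sent_headers(session, url, headers: dict, proxy: str = None):
    """Issue the request and report the status plus any header differences."""
    async with session.get(url, headers=headers, proxy=proxy) as response:
        # request_info.headers holds the headers of the request as sent.
        return response.status, diff_headers(headers, response.request_info.headers)
```

An empty diff together with a 403 would suggest the headers are arriving as passed and the difference lies elsewhere; a changed `Cookie` value would point at session-level state.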
Testing the URL and headers parameters using aiohttp, httpx, and requests separately is normal, and aiohttp only experiences exceptions at a specific location in the project.
I recorded a video, including the location and results of the anomalies, as well as the results of individual tests.
If you think this anomaly is not related to aiohttp, I will delete the video.
If it's a problem with my code, I should get a response code 403 when testing using URL and headers, which is consistent with the response code at the exception location. However, the response code I tested was 206, indicating that it's not an exception in my code.
The URL in your video is a string with query parameters. Therefore, you probably need to pre-encode (or pass them using params) as I mentioned in the previous comment: https://github.com/aio-libs/aiohttp/issues/8464#issuecomment-2178577282
The difference in behaviour could be the proxy? Maybe one proxy is actually decoding the URL before passing it through to the endpoint, while the other passes it through unchanged.
The URL and headers are both directly copied for testing, and the proxy is also set to http://127.0.0.1:10809. If it is a coding issue, the copied test results should be consistent with the abnormal results.
The code at the exception location and the parameters used in the test code are exactly the same, and the results should also be consistent, but in reality, they are not consistent. I don't know why the headers failed to pass, but as an aiohttp developer, you should have a better understanding.
Have you tried passing it as a pre-encoded URL, as mentioned twice already? Without your code, I can't give any more suggestions than that.
I tried many methods, including encoding parameters, but couldn't solve them until I created a new ClientSession or replaced aiohttp with requests or httpx. When I tested, I did not perform any encoding on the URL, but the test results were normal. Isn't it enough to indicate that it's not an encoding issue?
When I tested, I did not perform any encoding on the URL, but the test results were normal. Isn't it enough to indicate that it's not an encoding issue?
Well, if you just use encoded=True then we know for sure that it's not the URL encoding. Unfortunately, without being able to run your code, I have no further ideas. I can't remember anyone else reporting an issue like that which is solved by creating new sessions. Are you definitely using sessions in httpx/requests as well?
Describe the bug
My project is a crawler program. The program creates only one ClientSession object during its lifecycle, and all requests are initiated using this object. However, some requests may encounter exceptions and return incorrect response codes. When I debugged, I used the url and headers separately for code testing, and the response code was normal. I found that the problem was with the ClientSession. When I used a new ClientSession object to initiate a request at the location where the error occurred, the response code was correct. Every time the program uses the ClientSession object to initiate a request, it passes in the url and headers, but I don't know why the response code exception only occurs at that location. Could it be that the ClientSession is contaminated? If I need to resolve this error, I may need to create two ClientSession objects for separate use. Do you have any better suggestions?
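As a rough way to test the "contaminated session" idea, the same request can be sent once through the long-lived session and once through a throwaway session that keeps no cookies. This sketch assumes, without confirmation from this report, that stored cookies are the session-level state that differs; `aiohttp.DummyCookieJar` is the library's cookie jar that stores nothing. The names `compare_sessions`, `_fresh_session`, and `_status` are hypothetical.

```python
def _fresh_session():
    """Create a throwaway session that never stores or sends jar cookies."""
    import aiohttp  # imported lazily; only needed when a real session is built
    return aiohttp.ClientSession(cookie_jar=aiohttp.DummyCookieJar())


async def _status(session, url, headers, proxy):
    """Send one GET request and return only the response status."""
    async with session.get(url, headers=headers, proxy=proxy) as response:
        return response.status


async def compare_sessions(shared_session, url, headers, proxy=None,
                           session_factory=_fresh_session):
    """Return (status via the shared session, status via a fresh session)."""
    shared_status = await _status(shared_session, url, headers, proxy)
    async with session_factory() as fresh:
        fresh_status = await _status(fresh, url, headers, proxy)
    return shared_status, fresh_status
```

A result such as `(403, 206)` would support the idea that state held by the shared session affects the request. In that case, creating the main session with `cookie_jar=aiohttp.DummyCookieJar()` might be an alternative to maintaining two sessions.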
To Reproduce
Due to the unique nature of the project, it is not convenient for others to reproduce the bug. I am testing the TikTok download feature of the project and reproducing the bug 100%.
Expected behavior
I took out the URL and headers separately for testing, and the response code was 206, normal. When there was an exception, the response code was 403.
Logs/tracebacks
Python Version
aiohttp Version
multidict Version
yarl Version
OS
Windows 11
Related component
Client
Additional context
This is my project address: https://github.com/JoeanAmier/TikTokDownloader. It is still under development.
ClientSession
Code of Conduct