abhinavsingh / proxy.py

💫 Ngrok FRP Alternative • ⚡ Fast • 🪶 Lightweight • 0️⃣ Dependency • 🔌 Pluggable • 😈 TLS interception • 🔒 DNS-over-HTTPS • 🔥 Poor Man's VPN • ⏪ Reverse & ⏩ Forward • 👮🏿 "Proxy Server" framework • 🌐 "Web Server" framework • ➵ ➶ ➷ ➠ "PubSub" framework • 👷 "Work" acceptor & executor framework
https://abhinavsingh.com/proxy-py-a-lightweight-single-file-http-proxy-server-in-python/
BSD 3-Clause "New" or "Revised" License
2.91k stars 568 forks

request.total_size has an inconsistent value #1419

Closed LoickZoty closed 3 weeks ago

LoickZoty commented 3 weeks ago

Describe the bug

When a file is uploaded to a cloud hosting service through the proxy, request.total_size has an inconsistent value.

To Reproduce Steps to reproduce the behavior:

  1. Create a custom log plugin:

```python
import logging
from typing import Any, Dict, Optional

from proxy.http.proxy import HttpProxyBasePlugin

# Access-log format; the placeholders are filled from the context dict.
DEFAULT_HTTPS_PROXY_ACCESS_LOG_FORMAT = '{client_ip}:{client_port} - ' + \
    '{request_method} {server_host}:{server_port} - ' + \
    'request_size: {request_bytes} bytes, response_size: {response_bytes} bytes - ' + \
    '{connection_time_ms}ms'

logger = logging.getLogger(__name__)


class CustomLogPlugin(HttpProxyBasePlugin):
    def __init__(self, *args, **kwargs) -> None:
        super().__init__(*args, **kwargs)

    def on_access_log(self, context: Dict[str, Any]) -> Optional[Dict[str, Any]]:
        logger.info(DEFAULT_HTTPS_PROXY_ACCESS_LOG_FORMAT.format_map(context))
        # Returning None stops any further access logging for this request.
        return None
```

  2. Run main:

```python
import proxy

if __name__ == '__main__':
    with proxy.Proxy(plugins=["plugin.custom_log_plugin.CustomLogPlugin"]) as p:
        proxy.sleep_loop(p)
```

  3. Configure the proxy on your browser to 127.0.0.1:8899
  4. Go to Google Drive / Swisstransfer and upload a file of more than a few KB.
  5. Watch the logs scroll by.

Expected behavior

In my case, I use Google Drive to host a 5 MB file. The upload goes out as a PUT request with a Content-Length of 5242880. However, the logs show no PUT request (only CONNECT), and total_size never exceeds 300 bytes.

What is strange is that when I reproduce the upload against a local server, the logs are consistent:

2024-06-09 11:37:27,041 - pid:12424 [I] custom_log_plugin.on_access_log:18 - 127.0.0.1:53886 - CONNECT myconnect.app:443 - request_size: 303 bytes, response_size: 5367 bytes - 10069.00ms
2024-06-09 11:37:27,203 - pid:6448 [I] custom_log_plugin.on_access_log:18 - 127.0.0.1:53895 - PUT localhost:8080 - request_size: 5245346 bytes, response_size: 92 bytes - 53.13ms

I do receive both requests: CONNECT & PUT.
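As a rough check of the local PUT figure, the sketch below parses the access-log line quoted above and subtracts the payload size. The regular expression and the `parse_access_log` helper are illustrative and not part of proxy.py. It also assumes that the local upload used the same 5 MB file (Content-Length 5242880) mentioned for Google Drive. Under that assumption, the remainder would be the request line, the headers and any body framing.

```python
import re

# Pattern mirroring the custom access-log format from the plugin in this issue.
# (Hypothetical helper; proxy.py does not ship this parser.)
LOG_PATTERN = re.compile(
    r'(?P<client_ip>[\d.]+):(?P<client_port>\d+) - '
    r'(?P<method>[A-Z]+) (?P<server_host>[^\s:]+):(?P<server_port>\d+) - '
    r'request_size: (?P<request_bytes>\d+) bytes, '
    r'response_size: (?P<response_bytes>\d+) bytes - '
    r'(?P<connection_time_ms>[\d.]+)ms'
)


def parse_access_log(line: str) -> dict:
    """Extract the fields of one access-log line into a dict."""
    match = LOG_PATTERN.search(line)
    if match is None:
        raise ValueError('line does not match the access-log format')
    fields = match.groupdict()
    for key in ('client_port', 'server_port', 'request_bytes', 'response_bytes'):
        fields[key] = int(fields[key])
    fields['connection_time_ms'] = float(fields['connection_time_ms'])
    return fields


# Log line copied from the report above.
PUT_LINE = (
    '2024-06-09 11:37:27,203 - pid:6448 [I] custom_log_plugin.on_access_log:18 - '
    '127.0.0.1:53895 - PUT localhost:8080 - '
    'request_size: 5245346 bytes, response_size: 92 bytes - 53.13ms'
)

# Assumption: the local upload used the same 5 MB payload.
CONTENT_LENGTH = 5 * 1024 * 1024

put = parse_access_log(PUT_LINE)
overhead = put['request_bytes'] - CONTENT_LENGTH
print(put['method'], put['request_bytes'], overhead)
```

If the assumption holds, the plaintext PUT accounts for the whole payload plus a couple of kilobytes of overhead, which is why the local logs look consistent.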

Even stranger, when I download what I hosted on Google Drive, the response.total_size has a consistent value but it comes from a CONNECT request and not a GET.

2024-06-09 11:43:19,878 - pid:6448 [I] custom_log_plugin.on_access_log:18 - 127.0.0.1:54259 - CONNECT drive.usercontent.google.com:443 - request_size: 248 bytes, response_size: 5287813 bytes - 27611.80ms

Version information

abhinavsingh commented 3 weeks ago

proxy.py is working as expected. You need to understand how CONNECT (https) requests work for a proxy server, which will never see a PUT request going through it, but only the initial CONNECT request. ChatGPT can help and explain nowadays :)
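To illustrate the point about CONNECT, here is a simplified sketch of what a forward proxy can observe at the start of a connection. It is not how proxy.py is implemented. The `describe_proxy_view` function and the `/upload` path are hypothetical names chosen for this example. The TLS check relies only on the standard record header: a content-type byte in the range 0x14 to 0x17, followed by the major version byte 0x03.

```python
def describe_proxy_view(first_bytes: bytes) -> str:
    """Classify the first bytes a forward proxy receives from a client."""
    # TLS records begin with a content type (0x14-0x17) and major version 0x03.
    if (
        len(first_bytes) >= 2
        and 0x14 <= first_bytes[0] <= 0x17
        and first_bytes[1] == 0x03
    ):
        return 'opaque TLS record'

    # Otherwise, try to read a plaintext HTTP request line.
    request_line = first_bytes.split(b'\r\n', 1)[0].decode('ascii', errors='replace')
    parts = request_line.split(' ')
    if len(parts) != 3:
        return 'unrecognised data'
    method, target, _version = parts
    if method == 'CONNECT':
        return f'CONNECT tunnel to {target}'
    return f'plaintext {method} request for {target}'


# Plain HTTP upload to a local server: the proxy sees the PUT itself.
plain = b'PUT http://localhost:8080/upload HTTP/1.1\r\nContent-Length: 5242880\r\n\r\n'

# HTTPS upload: the proxy sees only the CONNECT request...
connect = b'CONNECT drive.usercontent.google.com:443 HTTP/1.1\r\n\r\n'

# ...followed by encrypted TLS records (0x17 = application data), which
# carry the real PUT or GET but cannot be read without TLS interception.
tunnelled = b'\x17\x03\x03\x40\x00' + b'\x00' * 16

for sample in (plain, connect, tunnelled):
    print(describe_proxy_view(sample))
```

This matches the logs in the report. The local HTTP upload shows up as a PUT. For the HTTPS sites only the CONNECT is logged, and the byte counts logged against it presumably include whatever was relayed through the tunnel, which would explain the large response_size on the download.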