bogdanfinn / tls-client

net/http.Client like HTTP Client with options to select specific client TLS Fingerprints to use for requests.
BSD 4-Clause "Original" or "Old" License

Is it possible to get RAW HTTP response instead of JSON response #27

Open GunGunGun opened 1 year ago

GunGunGun commented 1 year ago

Hi, first I want to say thanks for this awesome project! It literally opened a door for my Python application, allowing it to get past many websites that use TLS fingerprinting.

From this Issue: https://github.com/FlorianREGAZ/Python-Tls-Client/issues/32

Currently tls-client sends back JSON data: {....,"headers":{"Access-Control-Allow-Credentials":["true"],"Access-Control-Allow-Origin":["https://rr6-sn-8qj-i5o6k.gooqlevideo.com"],"Age":["72715"],"Alt-Svc":["h3=\":443\"; ma=86400, h3-29=\":443\"; ma=86400"],"Cache-Control":["max-age=31536000"],"Cf-Cache-Status":["HIT"],"Cf-Ray":["79f028e9cf569e28-SIN"],"Content-Encoding":["br"],"Content-Security-Policy":["default-src 'self';base-uri 'self';block-all-mixed-content;font-src 'self' https: data:;form-action 'self';frame-ancestors 'self';img-src 'self' data:;object-src 'none';script-src 'self';script-src-attr 'none';style-src 'self' https: 'unsafe-inline';upgrade-insecure-requests"],"Content-Type":["text/html; charset=utf-8"],"Cross-Origin-Embedder-Policy":["require-corp"],"Cross-Origin-Opener-Policy":["same-origin"],"Date":["Sat, 25 Feb 2023 11:39:19 GMT"],"Expect-Ct":["max-age=0"],"Origin-Agent-Cluster":["?1"],"Referrer-Policy":["no-referrer"],"Server":["cloudflare"],"Strict-Transport-Security":["max-age=31536000; includeSubDomains; preload"],"Vary":["Accept-Encoding","Origin"],"X-Content-Type-Options":["nosniff"],"X-Dns-Prefetch-Control":["off"],"X-Download-Options":["noopen"],"X-Frame-Options":["SAMEORIGIN"],"X-Permitted-Cross-Domain-Policies":["none"],"X-Powered-By":["centminmod"],"X-Xss-Protection":["0","1; mode=block"]},"cookies":{"__cf_bm":"yxyOVhUi38n1UfQujprACZl5yxXyhYZMYl6boMzKKh0-1677324536-0-AdwLPV4avEeaFZNNLGA7VKzwHM3C2rbm2jL/SPwv1WNoRDaahAKij0hWwlApbK52ErrqT3pGG+pCYWIZlOzkM9o="}}

instead of RAW HTTP data like this:

HTTP/1.1 200 OK
Cache-Control: no-store
Pragma: no-cache
Date: Fri, 24 Feb 2023 18:51:16 GMT
Content-Type: text/html; charset=utf-8
Vary: Accept-Encoding
Vary: Origin
Content-Security-Policy: default-src * 'unsafe-inline' 'unsafe-eval' blob: data:;base-uri 'self';block-all-mixed-content;font-src 'self' https: data:;form-action 'self';frame-ancestors 'self';img-src * 'self' blob: data: 'unsafe-inline' 'unsafe-eval';object-src 'none';script-src * 'unsafe-inline' 'unsafe-eval' blob: data:;script-src-attr 'none';style-src * 'unsafe-inline' 'unsafe-eval' blob: data:;upgrade-insecure-requests;prefetch-src 'none';
Cross-Origin-Embedder-Policy: require-corp
Cross-Origin-Opener-Policy: same-origin
X-DNS-Prefetch-Control: off
Expect-CT: max-age=0
X-Frame-Options: SAMEORIGIN
X-Download-Options: noopen
X-Content-Type-Options: nosniff
Origin-Agent-Cluster: ?1
X-Permitted-Cross-Domain-Policies: none
Referrer-Policy: no-referrer
X-XSS-Protection: 0
X-XSS-Protection: 1; mode=block
Access-Control-Allow-Origin: https://rr6-sn-8qj-i5o6k.gooqlevideo.com
Access-Control-Allow-Credentials: true
X-Powered-By: centminmod
Old-Cache-Control: max-age=31536000
CF-Cache-Status: HIT
Age: 363739
Server: cloudflare
CF-RAY: 79ea644469696bc7-SIN
Old-Content-Encoding: br
Old-Transfer-Encoding: chunked
Content-Length: 778320

<long.................body data>

This is not ideal for streaming data back to another client, and streaming is very useful in several cases.
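For comparison, the parsed headers in the JSON can be turned back into a raw header block once the whole response has arrived. The following is a minimal sketch, not part of tls-client: `to_raw_head` is a hypothetical helper, and the status code and reason phrase are passed in by hand because the JSON sample above elides those fields. It assumes `headers` maps each name to a list of values, as in the sample.

```python
def to_raw_head(status_code, reason, headers):
    """Rebuild an HTTP/1.1 status line and header block from parsed headers.

    `headers` maps each header name to a list of values, as in the JSON above.
    This is a hypothetical helper, not a tls-client function.
    """
    lines = [f"HTTP/1.1 {status_code} {reason}"]
    for name, values in headers.items():
        for value in values:
            # One line per value, so repeated headers such as Vary survive.
            lines.append(f"{name}: {value}")
    # A blank line terminates the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("latin-1")


parsed = {"Vary": ["Accept-Encoding", "Origin"], "Server": ["cloudflare"]}
raw = to_raw_head(200, "OK", parsed)
```

This only helps after the full response is in memory, so it does not address streaming. It also cannot restore the original header casing or order: the JSON shows `Cf-Ray` where the raw response has `CF-RAY`.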

bogdanfinn commented 1 year ago

@GunGunGun you need the "original" raw HTTP response, right, and only that one? So in an ideal world, when calling the shared lib function, you would not receive the JSON response (which is a kind of parsed / enriched HTTP response) but just the plain raw response. That takes us back to memory leak issues and the like ...

Maybe we can talk about the use case you want to implement with the requested raw response, and I can try to build something for it directly in Go?

GunGunGun commented 1 year ago

Hi, sorry for the late reply! This is my answer and my use cases:

Yes, I want only the raw response, because in my case I can parse the response headers and body quite easily. In Python it is possible to read just a small amount of packet data with socket.recv(1024), parse the headers, and leave the response body alone. This is very useful when writing a content-filtering proxy application, for example to block responses by content-type or content-length and reject the rest without downloading the whole response body.

There are also cases where you need to fetch an 8GB file, such as a video or ISO file. Without streaming, this causes an out-of-memory (OOM) error if the user's computer doesn't have enough RAM to hold 8GB+ of response data. And if it takes too long, the web browser will simply reject the response, because we never send it anything to show that the request is still alive (usually we send a small chunk, like 8191kb of data, every second or millisecond; downloading the whole 8GB first sends nothing in the meantime).

Also, in the world of web browsers and proxies, the browser can tell the proxy to abort a request. If the proxy can't stream the response, there's no way for the browser to check whether the request is alive or dead, so 8GB of data gets downloaded and then rejected by the browser, which is a huge waste of resources.
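The filtering step described here could look roughly like the sketch below. It reads from any file-like object until the blank line that ends the headers, then decides from content-type and content-length alone. All names (`read_head`, `parse_head`, `should_block`, `MAX_HEAD`) and the thresholds are assumptions made for illustration, and an `io.BytesIO` stands in for a real connection.

```python
import io

MAX_HEAD = 64 * 1024  # arbitrary cap on header size (assumption)


def read_head(stream, chunk_size=1024):
    """Read from `stream` until the blank line that ends the headers.

    Returns (head, leftover): head excludes the terminator, leftover holds
    any body bytes that arrived in the same chunk.
    """
    buf = b""
    while b"\r\n\r\n" not in buf:
        chunk = stream.read(chunk_size)
        if not chunk:
            raise ValueError("stream ended before the headers were complete")
        buf += chunk
        if len(buf) > MAX_HEAD:
            raise ValueError("header block too large")
    head, _, leftover = buf.partition(b"\r\n\r\n")
    return head, leftover


def parse_head(head):
    """Split a raw head into the status line and a dict of value lists."""
    status_line, *header_lines = head.decode("latin-1").split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers.setdefault(name.strip().lower(), []).append(value.strip())
    return status_line, headers


def should_block(headers, blocked_types=("video/",), max_length=1_000_000_000):
    """Decide from the headers alone, before any body is downloaded."""
    content_type = headers.get("content-type", [""])[0]
    length = headers.get("content-length", ["0"])[0]
    if content_type.startswith(blocked_types):
        return True
    return length.isdigit() and int(length) > max_length


demo = io.BytesIO(
    b"HTTP/1.1 200 OK\r\nContent-Type: video/mp4\r\n"
    b"Content-Length: 8589934592\r\n\r\n<body bytes>"
)
head, leftover = read_head(demo)
status_line, headers = parse_head(head)
blocked = should_block(headers)
```

Note that the search looks for the terminator anywhere in the accumulated buffer, since the end of the headers rarely lines up with a chunk boundary, and any body bytes read along with the headers are returned as `leftover` so they are not lost.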

So it would help if I could get the raw response with something like this in Python:

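import ctypes
import json

# `library` is the tls-client shared library, loaded beforehand with ctypes
# extract the exposed request function from the shared package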
request = library.request
request.argtypes = [ctypes.c_char_p]
request.restype = ctypes.c_char_p

getCookiesFromSession = library.getCookiesFromSession
getCookiesFromSession.argtypes = [ctypes.c_char_p]
getCookiesFromSession.restype = ctypes.c_char_p

addCookiesToSession = library.addCookiesToSession
addCookiesToSession.argtypes = [ctypes.c_char_p]
addCookiesToSession.restype = ctypes.c_char_p

freeMemory = library.freeMemory
freeMemory.argtypes = [ctypes.c_char_p]

destroySession = library.destroySession
destroySession.argtypes = [ctypes.c_char_p]
destroySession.restype = ctypes.c_char_p

destroyAll = library.destroyAll
destroyAll.restype = ctypes.c_char_p

pl = {'followRedirects': False, 'forceHttp1': True, 'headers': headers, 'headerOrder': None, 'insecureSkipVerify': True, 'isByteRequest': True, 'isByteResponse': True, 'proxyUrl': '', 'requestUrl': url, 'requestMethod': varfrommain.command, 'requestBody': postdata, 'timeoutSeconds': 30, 'tlsClientIdentifier': 'firefox_106', 'withRandomTLSExtensionOrder': True, "withoutCookieJar": True, "withDebug": False}

r = request(json.dumps(pl).encode('utf-8'))  # connect to the server

# Read the HTTP response headers
# (r.read() is the wished-for streaming API; it does not exist today)
data = r.read(1024)
while b'\r\n\r\n' not in data:
    chunk = r.read(1024)
    if not chunk:
        break
    data += chunk
# Parse the headers and block the request if it matches some criteria

# Stream the content to the web browser
while True:
    data = r.read(8192)
    if not data:
        break
    # At this step the web browser or the proxy can decide to reject the
    # request; a pipe write error here means the request was aborted
    self.wfile.write(data)
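The streaming half of that idea can be tried out without tls-client at all. Below is a minimal sketch of a relay loop that copies a file-like source to a sink in chunks and stops as soon as the sink reports that the client is gone. `relay` and `ClosingSink` are hypothetical names, and `io.BytesIO` objects stand in for the upstream response and the browser connection.

```python
import io


def relay(source, sink, chunk_size=8192):
    """Copy `source` to `sink` in chunks; return the number of bytes forwarded.

    Stops early if the sink raises BrokenPipeError or ConnectionResetError,
    which is how a disconnected downstream client usually shows up.
    """
    forwarded = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break  # upstream finished
        try:
            sink.write(chunk)
        except (BrokenPipeError, ConnectionResetError):
            break  # downstream aborted, so stop pulling from upstream
        forwarded += len(chunk)
    return forwarded


class ClosingSink:
    """Hypothetical sink that accepts two writes, then acts disconnected."""

    def __init__(self):
        self.writes = 0

    def write(self, data):
        if self.writes >= 2:
            raise BrokenPipeError
        self.writes += 1
        return len(data)


body = b"x" * 20000
full = relay(io.BytesIO(body), io.BytesIO())
partial = relay(io.BytesIO(body), ClosingSink())
```

Only one chunk is held in memory at a time, so the size of the download no longer matters, and an aborted client ends the transfer after the chunks already sent instead of after the whole body.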