apify / got-scraping

HTTP client made for scraping based on got.

fix: process http over https proxy correctly #129

Closed barjin closed 3 months ago

barjin commented 5 months ago

Previously, an HTTP request (http://example.com/resource) over an HTTPS proxy (https://proxy.com) was sent as:

After a discussion with @jirimoravcik, we figured that the "pathname" proxies are a marginal thing (CONNECT is more popular nowadays).

This PR fixes that - got-scraping now does:


Fixes #126
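
For reference, the two proxying styles discussed above differ in the first bytes the client sends to the proxy. The sketch below is only an illustration based on standard HTTP proxy semantics, assuming the "pathname" style means putting the full target URL in the request line. It is not taken from got-scraping, and the function names `absoluteFormRequest` and `connectTunnelRequests` are hypothetical.

```typescript
// Illustration only: standard HTTP proxy semantics, not got-scraping's actual code.
// Both function names are hypothetical.

/** "Pathname" style: the full target URL is placed in the request line sent to the proxy. */
function absoluteFormRequest(target: URL): string {
    return [
        `GET ${target.href} HTTP/1.1`,
        `Host: ${target.host}`,
        '',
        '',
    ].join('\r\n');
}

/** CONNECT style: ask the proxy for a tunnel first, then send an ordinary request through it. */
function connectTunnelRequests(target: URL): { toProxy: string; throughTunnel: string } {
    // URL.port is empty when the scheme's default port is used.
    const port = target.port || (target.protocol === 'https:' ? '443' : '80');
    const authority = `${target.hostname}:${port}`;
    return {
        toProxy: [`CONNECT ${authority} HTTP/1.1`, `Host: ${authority}`, '', ''].join('\r\n'),
        throughTunnel: [
            `GET ${target.pathname}${target.search} HTTP/1.1`,
            `Host: ${target.host}`,
            '',
            '',
        ].join('\r\n'),
    };
}

const exampleTarget = new URL('http://example.com/resource');
console.log(absoluteFormRequest(exampleTarget));
console.log(connectTunnelRequests(exampleTarget));
```

In the CONNECT case the proxy only sees the target host and port. The actual request line travels inside the tunnel once the proxy has answered with a 2xx status.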

barjin commented 4 months ago

Huh, so the description of this PR mentions the existence of "no-CONNECT" proxies. I'm not sure whether there is an easy way of determining whether a proxy supports the CONNECT method (an easy way other than a preflight CONNECT request).
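
A preflight CONNECT check along those lines might look roughly like the sketch below. It assumes an HTTPS proxy without authentication and uses Node's built-in `https.request` with the `CONNECT` method. The helper names `isTunnelEstablished` and `probeConnectSupport`, the default `example.com:80` authority and the timeout are all hypothetical choices, not part of got-scraping.

```typescript
import * as https from 'node:https';

/** A CONNECT attempt counts as supported only if the proxy answers with a 2xx status. */
function isTunnelEstablished(statusCode: number | undefined): boolean {
    return statusCode !== undefined && statusCode >= 200 && statusCode < 300;
}

/**
 * Hypothetical helper: sends one CONNECT request to the proxy and resolves to
 * true if a tunnel was established, false on any other outcome.
 */
function probeConnectSupport(
    proxyUrl: URL,
    authority = 'example.com:80', // assumed probe target
    timeoutMs = 5000,
): Promise<boolean> {
    return new Promise((resolve) => {
        const req = https.request({
            host: proxyUrl.hostname,
            port: proxyUrl.port || 443,
            method: 'CONNECT',
            path: authority,
            headers: { host: authority },
            timeout: timeoutMs,
        });
        // Node emits 'connect' when the proxy replies to a CONNECT request.
        req.once('connect', (res, socket) => {
            socket.destroy(); // the tunnel itself is not needed for the probe
            resolve(isTunnelEstablished(res.statusCode));
        });
        // Defensive: treat an ordinary response as "no tunnel".
        req.once('response', (res) => {
            res.resume();
            resolve(false);
        });
        req.once('timeout', () => req.destroy());
        req.once('error', () => resolve(false));
        req.end();
    });
}
```

Usage would be something like `await probeConnectSupport(new URL('https://proxy.com'))`. The cost is one extra round trip per proxy, which is the drawback hinted at above.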

barjin commented 3 months ago

As discussed w/ @B4nan, we'll go through with this - there is no repro to be found for the "no-CONNECT" proxies - worst case, we can always revert (and we'll have the actual repro by then :))