apify / got-scraping

HTTP client made for scraping based on got.

Incorrect work on the http2 #113

Open rasimx opened 8 months ago

rasimx commented 8 months ago

It seems to me that the library does not work correctly over HTTP/2 when using a proxy. Each request creates a new agent, which in turn creates a new connection. I tried reusing an existing agent in Got, and this significantly reduces the time it takes to receive responses from the server.

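// Http2OverHttp is one of the proxy agents exported by http2-wrapper (proxies.Http2OverHttp).
// wrapperOptions is not shown in this report.
const http2Agent = new Http2OverHttp(wrapperOptions);

// multiple requests, all sharing the same agent
const response = await got(
  `https://some.site`,
  {
    http2: true,
    resolveBodyOnly: true,
    responseType: 'json',
    agent: {
      http2: http2Agent,
    },
  },
);

vladfrangu commented 7 months ago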

hi! Taking a look at this issue right now, do you have a reproduction sample we can use to try to debug this? 🙏

Looking at the code, got-scraping only overwrites the agents provided by you when you also set a proxyUrl. Also, what are the options you gave to the Http2OverHttp agent?

rasimx commented 7 months ago

Apparently I did not express myself clearly. What I meant is that, when the proxyUrl option is used, got-scraping creates a new agent for each request, and each new agent opens a new connection. This goes against the HTTP/2 model, where a connection is created only on the first request and all subsequent requests reuse it. got-scraping should reuse an existing agent instead of creating a new one for each request. The code above is an example of how I reused an existing agent in Got.
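
To make the suggestion concrete, here is a minimal sketch of what "reuse an existing agent" could look like: one agent cached per proxy URL. This is not got-scraping's actual code. `createAgentCache` and the stand-in factory are hypothetical names used only for illustration; in real use the factory would presumably build something like the Http2OverHttp agent from the snippet above.

```javascript
// Hypothetical helper (not part of got-scraping): keeps one agent per proxy URL
// so that repeated requests through the same proxy share an agent, and with it
// the underlying connection.
function createAgentCache(createAgent) {
  const agents = new Map();

  return function getAgent(proxyUrl) {
    let agent = agents.get(proxyUrl);
    if (agent === undefined) {
      // First request for this proxy: create the agent once and remember it.
      agent = createAgent(proxyUrl);
      agents.set(proxyUrl, agent);
    }
    return agent;
  };
}

// Stand-in factory for illustration only; it just counts how many agents get created.
let created = 0;
const getAgent = createAgentCache((proxyUrl) => ({ id: ++created, proxyUrl }));

const first = getAgent('http://proxy-a.example:8000');
const second = getAgent('http://proxy-a.example:8000'); // reuses the cached agent
const other = getAgent('http://proxy-b.example:8000');  // different proxy, new agent
```

With a cache like this, the agent passed in the `agent.http2` option would be the same object for every request that goes through the same proxy, which is what the manual workaround above achieves by hand.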

tonybruess commented 2 months ago

Please see my comment which explains this issue: https://github.com/apify/got-scraping/issues/112#issuecomment-1969967596