rubycdp / ferrum

Headless Chrome Ruby API
https://ferrum.rubycdp.com
MIT License
1.7k stars 122 forks source link

`Ferrum::Network::Response#loaded?` fails on `302` redirects after #309 #321

Closed dmorgan-fa closed 1 year ago

dmorgan-fa commented 1 year ago

Hi, Firstly, thanks for the excellent work on this gem and Cuprite!

We've been looking into some reliability issues lately and trying different things. One thing that I've noticed, is a recent change:

now causes redirect requests to be considered still pending even though they've completed:

Ferrum::TimeoutError:
         Timed out waiting for response. It's possible that this happened because something took a very long time (for example a page load was slow). If so, setting the :timeout option to a higher value might help.
         Connections still pending:
          {:status=>302, :status_text=>"Found", :url=>"http://practice1.lvh.me:9888/practice_dashboard/companies/895397944/notes/1"}
Full Ferrum::Network::Exchange object here ``` #"http://subdomain000001.lvh.me:9887/overview", "method"=>"GET", "headers"=>{ "Accept"=>"text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9", "Cookie"=>"[REDACTED]", "Referer"=>"http://subdomain000001.lvh.me:9887/", "Upgrade-Insecure-Requests"=>"1", "User-Agent"=>"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/108.0.5359.124 Safari/537.36" }, "initialPriority"=>"VeryHigh", "referrerPolicy"=>"origin" } > @request= #"E711BF6D793E1D3EC660F87242702CEF", "loaderId"=>"E711BF6D793E1D3EC660F87242702CEF", "documentURL"=>"http://subdomain000001.lvh.me:9887/overview", "request"=>{ "url"=>"http://subdomain000001.lvh.me:9887/overview", "method"=>"GET", "headers"=>{ "Content-Type"=>"application/x-www-form-urlencoded", "Origin"=>"http://subdomain000001.lvh.me:9887", "Referer"=>"http://subdomain000001.lvh.me:9887/", "Upgrade-Insecure-Requests"=>"1", "User-Agent"=>"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/108.0.5359.124 Safari/537.36" }, "mixedContentType"=>"none", "initialPriority"=>"VeryHigh", "referrerPolicy"=>"origin", "isSameSite"=>true }, "timestamp"=>122384.896241, "wallTime"=>1671721817.864675, "initiator"=>{ "type"=>"other" }, "redirectHasExtraInfo"=>true, "redirectResponse"=>{ "url"=>"http://subdomain000001.lvh.me:9887/sessions", "status"=>302, "statusText"=>"Found", "headers"=>{ "Cache-Control"=>"no-store", "Content-Length"=>"206", "Content-Type"=>"text/html; charset=utf-8", "Feature-Policy"=>"accelerometer 'none'; autoplay 'none'; ambient-light-sensor 'none'; camera 'none'; encrypted-media 'none'; fullscreen 'self'; geolocation 'none'; gyroscope 'none'; magnetometer 'none'; microphone 'none'; midi 'none'; payment 'none'; picture-in-picture 'none'; speaker 'none'; usb 'none'; vibrate 'none'; vr 'none'", "Location"=>"http://subdomain000001.lvh.me:9887/overview", "Pragma"=>"no-cache", "X-Frame-Options"=>"SAMEORIGIN", "X-Host"=>"mac", "X-Request-Id"=>"7fe4438c-e2da-4846-b9e1-57bbfa40344d", "X-Rev"=>"2aad5e7960c77fe0e405880b0974b0cda2c3a57c", "X-Runtime"=>"0.373335", "X-XSS-Protection"=>"1; mode=block" }, "mimeType"=>"text/html", "connectionReused"=>true, "connectionId"=>36, "remoteIPAddress"=>"127.0.0.1", "remotePort"=>9887, "fromDiskCache"=>false, "fromServiceWorker"=>false, "fromPrefetchCache"=>false, "encodedDataLength"=>1630, "timing"=>{ "requestTime"=>122384.433822, "proxyStart"=>-1, "proxyEnd"=>-1, "dnsStart"=>-1, "dnsEnd"=>-1, "connectStart"=>-1, "connectEnd"=>-1, "sslStart"=>-1, "sslEnd"=>-1, "workerStart"=>-1, "workerReady"=>-1, "workerFetchStart"=>-1, "workerRespondWithSettled"=>-1, "sendStart"=>0.197, "sendEnd"=>0.256, "pushStart"=>0, "pushEnd"=>0, "receiveHeadersEnd"=>461.364 }, "responseTime"=>1671721817863.499, "protocol"=>"http/1.1", "alternateProtocolUsage"=>"unspecifiedReason", "securityState"=>"insecure" }, "type"=>"Document", "frameId"=>"E80A9391E528438362D111598AAB1C2B", "hasUserGesture"=>true }, @request={ "url"=>"http://subdomain000001.lvh.me:9887/overview", "method"=>"GET", "headers"=>{ "Content-Type"=>"application/x-www-form-urlencoded", "Origin"=>"http://subdomain000001.lvh.me:9887", "Referer"=>"http://subdomain000001.lvh.me:9887/", "Upgrade-Insecure-Requests"=>"1", "User-Agent"=>"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/108.0.5359.124 Safari/537.36" }, "mixedContentType"=>"none", "initialPriority"=>"VeryHigh", "referrerPolicy"=>"origin", "isSameSite"=>true } > @response= #"E711BF6D793E1D3EC660F87242702CEF", "loaderId"=>"E711BF6D793E1D3EC660F87242702CEF", "documentURL"=>"http://subdomain000001.lvh.me:9887/setup", "request"=>{ "url"=>"http://subdomain000001.lvh.me:9887/setup", "method"=>"GET", "headers"=>{"Content-Type"=>"application/x-www-form-urlencoded", "Origin"=>"http://subdomain000001.lvh.me:9887", "Referer"=>"http://subdomain000001.lvh.me:9887/", "Upgrade-Insecure-Requests"=>"1", "User-Agent"=>"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML,like Gecko) HeadlessChrome/108.0.5359.124 Safari/537.36"}, "mixedContentType"=>"none", "initialPriority"=>"VeryHigh", "referrerPolicy"=>"origin", "isSameSite"=>true }, "timestamp"=>122385.173023, "wallTime"=>1671721818.141461, "initiator"=>{ "type"=>"other" }, "redirectHasExtraInfo"=>true, "redirectResponse"=>{ "url"=>"http://subdomain000001.lvh.me:9887/overview", "status"=>302, "statusText"=>"Found", "headers"=>{ "Cache-Control"=>"no-store", "Content-Length"=>"203", "Content-Type"=>"text/html; charset=utf-8", "Feature-Policy"=>"accelerometer 'none'; autoplay 'none'; ambient-light-sensor 'none'; camera 'none'; encrypted-media 'none'; fullscreen 'self'; geolocation 'none'; gyroscope 'none'; magnetometer 'none'; microphone 'none'; midi 'none'; payment 'none'; picture-in-picture 'none'; speaker 'none'; usb 'none'; vibrate 'none'; vr 'none'", "Location"=>"http://subdomain000001.lvh.me:9887/setup", "Pragma"=>"no-cache", "X-Frame-Options"=>"SAMEORIGIN", "X-Host"=>"mac", "X-Request-Id"=>"fc59edb4-4158-48bc-b64e-761dc432ef25", "X-Rev"=>"2aad5e7960c77fe0e405880b0974b0cda2c3a57c", "X-Runtime"=>"0.189034", "X-XSS-Protection"=>"1; mode=block" }, "mimeType"=>"text/html", "connectionReused"=>true, "connectionId"=>36, "remoteIPAddress"=>"127.0.0.1", "remotePort"=>9887, "fromDiskCache"=>false, "fromServiceWorker"=>false, "fromPrefetchCache"=>false, "encodedDataLength"=>1761, "timing"=>{ "requestTime"=>122384.896961, "proxyStart"=>-1, "proxyEnd"=>-1, "dnsStart"=>-1, "dnsEnd"=>-1, "connectStart"=>-1, "connectEnd"=>-1, "sslStart"=>-1, "sslEnd"=>-1, "workerStart"=>-1, "workerReady"=>-1, "workerFetchStart"=>-1, "workerRespondWithSettled"=>-1, "sendStart"=>0.146, "sendEnd"=>0.185, "pushStart"=>0, "pushEnd"=>0, "receiveHeadersEnd"=>275.388 }, "responseTime"=>1671721818140.693, "protocol"=>"http/1.1", "alternateProtocolUsage"=>"unspecifiedReason", "securityState"=>"insecure" }, "type"=>"Document", "frameId"=>"E80A9391E528438362D111598AAB1C2B", "hasUserGesture"=>true } @response={ "url"=>"http://subdomain000001.lvh.me:9887/overview", "status"=>302, "statusText"=>"Found", "headers"=>{ "Cache-Control"=>"no-store", "Content-Length"=>"203", "Content-Type"=>"text/html; charset=utf-8", "Feature-Policy"=>"accelerometer 'none'; autoplay 'none'; ambient-light-sensor 'none'; camera 'none'; encrypted-media 'none'; fullscreen 'self'; geolocation 'none'; gyroscope 'none'; magnetometer 'none'; microphone 'none'; midi 'none'; payment 'none'; picture-in-picture 'none'; speaker 'none'; usb 'none'; vibrate 'none'; vr 'none'", "Location"=>"http://subdomain000001.lvh.me:9887/setup", "Pragma"=>"no-cache", "X-Frame-Options"=>"SAMEORIGIN", "X-Host"=>"mac", "X-Request-Id"=>"fc59edb4-4158-48bc-b64e-761dc432ef25", "X-Rev"=>"2aad5e7960c77fe0e405880b0974b0cda2c3a57c", "X-Runtime"=>"0.189034", "X-XSS-Protection"=>"1; mode=block" }, "mimeType"=>"text/html", "connectionReused"=>true, "connectionId"=>36, "remoteIPAddress"=>"127.0.0.1", "remotePort"=>9887, "fromDiskCache"=>false, "fromServiceWorker"=>false, "fromPrefetchCache"=>false, "encodedDataLength"=>1761, "timing"=>{ "requestTime"=>122384.896961, "proxyStart"=>-1, "proxyEnd"=>-1, "dnsStart"=>-1, "dnsEnd"=>-1, "connectStart"=>-1, "connectEnd"=>-1, "sslStart"=>-1, "sslEnd"=>-1, "workerStart"=>-1, "workerReady"=>-1, "workerFetchStart"=>-1, "workerRespondWithSettled"=>-1, "sendStart"=>0.146, "sendEnd"=>0.185, "pushStart"=>0, "pushEnd"=>0, "receiveHeadersEnd"=>275.388 }, "responseTime"=>1671721818140.693, "protocol"=>"http/1.1", "alternateProtocolUsage"=>"unspecifiedReason", "securityState"=>"insecure" } > @error=nil > ```

This output was from a forked branch with some extra logging:

Admittedly this is only seen when we're calling Cuprite's wait_for_network_idle directly within a spec (something we're looking to remove), but it looks like something that will trip Ferrum up if wait_for_idle gets called and there's been a redirect.

route commented 1 year ago

Oh I see, could you please try to add check for a redirect status here, something like:

    def subscribe_response_received
      @page.on("Network.responseReceived") do |params|
        exchange = select(params["requestId"]).last

        if exchange
          response = Network::Response.new(@page, params)
          exchange.response = response
          response.loaded = true if response.status is redirect # <<< HERE
        end
      end
    end

and tell me if it works?

dmorgan-fa commented 1 year ago

Thanks for the quick reply! Unfortunately that doesn't seem to resolve it. Looks like we're stuck checking loaded?:

def loaded?
  if @loaded.nil?
    puts params['request']['url'] if params['request']
    puts params['redirectResponse']['status'] if params['redirectResponse']
  end
  @loaded
end
Output ``` Capybara starting Puma... * Version 6.0.0 , codename: Sunflower * Min threads: 0, max threads: 4 * Listening on http://0.0.0.0:9887 http://subdomain000001.lvh.me:9887/overview 302 http://subdomain000001.lvh.me:9887/setup 302 http://subdomain000001.lvh.me:9887/setup/accounting_dates 302 http://subdomain000001.lvh.me:9887/setup/vat_details 302 http://subdomain000001.lvh.me:9887/overview 302 http://subdomain000001.lvh.me:9887/setup 302 http://subdomain000001.lvh.me:9887/setup/accounting_dates 302 http://subdomain000001.lvh.me:9887/setup/vat_details 302 http://subdomain000001.lvh.me:9887/overview 302 http://subdomain000001.lvh.me:9887/setup 302 [...] # And on until it hits the timeout ```

A little more debugging:

def subscribe_response_received
  @page.on("Network.responseReceived") do |params|
    exchange = select(params["requestId"]).last

    if exchange
      response = Network::Response.new(@page, params)
      exchange.response = response
      puts "#{response.status} - #{response.url}"
      binding.pry if response.status != 200
      binding.pry if params['redirectResponse']['status'] == 302
      response.loaded = true if response.status is redirect
    end
  end
end
Output ``` [...] 200 - http://subdomain000001.lvh.me:9887/assets/talaria/source_sans/WOFF/OTF/SourceSansPro-Semibold.otf-e925383d18cb0ed94addee8444adf985ca8556fa4f3a0952088b57ab97c74d1f.woff 200 - http://subdomain000001.lvh.me:9887/assets/application/setup-clouds-1dfc5d93caf668f1ab11ada5ec2c2319d24475530d5b43480e53d5579d5b12d0.gif 200 - http://subdomain000001.lvh.me:9887/assets/application/setup-illustrations-1-ffa28978d62225e2e17eb6656e54ec46ab48fad94c95a8b6c86e84bdaa53d5c7.png 200 - http://subdomain000001.lvh.me:9887/assets/application/setup-illustrations-2-940cf4b3edf82e13955652ab0bb4a7a16e4936a899815d626ef4141c18fa873e.png 200 - http://subdomain000001.lvh.me:9887/assets/application/setup-illustrations-4-22726e5e0bca7e7901c55fb8897e34facffca81341456c8578bd58fa7bd952d8.png 200 - http://subdomain000001.lvh.me:9887/assets/application/setup-illustrations-5-89e61798f93d1a98e03132e94a9598db1e9847908be87dbdd31993b10f116059.png 200 - http://subdomain000001.lvh.me:9887/assets/talaria/components/icons/selectArrow-3fe18e3b8316bd20f03a1b32bc39a2bdda901fce760b51700316f5ccc60e0f91.svg 200 - http://subdomain000001.lvh.me:9887/assets/talaria/components/icons/cal-ecb2f5f5c59aab84c061122adbcc6f72d394397eb5cf816561fcbbb075832092.svg 200 - http://subdomain000001.lvh.me:9887/assets/talaria/components/icons/dot-42ac9b055cfef7d81dd982c462e5fc611e30f6b4248a47a1ab03832192e3b812.svg http://subdomain000001.lvh.me:9887/overview 302 http://subdomain000001.lvh.me:9887/setup 302 http://subdomain000001.lvh.me:9887/setup/accounting_dates 302 http://subdomain000001.lvh.me:9887/setup/vat_details 302 http://subdomain000001.lvh.me:9887/overview [...] # And on until it hits the timeout ```
francisbeaudoin commented 1 year ago

@dmorgan-fa As you already worked on this, is the proposed fix in https://github.com/rubycdp/ferrum/pull/328 making sense?

dmorgan-fa commented 1 year ago

@francisbeaudoin, sorry for the delayed response. I've just tested this and it does resolve the issue we were seeing, thanks! We've since moved away from explicitly calling Cuprite's wait_for_network_idle, but it's a good addition to fix the Ferrum library 👍