rubycdp / ferrum

Headless Chrome Ruby API
https://ferrum.rubycdp.com
MIT License

Adding Header using browserless 2 / empty user agent #491

Open bk-one opened 2 months ago

bk-one commented 2 months ago

I have a very simple RSpec test that has been failing since I updated to browserless 2 (via Docker). The spec tries to set a header before sending the first request. I'm using Ruby 3.3 on Rails 7.1 with Capybara and Cuprite. I'm basically following the steps in https://evilmartians.com/chronicles/system-of-a-test-setting-up-end-to-end-rails-testing, but with the browserless container updated from 1 to 2.

before do
  page.driver.add_headers({ 'X-App-Version' => '1.0.4(79)' })
end

This raises a cryptic error:

eval error: Invalid parameters
  /usr/local/bundle/bundler/gems/ferrum-5ca5e9ed5e9a/lib/ferrum/client.rb:167:in `raise_browser_error'
  /usr/local/bundle/bundler/gems/ferrum-5ca5e9ed5e9a/lib/ferrum/client.rb:95:in `send_message'
  /usr/local/bundle/bundler/gems/ferrum-5ca5e9ed5e9a/lib/ferrum/client.rb:24:in `command'

After investigating a bit, I saw that the default user agent is nil at the point where the header is set, so the first Headers#set_overrides call tries to set a nil user agent:

{:method=>"Network.setUserAgentOverride", :params=>{:userAgent=>nil, :id=>81, :sessionId=>"D244F2C83EED7F439C74A4EAFDF5D218"}
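For context, the Chrome DevTools Protocol declares userAgent as a required string parameter of Network.setUserAgentOverride, which is consistent with a nil value being rejected as "Invalid parameters". A minimal sketch of a guard that would fail earlier and with a clearer message; user_agent_override_params is a hypothetical helper for illustration only and is not part of Ferrum:

```ruby
# Hypothetical helper, not part of Ferrum: builds the params for
# Network.setUserAgentOverride and fails early on a missing user agent.
def user_agent_override_params(user_agent:, accept_language: nil, platform: nil)
  if user_agent.nil? || user_agent.to_s.strip.empty?
    raise ArgumentError, "userAgent is required by Network.setUserAgentOverride"
  end

  # Optional CDP fields are dropped when unset so that no nil values are sent.
  { userAgent: user_agent.to_s, acceptLanguage: accept_language, platform: platform }.compact
end
```

With a guard like this, a nil user agent would surface as an ArgumentError in Ruby instead of a protocol-level error from the browser.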

I did not investigate further why the default user agent is empty in that case. Explicitly setting a user agent like this solves the problem:

before do
  page.driver.browser.options.default_user_agent = 'Mozilla/5.0 (X11; Linux x86_64)'
  page.driver.add_headers({ 'X-App-Version' => '1.0.4(79)' })
end

It's probably worth noting that browserless 2 does provide a user agent in the initial /json/version data. It returns the following JSON:

"Browser": "Chrome/129.0.6668.29",
"Protocol-Version": "1.3",
"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/129.0.0.0 Safari/537.36",
"V8-Version": "12.9.202.10",
"WebKit-Version": "537.36 (@0000000000000000000000000000000000000000)",
"webSocketDebuggerUrl": "ws://0.0.0.0:3000",
"Debugger-Version": "0000000000000000000000000000000000000000"
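A minimal sketch of reading the "User-Agent" field from such a response body. user_agent_from_version is a hypothetical helper name, and returning nil for a non-JSON body (for example, an error response) is an assumption about how one might want to handle that case, not a description of what Ferrum does:

```ruby
require "json"

# Hypothetical helper: returns the "User-Agent" value from a /json/version
# response body, or nil when the body is not valid JSON or lacks the key.
def user_agent_from_version(body)
  data = JSON.parse(body.to_s)
  data.is_a?(Hash) ? data["User-Agent"] : nil
rescue JSON::ParserError
  nil
end
```

A nil result here would be one plausible way to end up with an empty default user agent.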

However, since browserless 2 always requires a token to fetch that result, maybe that is the reason for the empty default user agent. I found these lines in the browserless output on the first requests:

2024-09-10 08:11:14   browserless.io:server:trace  Handling inbound HTTP request on "GET: /json/version" +0ms
2024-09-10 08:11:14   browserless.io:server:trace  Found matching HTTP route handler "/json/version?(/)" +4ms
2024-09-10 08:11:14   browserless.io:server:trace  Authorizing HTTP request to "/json/version" +0ms
2024-09-10 08:11:14   browserless.io:server:error  HTTP request is not properly authorized, responding with 401 +0ms

That's just an assumption, though. The specs themselves work just fine; I'm initializing Cuprite/Ferrum with only a ws_url that includes the token.
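One way to test that assumption would be to request /json/version with the same token the ws_url carries. A minimal sketch, assuming the token is passed as a query string on the ws_url (the host name and the token value in the usage are placeholders) and that browserless accepts the same query string on /json/version; version_url_for is a hypothetical helper, not a Ferrum or Cuprite method:

```ruby
require "uri"

# Hypothetical helper: maps a ws(s):// endpoint to the matching
# http(s):// /json/version URL, keeping the query string (e.g. the token).
def version_url_for(ws_url)
  uri = URI.parse(ws_url)
  builder = uri.scheme == "wss" ? URI::HTTPS : URI::HTTP
  builder.build(host: uri.host, port: uri.port, path: "/json/version", query: uri.query).to_s
end

# Usage with placeholder values:
#   version_url_for("ws://chrome:3000?token=secret")
```

Requesting the resulting URL (for example with Net::HTTP.get_response) and comparing the status code against a request without the query string would show whether the 401 in the log is tied to the missing token.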