SeleniumHQ / selenium

A browser automation framework and ecosystem.
https://selenium.dev
Apache License 2.0
29.76k stars 8.02k forks source link

[🚀 Feature]: BiDi: Add low-level helper method to subscribe/unsubscribe from events #14201

Open Mr0grog opened 6 days ago

Mr0grog commented 6 days ago

Feature and motivation

I’ve started playing around with the BiDi support in the Ruby webdriver, and have noticed that subscribing to and unsubscribing from events is fairly complicated: you need to keep track of the event callbacks that have been added to the session (including by other actors) so you know whether to send a session.subscribe/session.unsubscribe command alongside adding/removing your callback.

For example, the Ruby LogHandler does this: https://github.com/SeleniumHQ/selenium/blob/164bf7944b6f7a343de792a5ee48f694286dac3c/rb/lib/selenium/webdriver/bidi/log_handler.rb#L44-L59

The Python source follows a similar pattern in Script: https://github.com/SeleniumHQ/selenium/blob/164bf7944b6f7a343de792a5ee48f694286dac3c/py/selenium/webdriver/common/bidi/script.py#L38-L52

I don’t know if this is a pattern that has crept in or a standard design/set of interfaces that are expected to exist across all the languages. Either way, having the subscriber need to keep track of who else might be subscribed seems like a problem — it adds a bit of complexity everywhere events are used and it’s easy to mess up if a user needs to listen to events that aren’t currently handled by a higher-level wrapper.

It would be really helpful if there were a subscribe or listen_to_event or some similar method (plus the corresponding unsubscribe method) on the low-level BiDi API that adds your callback and calls session.subscribe/session.unsubscribe as appropriate, e.g:

module Selenium
  module WebDriver
    class BiDi
      # This matches what `Script` does, but I imagine a more robust version of
      # this might also want to take an optional list of contexts and
      # subscribe/unsubscribe only the appropriate contexts. Then the handling
      # of callbacks might also filter based on what context the event is
      # coming from, rather than just hooking your callback directly up to the
      # @ws instance.
      def add_callback(event, &block)
        session.subscribe(event) if callbacks[event].empty?
        @ws.add_callback(event, &block)
      end

      def remove_callback(event, id)
        @ws.remove_callback(event, id)
        session.subscribe(event) if callbacks[event].empty?
      end

(BiDi seems like the obvious place to put this level of abstraction in Ruby, but the Python library currently has nothing sitting in-between the higher-level APIs like Script and the actual websocket connection. I have not looked at the other language packages.)

Usage example

Currently, subscribing/unsubscribing to BiDi events requires code like (this is Ruby, but it’s similar in Python and maybe other languages):

# Subscribe to navigation events:
driver_instance.bidi.session.subscribe('browsingContext.fragmentNavigated')
callback_id = driver_instance.bidi.add_callback('browsingContext.fragmentNavigated') do |params|
  # Do something with the event.
  puts "Fragment Navigated: #{params}"
end

# Do some work...

# Unsubscribe:
driver_instance.bidi.remove_callback('browsingContext.fragmentNavigated', callback_id)
driver_instance.bidi.session.unsubscribe('browsingContext.fragmentNavigated') if driver_instance.bidi.callbacks['browsingContext.fragmentNavigated'].empty?

Ideally, subscribing/unsubscribing would be one-liners without any conditionals:

callback_id = driver_instance.bidi.subscribe('browsingContext.fragmentNavigated') do |params|
  # Do something with the event.
  puts "Fragment Navigated: #{params}"
end

# Do some work...

# Unsubscribe:
driver_instance.bidi.unsubscribe('browsingContext.fragmentNavigated', callback_id)

Even nicer if allows filtering by context, but maybe not as critical?

callback_id = driver_instance.bidi.subscribe('browsingContext.fragmentNavigated', contexts: [a_context]) do |params|
  # Do something with the event.
  puts "Fragment Navigated: #{params}"
end

# Do some work...

# Unsubscribe (from which context is handled for you based on callback_id):
driver_instance.bidi.unsubscribe('browsingContext.fragmentNavigated', callback_id)
github-actions[bot] commented 6 days ago

@Mr0grog, thank you for creating this issue. We will troubleshoot it as soon as we can.


Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!

titusfortner commented 5 days ago

I'm not sure I'm following your concern.

The BiDi classes are all private API (I thought I'd marked them as such, need to add that). The plan is to provide nice wrapper methods to access everything. That means we are assuming that the driver is the source of truth and storing state. Users should never need to call driver_instance.bidi, users should never need to know the names of the domains they are subscribing to. If they do something that needs that subscription, it happens for them.

Subscribing/unsubscribing happens automatically when you use the API:

id = driver.script.add_console_message_handler { |log| log_entries << log }
driver.script.remove_console_message_handler(id)

So we'll probably have something like (domains and method names tbd):

id = driver.navigation.add_handler { |event| log_navigation << event}
driver.navigation.remove_handler(id)
Mr0grog commented 5 days ago

Ah, I guess this was my fundamental misunderstanding:

The BiDi classes are all private API (I thought I'd marked them as such, need to add that). The plan is to provide nice wrapper methods to access everything.

I assumed the low-level API was meant to be available for cases the high-level stuff does not (yet) cover, or when new BiDi stuff eventually shows up in various drivers before high-level wrappers show up, or as a general escape hatch from the way something is handled in the high-level wrappers (calling the wrappers "high-level API" all over the place also implied to me that there is meant low-level API to drop down to in this way).

Seeing all the discussion around deprecating devtools, etc. (both here and from vendors, e.g. https://fxdx.dev/deprecating-cdp-support-in-firefox-embracing-the-future-with-webdriver-bidi/) suggested to me that I should be getting started with BiDi. Since there is hardly any high-level coverage, I thought I was supposed to be using the low-level stuff for now. Is the actual intent that I should just not use BiDi yet? (Excepting very narrow use cases that have been covered.)

titusfortner commented 5 days ago

low-level API was meant to be available for cases the high-level stuff does not (yet) cover

This is also true! We want the classes to be available if someone wants to dig in and do interesting things (and then tell us that their use case should have a high-level API to make it easier). So, with that in mind, if there is something we can make easier in the low-level API as we build it out, we should do so. We've only got one implementation so far, so I'm assuming there are going to be some additional abstractions we'll be able to make to share patterns, and I just don't know them, yet.

titusfortner commented 5 days ago

@Mr0grog would love to discuss more on Slack/Matrix/IRC - https://www.selenium.dev/support/#ChatRoom