SeleniumHQ / selenium

A browser automation framework and ecosystem.
https://selenium.dev
Apache License 2.0
30.72k stars 8.19k forks source link

[dotnet] [bidi] Earlier preview feedback gathering #14530

Open nvborisenko opened 1 month ago

nvborisenko commented 1 month ago

Feature and motivation

Here we are going to gather everything related to BiDi implementation in .NET and improve it as soon as possible despite on any potential breaking changes.

1. Discriminated unions ✅ v4.26

We have a lot of classes what are inherited from a base. The basic example:

public abstract record ClipRectangle;

public record BoxClipRectangle(double X, double Y, double Width, double Height) : ClipRectangle;

public record ElementClipRectangle(Script.SharedReference Element) : ClipRectangle;

ClipRecatange is used like:

var screenshot = await context.CaptureScreenshotAsync(....); // it takes ClipRectangle as argument

And it is not clear what exactly I can put as arguments. I see base class as an argument, but I don't see what available options I can provide. We can add factory:

var screenshot = await context.CaptureScreenshotAsync(ClipRectangle.Box(5, 5, 10, 10));

But it requires to write so many boilerplate code from selenium team. Don't forget about optional parameters (which requires new class definition). So many code.

Solution

Use nested classes for all discriminated classes.

public abstract record ClipRectangle {
  public record Box(double X, double Y, double Width, double Height) : ClipRectangle;

  public record Element(Script.SharedReference Element) : ClipRectangle;
}

So user will be able to:

var screenshot = await context.CaptureScreenshotAsync(new ClipRectangle.Box(5, 5, 10, 10));

For user it seems there is no big diff, but for selenium team it is HUGE diff. And when https://github.com/dotnet/csharplang/blob/main/proposals/TypeUnions.md will be in place, it will everybody make happy: selenium team to write even less code, and user probably will write: var screenshot = await context.CaptureScreenshotAsync(new Box(5, 5, 10, 10));

2. Don't mimic BiDi instance as BrowsingContext ✅ v4.26

We have helper methods in BiDi class which actually forward to BrowsingContext module. Example:

await bidi.CreateContextAsync(...);

Stop doing it and just expose modules. So it would be better:

await bidi.BrowsingContext.CreateAsync(...);

3. Result object as Enumerable ✅ v4.26

Some commands return result, which seems to be a list of items.

var result = await context.Storage.GetCookiesAsync();

Where result is:

storage.GetCookiesResult = {
  cookies: [*network.Cookie],
  partitionKey: storage.PartitionKey,
}

So result is GetCookiesResult class, and it would be great if it behaves as enumerable.

Solution

Implement IReadOnlyList<T>. So user is able to:

var cookies = await context.Storage.GetCookiesAsync();

Console.WriteLine(cookies[0].Name);
Console.WriteLine(cookies.PartitionKey);

And it will be also good to rename result class to CookiesList (or CookiesReadOnlyList or CookiesCollection?)

4. driver.AsBiDiAsync() or driver.AsBiDiContextAsync() - only one should alive

We have 2 entry points into BiDi world, the both are useful. The problem is that driver.AsBidiContextAsync() holds underlying bidi connection, which cannot be disposed. And moreover, should not be disposed. So seems, only one driver.AsBiDiAsync() should be as single entry point.

Then the question is: if user has bidi instance, then how he can get an access to current BrowsingContext instance?

Solution

Keep only one entry point: driver.AsBiDiAsync(). And provide a way how to instantiate an instance of BrowsingContext.

await using var bidi = await driver.AsBiDiAsync();

var contexts = await bidi.BrowsingContext.GetTreeAsync(...); // already available

var context = new BrowsingContext(bidi, driver.Manage().CurrentWindowHandle); // is it good?
// or
var context = bidi.BrowsingContext.Create(driver.Manage().CurrentWindowHandle); // is it good?

Given that this is low-level API, I would like to not introduce any kind of "helper" methods.

github-actions[bot] commented 1 month ago

@nvborisenko, thank you for creating this issue. We will troubleshoot it as soon as we can.


Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!

S-Kulyk commented 1 month ago

Wanted to get hands on the new BiDi interface, but struggled due to the lack of documentation starting from getting the BiDi object: when should driver.AsBiDiAsync() and driver.AsBidiContextAsync() be used?

Would be great to have similar samples to those that already exist for CDP https://www.selenium.dev/documentation/webdriver/bidi/cdp/network/

I understand that it might sound more like complaint rather than a feedback, but it's safe to assume I won't be the only one who will have similar questions when tinkering with BiDI

nvborisenko commented 1 month ago

Sorry for that, it even may a subject for change. This is why we are collecting any feedback.

Returning back to the question:

driver.AsBiDiAsync(); // returns an object who is on top of all "tabs"
driver.AsBidiContextAsync(); // returns the current "tab"

We are still lack of documentation, construction of API is a priority. And then docs will be in place.

nvborisenko commented 2 weeks ago

This is addressed and will be a part of v4.26, looking forward new feedbacks.

RenderMichael commented 1 week ago

For the ClipRectangle examples, we could have static factory methods which users can call:

[JsonPolymorphic(TypeDiscriminatorPropertyName = "type")]
[JsonDerivedType(typeof(Box), "box")]
[JsonDerivedType(typeof(Element), "element")]
public abstract record ClipRectangle
{
    internal record Box(double X, double Y, double Width, double Height) : ClipRectangle;

    internal record Element([property: JsonPropertyName("element")] Script.SharedReference SharedReference) : ClipRectangle;

    public static ClipRectangle FromBox(double x, double y, double Width, double Height)
    {
        return new Box(x, y, Width, Height);
    }

    public static ClipRectangle FromElement(Script.SharedReference element)
    {
        return new Element(element);
    }
}

And the users would just call these methods, and not have to deal with the potential subtypes:

[Test]
public async Task CanCaptureScreenshotOfViewport()
{
    var screenshot = await context.CaptureScreenshotAsync(new()
    {
        Origin = Origin.Viewport,
        Clip = ClipRectangle.FromBox(5, 5, 10, 10)
    });

    Assert.That(screenshot, Is.Not.Null);
    Assert.That(screenshot.Data, Is.Not.Empty);
}

This way, user ergonomics is preserved (it's an ideomatic style to have X X.From*(...) methods in .NET) and Selenium can handle the implementation as necessary.

nvborisenko commented 1 week ago

Right, I also considered static factory as a primary way to instantiate nested objects. Let's compare..

Today:

record Box(double X, double Y, double Width, double Height) : ClipRectangle;

Tomorrow:

record Box(double X, double Y, double Width, double Height) : ClipRectangle
{
    public int? Scale { get; set; }
}

In case of native ctor:

var box = new ClipRectangle.Box(5, 5, 10, 10) { Scale = 90 };

In case of static factory:

var box = ClipRectangle.CreateBox(5, 5, 10, 10) with { Scale = 90 };

Where with keyword is kind of unknown. And seems it allocates 2 objects (just guess, hopefully compiler is smart). And using with keyword looks the shortest statement in C#. Interesting how other CLS compliant languages will look..

There are so many sugar we can apply, but we should keep in mind that it is low-level API, it should be straightforward, simple, reliable. Any sugar needs effort, let's stay this greenfield for others. For others, who will write extensions.

RenderMichael commented 1 week ago

Tomorrow:

record Box(double X, double Y, double Width, double Height) : ClipRectangle
{
    public int? Scale { get; set; }
}

In this specific example I would add a new static method:

public static ClipRectangle CreateBox(double x, double y, double width, double height, int? scale) => new Box(x, y, width, height, scale);

public static ClipRectangle CreateBox(double x, double y, double width, double height) => new Box(x, y, width, height, Scale: null);

or add an optional parameter to the existing method (binary breaking change, shouldn't be a real issue)

public static ClipRectangle CreateBox(double x, double y, double width, double height, int? scale = null) => new Box(x, y, width, height, scale);

Where with keyword is kind of unknown. And seems it allocates 2 objects (just guess, hopefully compiler is smart). And using with keyword looks the shortest statement in C#. Interesting how other CLS compliant languages will look.

I agree the with keyword maybe isn't the right call. .NET 9 is only beginning to experiment with on-stack object allocations for narrow scenarios, so the JIT will probably end up making 2 objects.

Both solutions work, the question is mostly of taste (and how stable the BiDi API is going to be). I think the static methods are more discoverable than nested classes, but there's nothing wrong with the nested type approach.