SeleniumHQ / selenium

A browser automation framework and ecosystem.
https://selenium.dev
Apache License 2.0
29.73k stars 8.02k forks source link

[🚀 Feature]: Ability to customize driver process start (.net) #14175

Open rasmushalland opened 1 week ago

rasmushalland commented 1 week ago

Feature and motivation

Hi,

I recently switched from using RemoteWebDriver and launching chromedriver myself to using ChromeDriver to get access to chrome debugging protocol features. Launching chromedriver myself allowed me to launch it in a Windows "job object" (https://learn.microsoft.com/en-us/windows/win32/procthread/job-objects), so that chromedriver and chrome processes are killed by windows when the test process terminates, thereby avoiding leaked process that stay around. A search for related issues in this repo suggests that containing the driver processes would solve some issues that people are having (https://github.com/SeleniumHQ/selenium/issues?q=is%3Aissue+chromedriver+process+label%3AG-chromedriver).

But ChromeDriverService seems to not allow much flexibility as to how to launch the process. I've made it work using some ugly reflection and exception throwing, but I think that there is a better way to allow this, and that way is quite simple.

Using windows job objects is obviously quite windows specific, but the same kind of process containment would probably also be attractive on other OSes, using their respective mechanisms for doing so.

What I currently do is this:

sealed class ProcStartBreakException : Exception { }

static Process CustomStartProcess(ProcessStartInfo startInfo)
{
    // ...
}

public static Process StartWebDriverService()
{
    Process? webDriverProcess = null;
    DriverService ds = ChromeDriverService.CreateDefaultService();
    ds.DriverProcessStarting += (sender, args) => {
        webDriverProcess = CustomStartProcess(args.DriverServiceProcessStartInfo);
        throw new ProcStartBreakException();
    };

    try
    {
        ds.Start();
        ds.Dispose();
        throw new Exception("No exception was thrown? Got proc: " + (webDriverProcess != null));
    }
    catch (ProcStartBreakException)
    {
        // great.
    }

    if (webDriverProcess == null)
    {
        ds.Dispose();
        throw new Exception("Did not get process?");
    }

    {
        var procfi = typeof(DriverService)
            .GetField("driverServiceProcess", BindingFlags.Instance | BindingFlags.NonPublic)
            .ValueEx();
        procfi.SetValue(ds, webDriverProcess);
    }
    {
        var waitMi = typeof(DriverService)
            .GetMethod("WaitForServiceInitialization", BindingFlags.Instance | BindingFlags.NonPublic, null, Type.EmptyTypes, null)
            .ValueEx();
        var wait = waitMi.CreateDelegate<Func<DriverService, bool>>();
        wait(ds);
    }

    return webDriverProcess;
}

That is, I use the DriverProcessStarting event to start the process in a custom way, then interrupt the ds.Start() call by throwing an exception, and finally use reflection to perform the steps that ds.Start() would normally perform, except for actually starting the process.

Proposed solution:

Add a property like public Process? CustomLaunchedProcess { get; set; } to the args argument of the DriverProcessStarting event: If the event handler wants to start the process, after doing so it store the Process object in CustomLaunchedProcess. Upon return from the event handler, ds.Start() observes that the property contains a process, and uses that instead of launching it.

If this solution is acceptable, I could probably make a PR.

Alternatives:

  1. I could react to the other event, DriverProcessStarted, and add the process to the Windows job. That would be less robust: It would create a window during which the process is not in a job and therefore would leak, were the calling process to terminate. It is a small window, but the problem is not theoretical, as explained by Raymon Chen: https://devblogs.microsoft.com/oldnewthing/20230209-00/?p=107812. I would also not be surprised if containment techniques on other platforms don't support adding an existing process, but require that the process is started in a special way.
  2. Place the calling process in a job object that is configured to be inherited by all child processes. That would solve this problem, but also be a very intrusive solution, since it affects all child process, not just webdriver related ones.

Usage example

After running browser tests, there would not be leaked processes around. Also, a mechanism like this might make it easier to establish additional security around the browser processes.

github-actions[bot] commented 1 week ago

@rasmushalland, thank you for creating this issue. We will troubleshoot it as soon as we can.


Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!