Decouple session management from test execution

codiophile commented 5 years ago

Is your feature request related to a problem? Please describe. Since I started using webdriverio two years ago, the main problem I've had is the execution model, specifically that session management is tightly coupled with test execution. One spec file equals one session. There is no possibility to have different desired capabilities for different scenarios and the size of a spec file is limited by the session timeout mandated by your device/browser provider. We are in fact running a custom version of webdriverio 4 with custom hooks for improved session management, but we are still using the same execution model.

Describe the solution you'd like When moving to webdriverio 5, we would like to move away from this execution model. The easiest way of managing this is to let each scenario create their own session, but that can be quite a bit of overhead depending on how expensive your session creation is, so reusing sessions makes sense. What I'm proposing is to have a session manager that keeps track of all active sessions and creates new ones when needed. Each scenario will ask the session manager for a session with specific capabilities and if one exists the scenario will get that one, otherwise a new one will be created.

To keep it simple and somewhat backwards compatible, this could be handled automatically in the background. If a scenario starts using the browser object without asking for a session first, it will be assumed that the scenario wants a session with the default capabilities. The browser object will acquire a session from the session manager and perform the requested action, without the user having to worry about interacting with the session manager.

Describe alternatives you've considered If this is not built into webdriverio, I could write my own session manager that uses webdriverio in standalone mode and use cucumber directly to drive the execution. This has the disadvantage of not being able to use custom wdio reporters and other things integrated with the wdio test execution.

Alternatively I might be able to use webdriverio's test execution and instead of using the browser object, use my own standalone webdriverio sessions. However, I'm not sure that's possible, as I'm not sure if you can run in both modes simultaneously.

Additional context We have a scenario where we need to interact with two apps on the same device. Our standard session only installs and opens one of the apps, but for a small subset of scenarios we need a session that installs both apps and opens the other app. We're not sure how to implement this, but it's clear that we need more control over session management than we currently have. Any advice on how to solve this problem will also be appreciated.

mgrybyk commented 5 years ago

@codiophile as a suggestion you can start test with the main app you are testing in capabilities. When needed you can change capabilities to use another app for next session in runtime and reload session within same spec file. Now you have both your apps installed. Run/terminate/open both of them with Appium

tejasv02 commented 5 years ago

@mgrybyk - does it mean both sessions can be active at same time ? If so how to switch between sessions ?

mgrybyk commented 5 years ago

@tejasv02 no, it doesn't. You can have only one active session same time unless you are using MultiRemote. You won't be able to have two active sessions to same mobile device, I think it's Appium limitation.

However Appium allows to run/open not just apps that are specified in capabilities.

christian-bromann commented 5 years ago

Have you considered to use Multiremote?

mgrybyk commented 5 years ago

Additionally, if you want to keep session across multiple tests you can create a spec file that includes all your tests.

Example structure

test-actions/
- t1.js
- t2.js

tests/
- test.js

test.js can look like this

require('../test-actions/t1.js')
require('../test-actions/t2.js')

codiophile commented 5 years ago

@mgrybyk Let's see if I understood you correctly. The refactoring is coming, but until it does, which might be a while, any improvements to session management is not to be expected.

To solve our problem in the meantime, you are suggesting that we do a browser.reloadSession(). The documentation says that it "Creates a new Selenium session with your current capabilities." It doesn't mention anything about changing the capabilities. Would the following work?

browser.desiredCapabilities.app = "the other app";
browser.reloadSession();

mgrybyk commented 5 years ago

@codiophile you can change browser.options.requestedCapabilities. You need to be careful there because requestedCapabilities has a special format.

Then you can reloadSession

christian-bromann commented 5 years ago

There is no possibility to have different desired capabilities for different scenarios

You can just split the scenario in multiple files.

and the size of a spec file is limited by the session timeout mandated by your device/browser provider

Not sure what kind of timeout you are talking about. As long as you send commands over the wire the session can go endlessly. This is at least how it is on desktop but I doubt a device shuts down when running long automation tests.

What I'm proposing is to have a session manager that keeps track of all active sessions and creates new ones when needed.

When are new sessions needed?

but for a small subset of scenarios we need a session that installs both apps and opens the other app.

Have you considered to run two wdio calls? One that is only using the app with one app and another one that is running a multiremote test to use both apps.

but it's clear that we need more control over session management than we currently have

I don't see the necessity to change anything in the current session management. One session per spec file is a simple concept that is easy to understand. There is no reason to shoehorn different automation scenarios into one test run. You always can create multiple configs that run different test files based on their requirements.

@codiophile please elaborate on what kind of problem you are trying to solve here , what your solution is you propose and how it helps the users of the project. I am happy to consider any sort of change to the framework if it makes sense to the majority of users but refuse to make any complex changes to make it fit for a single person.

tejasv02 commented 5 years ago

Additionally, if you want to keep session across multiple tests you can create a spec file that includes all your tests.
@mgrybyk 
We are using Cucumber JS, Any suggestions ?

mgrybyk commented 5 years ago

Can't help with Cucumber

mgrybyk commented 5 years ago

After some investigation I realized that keeping session or any shared object across multiple child processes is something not easy to do. It might be easier to create another runner instead for particular needs at least at this point

christian-bromann commented 5 years ago

ping @codiophile .. can you address the comments?

It might be easier to create another runner

The current runner (@wdio/local-runner) is a plugin like any a service or a reporter. Its interface can be copied and modified that it spawns a child process running multiple specs. So technically anyone can create his own runner that manages sessions differently.

codiophile commented 5 years ago

@mgrybyk Thank you for investigating this further. :)

The current runner (@wdio/local-runner) is a plugin like any a service or a reporter. Its interface can be copied and modified that it spawns a child process running multiple specs. So technically anyone can create his own runner that manages sessions differently.

That is a very interesting proposition. If we write a session manager to be used with a custom runner, then we would still be plugged in to the wdio eco-system, which should make migration easier and allow us to continue using other wdio plugins.

I know our framework is written with the assumption of one spec file, one process, one session and it's not inconceivable that parts of wdio, outside of the runner, are written with the same assumption. @christian-bromann do you reckon that the rest of wdio is independent enough of the runner to accommodate the swap?

I once modified the execution of cucumber to run all scenarios in parallel in the same process, taking advantage of the async nature of Node.js. It worked very well and only a few scenarios (cucumber's tests for itself), primarily related to timeouts, broke. From a performance point of view, it had no problem beating the built in parallelisation that is using child processes. The child process reached similar levels of performance if you chose the right number of children, which if I remember correctly was the number of cores minus one (leaving one core for the parent). If you had more children, you were wasting time on context switching and if you had fewer, you weren't utilising all the cores. The reason it was struggling to beat the single process model is that when all child processes were waiting for IO, there was no process actively doing any work, whereas the single process model would just move on to work on the next scenario until all of them were waiting for IO. In our case, the limiting factor is the number of devices available to us, so the execution model wouldn't have much impact on performance. My argument is just that it is a very performant model for IO limited testing.

Anyway, the reason I'm telling you this is because I've always been intrigued by the idea of leveraging Node.js' native execution model to parallelise test execution. It is a very good execution model for anything primarily limited by IO, which is certainly the case for mobile testing using Appium and probably any testing that uses webdriverio. Also, by using a single process execution model we get around the following problem:

After some investigation I realized that keeping session or any shared object across multiple child processes is something not easy to do.

So, what obstacles do you see in implementing a single process runner with custom session management? Also, how long do you think it would realistically take? (I'd just like to get an idea of how big a project you think it would be.) Is implementing the interface of the runner enough or would we also need to modify the cucumber plugin and maybe even cucumber itself?

christian-bromann commented 5 years ago

do you reckon that the rest of wdio is independent enough of the runner to accommodate the swap?

Yes. The runner is responsible for propagating commands from the launcher to the @wdio/runner package. The local runner currently creates a worker process for every command but you can collect multiple commands and execute them e.g. in one session. That said, the current implementation still assumes one worker per spec per capability. I am open to move implementation details from the cli to the runner to make runners more independent.

So, what obstacles do you see in implementing a single process runner with custom session management?

I can't tell that because you haven't addressed my comments above which essentially question the use case.

Also, how long do you think it would realistically take?

Depends on the answer to the question above. Generally I am hesitant to any changes that involve session management. In 95% of all use cases the 1 process per spec per capability works and provides enough ability to run tests concurrently as well as offers encapsulation. For the other 5% use case I would advocate to ensure that WebdriverIO is extendable in a sense that it allows to create custom runners/services/reporters so that it fits for their special case.

codiophile commented 5 years ago

In 95% of all use cases the 1 process per spec per capability works and provides enough ability to run tests concurrently as well as offers encapsulation. For the other 5% use case I would advocate to ensure that WebdriverIO is extendable in a sense that it allows to create custom runners/services/reporters so that it fits for their special case.

I get that. The execution model is good for 95% of the use cases. The problem is that the 5% of the use cases take 95% of the time to implement and often end up not being implemented at all and instead part of a manual test pack. :)

My view of WebdriverIO and its execution model might be a bit skewed. I'm the technical lead for automated testing for a range of mobile apps that are worked on by multiple teams in different locations. I'm not bothered by the 95%, because the teams are able to automate those scenarios without my support. It's the 5% that ends up on my backlog, because they can't figure out how to automate it by themselves. I also work on the performance of the overall execution. We have a limited set of devices and limited time, so we need to make sure it's utilised as well as possible. Finally, reliability of the execution is also something that falls under my responsibility. Unfortunately, our device provider doesn't always provide reliable devices. I need to find ways to work around this to report a pass percentage that reflects the quality of the app and not the quality of the test devices.

With the kind of work I do, the one spec file, one session, one process model often gets in my way. I will share three examples with you, including the one I'm currently working on. I'll share them in chronological order, rather than in order of importance. :)

When I joined the organisation, the automated testing was performing very poorly in every way imaginable. The test version of the app had a settings screen before launching the actual app, where you could choose things like which environment you are running on and which features you want enabled. It had a long list of environments and it was tailored towards ease of use for manual testers. The problem was that automation was also using this screen, scrolling until it found the desired environment and doing multiple taps until the correct options were selected.

I wanted to use launch arguments to configure the app, completely without UI interaction. Instead of spending minutes configuring the app, we would be spending milliseconds. Because launch arguments are set in desired capabilities before session creation, we couldn't configure the app differently between different scenarios in the same feature file and reorganising all of our feature files to accommodate for this would not only be a lot of effort, but would also create a structural mess, since the organisation would have to be based on configuration rather than feature.

We ended up adding an input field on the settings screen, so that the automation can pass all the options as text in one field and tap a button to start the app. Now we were spending seconds instead of minutes on that screen, but with a different execution model with control over session management, we could have completely eliminated the time spent on configuring the app.

The second example is one that you might remember from a previous merge request I made. When our device provider failed to provide a session or provided a session with a bad device, all scenarios in that feature file would either not be executed or all fail. This is still largely the case, but I introduced a hook that allows us to retry session creation when it fails, which has greatly mitigated this problem. Because of this, the number of scenarios executed used to fluctuate quite a lot and we had a separate report to report execution percentage. We still have it, but because things are more stable now, people rarely look at it. With custom session management we could reject a bad session within a scenario and ask for a new one, which would allow that scenario to pass. Sometimes a device stops working mid-execution, so it's not just at session creation things can go wrong.

The third scenario is the one that we are currently working on, although it's on hold while I'm investigating a more urgent issue. You can open our app from a third party app to request access to user data. So we need to install two apps, our app and the third party app. We need to start in the third party app, make it open our app and test that our app responds correctly to the request. In fairness, we might be able to get this to work with the current execution model using the suggested reloadSession method, but with decoupled session management, this would be more natural to implement. Starting and stopping sessions within a scenario would be a standard capability, rather than a hack.

I hope this answers your question. Here are some direct answers to direct questions:

Not sure what kind of timeout you are talking about. As long as you send commands over the wire the session can go endlessly. This is at least how it is on desktop but I doubt a device shuts down when running long automation tests.

We had one provider that we are not currently using that was closing every session after 15 minutes. This forced us to split our feature files into chunks that were executable within 15 minutes. This is an arbitrary timeout, but so is the one session per spec file model. With custom session management we could have kept our feature file any size we wanted and just renewed the session when appropriate.

When are new sessions needed?

When a scenario requests a session with different desired capabilities. For example, because we are injecting configuration with launch arguments.

Have you considered to run two wdio calls? One that is only using the app with one app and another one that is running a multiremote test to use both apps.

I don't think that would work, since Appium only allows one active session per device. We can't have one session for each app and switch between them. We will probably have to install our app with one session, close it, then create another session to install the third party app and from that session we should be able to perform the rest of the test. Unfortunately, we haven't had time to test this further, because of another more important issue.

There is no reason to shoehorn different automation scenarios into one test run. You always can create multiple configs that run different test files based on their requirements.

Our automation framework already creates multiple wdio test runs and aggregates the results. However, each test run is still limited by the one spec file, one session model. We could write tests entirely outside of our automation framework, but we would still have to incorporate the results into our test reports, which are currently generated by the automation framework.

I am happy to consider any sort of change to the framework if it makes sense to the majority of users but refuse to make any complex changes to make it fit for a single person.

We are running thousands of tests, if not tens of thousands, every day using WebdriverIO with its current execution model. I'm not trying to argue that it isn't working for the 95%, because clearly it is. I just want the 5% to be easier to manage and implement. If it gets easy enough that they are able to deal with them directly in their teams, I would effectively be redundant, which is the ultimate goal of any automator. ;)

@christian-bromann If I missed a question or if there are any more questions, please let me know. :)

christian-bromann commented 5 years ago

If I missed a question or if there are any more questions, please let me know.

Would it help you if you could group specs as follows:

// wdio.conf.js
export const config = {
    // ...
    specs: [
        './tests/e2e/a.test.js',
        './tests/e2e/b.test.js',
        './tests/e2e/c.test.js',
        './tests/e2e/d.test.js',
        './tests/e2e/e.test.js'
    ]
    // ...
}

While the above would create 5 sessions for 1 capability, 10 sessions for 2 capabilities and so one, the following:

// wdio.conf.js
export const config = {
    // ...
    specs: [
        [
            './tests/e2e/a.test.js', // \
            './tests/e2e/b.test.js', //  } - this is run in one session
            './tests/e2e/c.test.js'  // /
        ],
        './tests/e2e/d.test.js',
        './tests/e2e/e.test.js'
    ]
    // ...
}

would create only 3 sessions for 1 capability, 6 sessions for 2 etc. I think it allows to control which specs are being run in one session without adding much complexity.

mgrybyk commented 5 years ago

The question was regarding Cucumber as far as I've understood

christian-bromann commented 5 years ago

The question was regarding Cucumber as far as I've understood

Right, the same would work for Cucumber feature files. Managing sessions within a feature file seems unnecessary for me.

tejasv02 commented 5 years ago

@codiophile you can change browser.options.requestedCapabilities. You need to be careful there because requestedCapabilities has a special format. Then you can reloadSession

Thanks @mgrybyk - it works fine.

codiophile commented 5 years ago

@christian-bromann We are now in a situation where we need two sessions in one scenario. You can't split a scenario between spec files. There is no other solution than to manage sessions within a feature file. The minimum session management we can get away with is to stop one session and start another one, while leaving the rest of the session management as it is.

Generally, my opinion is that scenarios should be grouped by feature and the folder structure should be organised in a way that is logical to the user of the app. Gherkin is meant to be an executable specification and should reflect the functionality of the app, rather than execution details of the test framework. I think this is generally the way to organise your tests, whether using Gherkin or not.

Our feature files and folder structure are already littered with execution details. For instance, prefixing feature files to enforce a certain execution order is easier than ensuring that all tests are independent and people take shortcuts. It's hard to avoid this kind of things in a big organisation, but the least we can do is try to design the framework in a way that minimises the executions influence over the Gherkin. The business analysts should be able to write the feature files and organise them in a way that makes sense to them. The engineers shouldn't have to mangle them to fit into the execution. Preferably, the engineers should be able to make them run without changing anything.

Anyway, I'd like to steer this conversation towards the idea of writing a custom runner. I had a look at the source code and while the local runner is the one spawning the child processes, the concept of workers is already there in the launcher. I suppose you could run everything in the same process without the launcher having to know that there isn't actually any workers.

A bigger problem is that the global browser, driver, $ and $$ objects would have to be replaced with proxies that would somehow be able to redirect any calls to the correct session. I don't know exactly how that would be accomplished. Worst case, we will just have a global sessionManager and the tests would have to get their browser objects from there.

It seems like the smallest executable unit accepted by the framework plugins is a spec file. To run every test or scenario of that spec file in parallel in the same process, the chosen framework would have to support it. Even if we are only executing spec files in parallel in the same process, it will only work as long as the chosen framework is able to run multiple instances in the same process, which means that it cannot rely on globals to keep track of the execution.

To reduce the amount of work required to get a runner with custom session management, you could probably get away with keeping the multi process architecture and simply not reuse sessions between processes. A problem with having multiple session managers is that they might end up fighting over the same resources. One session manager might be holding on to a session with hope of reusing it in the future, while another session manager fails to create a session, because the device needed to create the session is used by another session manager's idle session. Worst case scenario, this will create a deadlock, because they could be waiting for each other.

To get around this problem we could allow only one session per session manager. This still gives us all of the flexibility, but none of the optimisation. Implementing this would, in addition to implementing the session manager, only require a small change in the runner, where instead of creating the session in the runner, only the session manager will be created. We could even have the session manager be a proxy for the browser and the other globals, since it will only have one active session at any given time. This would make it completely transparent to the end user, as the only difference would be the timings of when the hooks are executed. As this could be significant foe some, we could hide this new behaviour behind a configuration option. The session manager could actually create a session at the same time as the runner would have created the session before, which would keep everything the same for the end user, while giving her all the functionality of the session manager.

I think the cleanest way to include this would be to add the package @wdio/session-manager. The runner would look for the presence of the session manager and use it if it exists or use the old behaviour if it doesn't. It could be an option you would have to enable in the config, but my preference would be to keep it consistent with how @wdio/sync is enabled.

This would effectively decouple the session management from the test execution, as you would be able to run a test without actually having a session. Also session failures would result in test failures rather than in tests not being executed. It would solve all of the problems mentioned in my previous comment and performance would stay the same, since the execution model didn't change. We could still change the execution model in the future, to optimise the session management and the performance, but since that is a much bigger task, I would put that on ice for now. We could potentially pass sessions between session managers through IPC to optimise the session management, but I'd rather stay away from that rabbit hole for now.

I might go ahead and implement this with or without your agreement, but do you see any design flaws in my proposed solution that might keep it from getting merged? We are running a custom version of WebdriverIO 4, to be able to retry session creation. Having a session manager that supports retries seems like the best way to port that to WebdriverIO 5. If it doesn't get merged, I'd only have to maintain the session manager and a custom runner, rather than a fork of the whole thing, so I'm likely to give this a go regardless. However, I'd very much like to implement this in a way that allows it to get merged.

mgrybyk commented 5 years ago

we need two sessions in one scenario.

@codiophile so use multiremote

codiophile commented 5 years ago

we need two sessions in one scenario.

@codiophile so use multiremote

You already answered that one:

You won't be able to have two active sessions to same mobile device, I think it's Appium limitation.

However Appium allows to run/open not just apps that are specified in capabilities.

@mgrybyk do you have any feedback on my proposed solution? Do you see any reason why it wouldn't work or why it wouldn't get merged?

mgrybyk commented 5 years ago

I think that multiremote do just what you want. Can you check it out?

christian-bromann commented 5 years ago

One session manager might be holding on to a session with hope of reusing it in the future

This is very bad if you run on a cloud provider and you keep a session alive in "hope" to reuse it and you end up wasting your minutes.

it will only work as long as the chosen framework is able to run multiple instances in the same process

I think this will be really tricky. One of the reason to run one spec file per process is to allow run tests concurrently in the first place. Frameworks like Mocha or Cucumber don't come with a concept of parallelization.

but do you see any design flaws in my proposed solution

IMO this session manager makes the test process a lot more complex and potentially fragile. Note my comment above that test frameworks don't come with a concept of parallelization meaning that you can't run two mocha test suites programmatically in one process. Due to package caching and globals in these frameworks you will run into problems sooner than later.

I might go ahead and implement this with or without your agreement

That is fine. I am currently not convinced that it is a good idea to built this into the core. I am happy to support you by making individual parts of WebdriverIO more plug-able (e.g. remove concept of worker in the launcher so that it is more generic). I am looking forward to try your implementation and maybe it will convince me to bring it into the project as a core thing. However there is nothing wrong with making this framework extendable in a sense that it supports a session management concept as you proposed it here.

I will close this issue but I am happy to continue the conversation on this nonetheless.

webdriverio / webdriverio

Decouple session management from test execution #4523