zephyrproject-rtos / twister

[DEPRECATED] This project will not be continued.
https://github.com/zephyrproject-rtos/zephyr/issues/42458#issuecomment-1655795101
Apache License 2.0

User story: Pytest + async + concurrency + mocking + queuing ++ #34

Closed: torsteingrindvik closed this issue 11 months ago

torsteingrindvik commented 1 year ago

I am currently using a custom solution to do concurrent testing of hex files built using twister. I'd be happy to use Twister v2 if there is feature parity. That means we would need:

Pytest

This is planned so no action required.

Async

We use the Python websockets library to speak to a server during tests.

This enables nice patterns such as:

async for message in backend:
  # do thing with message

Since you're using pytest I don't see why this wouldn't still work, but it's good to be aware of.
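To make the pattern above concrete without a live server, here is a minimal, self-contained sketch. The `fake_backend` async generator is a hypothetical stand-in for a websockets connection; any object supporting `async for` fits the same consumer code.

```python
import asyncio

# Hypothetical stand-in for a websockets connection: the test logic only
# relies on "async for", so an async generator can play the backend's role.
async def fake_backend():
    for line in ["boot ok", "test start", "PASS"]:
        await asyncio.sleep(0)  # yield control, as a real socket read would
        yield line

async def consume():
    seen = []
    async for message in fake_backend():
        # do thing with message
        seen.append(message)
    return seen

messages = asyncio.run(consume())
```

In a real test the generator would be replaced by the object returned from `websockets.connect(...)`, with the consuming loop unchanged.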

@pytest.mark.asyncio_cooperative
@pytest.mark.parametrize("test_case, chip", testdata)
async def test_foo(...):

This is how my test cases look under this scheme.

Concurrency

As shown above, the

@pytest.mark.asyncio_cooperative

marker allows us to run the tests concurrently. This lets a large number of test cases run on a single core (so it's not parallel, just concurrent). I see you have some plans to use xdist, which looks like it puts tests on different cores. I'm not sure whether xdist blocks when num_tests > num_cores or whether there is some interesting threading going on.

IMO, the best approach would be both: run async tests on several cores. This would allow a very large number of tests to run concurrently.
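A minimal sketch of the single-core concurrency half of this: many awaiting "test cases" in flight at once on one event loop. The `run_case` coroutine and its simulated I/O delay are illustrative only.

```python
import asyncio

async def run_case(case_id: int) -> str:
    # Each "test" awaits I/O (simulated here); while one case waits,
    # the event loop runs the others. Concurrent, not parallel.
    await asyncio.sleep(0.01)
    return f"case-{case_id}: PASS"

async def main():
    # 50 cases in flight at once on a single core.
    return await asyncio.gather(*(run_case(i) for i in range(50)))

results = asyncio.run(main())
```

Combining this with xdist would mean one such event loop per worker process, multiplying the number of in-flight cases by the worker count.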

Mocking

Our solution allows:

async for message in real_backend:
  # do thing with real message
async for message in mock_backend:
  # do thing with mock message

So the test logic can be reused with a mocked device. The mocked device is instrumented by giving it a text file containing the UART logs from a real run of the test case. The mock_backend then feeds them back line by line, just like a real device would.

This is great for rapid prototyping of test logic, and for detecting flaky devices (by also running the test case mocked, we can see that the logic works but the device does not).

Mocking is a nice feature but I'm not sure what your feature scope is.
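The replay idea can be sketched in a few lines. This is a hypothetical illustration, not the actual implementation: a captured log string stands in for the UART log file, and the same test logic consumes either a real or a mocked backend.

```python
import asyncio

async def mock_backend(captured_log: str):
    # Replays a captured UART log line by line, just as a real device
    # would emit it over the serial port.
    for line in captured_log.splitlines():
        await asyncio.sleep(0)  # simulate waiting on the serial port
        yield line

async def run_test_logic(backend) -> bool:
    # The same logic works unchanged against a real or a mocked backend,
    # since both are consumed with "async for".
    async for message in backend:
        if "PASS" in message:
            return True
    return False

captured = "booting...\ntest foo\nPASS\n"
ok = asyncio.run(run_test_logic(mock_backend(captured)))
```

If the mocked run passes while the run on hardware fails, the fault is on the device side rather than in the test logic.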

Queuing

Like twister v1, we have the concept of waiting for devices to become available. We use labels for devices.

# Wait here until we control any device matching this label, e.g. "9160"
device = await control_device(label)

When the device is under control, it should be guaranteed that no other test case will interfere with this device.
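One way to sketch such a label-based queue, assuming an asyncio setting; `DevicePool`, its device names, and the `"9160"` label are all hypothetical, and a real system would also need to handle devices behind a server.

```python
import asyncio

class DevicePool:
    """Hypothetical pool: a test waits until any device with the label is free."""

    def __init__(self, devices):
        # label -> queue of free device names
        self._free = {}
        for label, name in devices:
            self._free.setdefault(label, asyncio.Queue()).put_nowait(name)

    async def control_device(self, label: str) -> str:
        # Blocks until a device with this label is released; while held,
        # no other test case can obtain the same device.
        return await self._free[label].get()

    def release(self, label: str, name: str) -> None:
        self._free[label].put_nowait(name)

async def demo():
    pool = DevicePool([("9160", "dk-1"), ("9160", "dk-2")])
    a = await pool.control_device("9160")
    b = await pool.control_device("9160")
    # Both boards are now exclusively held; a third request would wait here
    # until one of them is released.
    pool.release("9160", a)
    c = await pool.control_device("9160")
    return a, b, c

a, b, c = asyncio.run(demo())
```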

Independent serial port control

When control over a device is gained, it may have several VCOMs (virtual COM ports) that need to be interacted with, both for reading and writing. There should be support for this.
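A sketch of what independent port handling could look like: each VCOM gets its own reader coroutine, all running concurrently. `read_vcom` and the port names are hypothetical; a real implementation might wrap serial streams instead of the in-memory line lists used here.

```python
import asyncio

async def read_vcom(name: str, lines):
    # Stand-in for reading one virtual COM port; each port is consumed
    # by its own coroutine, independent of the others.
    out = []
    for line in lines:
        await asyncio.sleep(0)  # simulate waiting on serial data
        out.append(f"{name}: {line}")
    return out

async def main():
    # One device, two VCOMs read independently and concurrently.
    return await asyncio.gather(
        read_vcom("VCOM0", ["app log A", "app log B"]),
        read_vcom("VCOM1", ["modem trace"]),
    )

vcom0, vcom1 = asyncio.run(main())
```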

Observability

This is more of a bonus, but it's useful to be able to view serial port activity from the "outside". For example, if 10-minute tests are running in Pytest, one should be able to run ./some-command --follow-activity /dev/ttyACMx.
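The core of such a feature is a fan-out of serial lines to observers. A minimal sketch, with `SerialTap` and its API entirely hypothetical: the test harness remains the primary consumer, and followers (such as a command-line viewer) merely get a copy of each line.

```python
import asyncio

class SerialTap:
    """Hypothetical fan-out: every line seen on a port is copied to each follower."""

    def __init__(self):
        self._followers = []

    def follow(self) -> asyncio.Queue:
        # What a "--follow-activity"-style viewer would subscribe to.
        q = asyncio.Queue()
        self._followers.append(q)
        return q

    def publish(self, line: str) -> None:
        # Called once per line read from the port; observers never block
        # or consume the data ahead of the test harness.
        for q in self._followers:
            q.put_nowait(line)

async def demo():
    tap = SerialTap()
    viewer = tap.follow()
    tap.publish("uart: hello")
    return await viewer.get()

line = asyncio.run(demo())
```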

Devices accessed remotely

The solution we have now uses Pytest as the "frontend", but the devices sit behind a server and are connected to via an IP address. This opens up use cases such as several test nodes pointing to a shared pool of devices.

PerMac commented 1 year ago

Thank you for your suggestions. We will address them once we have caught up a bit on porting the existing v1 functionalities and workflows.