Examples of CI and/or unit-testing with the Orchestrator

m8pple commented 3 years ago

I've been trying to work out how to test/automate interactions between applications and the orchestrator, for a few purposes:

Trying to test whether the Orchestrator can process the v3 and v4 XML in the benchmarks set (in case we re-generate them).
Comparing performance between different POETS implementations (e.g. Orchestrator vs POLite vs ImPOLite)
Adding unit tests to existing+new applications so that they will work with the Orchestrator
Getting DPD working in the Orchestrator

However, I've got stuck on how to actually automate this, and check whether things work or don't work for hundreds of applications. I think you're using a CI-server so this must be a solved problem, but I'm afraid I wasn't able to find where you do this in the various repositories. It's possible this is all off in some CI server config, which isn't in github?

The only testing stuff I was able to find was the placement tests in Orchestrator/Testing, but that is all related to placement stuff. In the heat_plate example I can see information related to how the graph works, and in the 1.0.0-alpha branch I can see that it generates a batch script. But I can't find anything about how you actually start it, check that it has compiled and loaded, run it, and then wait for it to finish.

It would also be useful to see your repository of input XML used for testing, so that I can see examples of Supervisors, as they are currently unsupported by the simulation and analysis tools. At some point it would be nice to update the tools so that they can work with supervisors, e.g. for model checking and so on, but that needs examples to test.

So what would be useful when the 1.0.0-alpha release comes out would be:

Provide some "starter" examples for testing in the orchestrator, e.g.: a. Publish existing unit/integration testing code in the POETSII org as a repo; or b. Extract 2 or 3 examples of unit-testing and add them to the Orchestrator repo; or c. Extend the batch processing part of the documentation to describe how to do it.
Some suggestions of best practices for handling validation and execution of applications, e.g.: a. How to check that applications compile and link on a local server, as part of a local set of unit tests. b. Suggestions on how to do hardware-in-the-loop testing, particularly using supervisors (as I have no idea how to do this).
Link to the set of XML being used for testing of the orchestrator. I'm guessing there is a set-union of XML from various sources somewhere, but I couldn't find it. Probably it is not in git due to size - I think the benchmarks we sent you were actually in Box for that reason.

heliosfa commented 3 years ago

I think you're using a CI-server so this must be a solved problem

We don't currently have a CI server running - it is one of the many things on the list.

Automated Orchestrator exit is not something that we have implemented yet either, so full automated testing of lots of XMLs in succession is awkward.

Provide some "starter" examples for testing in the orchestrator,

The Orchestrator_examples repository includes a few test XMLs for testing basic functionality with the Mothership and Orchestrator in general. They (should) all write an output file with their exit code.
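
As a rough illustration of how a test script might consume those output files, here is a minimal Python sketch. It assumes each file holds a single integer exit code on its first line and that zero means success; the file names and that layout are assumptions for illustration, not something the examples repo is confirmed to guarantee.

```python
from pathlib import Path


def read_exit_code(path):
    """Return the integer exit code stored in `path`, or None if the
    file is missing, empty, or does not start with an integer.
    (Assumed layout: the first whitespace-separated token is the code.)"""
    p = Path(path)
    if not p.is_file():
        return None
    tokens = p.read_text().split()
    if not tokens:
        return None
    try:
        return int(tokens[0])
    except ValueError:
        return None


def summarise(paths):
    """Map each output file to 'pass', 'fail', or 'missing'."""
    results = {}
    for path in paths:
        code = read_exit_code(path)
        if code is None:
            results[str(path)] = "missing"
        elif code == 0:
            results[str(path)] = "pass"
        else:
            results[str(path)] = "fail"
    return results
```

Treating a missing or unparseable file as its own category, rather than as a failure, keeps "the app never finished" distinguishable from "the app finished with an error".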

We have considered adding the examples repo as a submodule of the Orchestrator repo.

a. How to check that applications compile and link on a local server, as part of a local set of unit tests.

Everything up to and including the compose /app step can be done without attached POETS hardware, as long as the Orchestrator dependencies are available.

It doesn't even have to be the versions in the dependencies tar - the Orchestrator build works with the MPI bundled with Ubuntu and I have removed the Supervisor shared Object's reliance on MPI (so it can be compiled with GCC).

b. Suggestions on how to do hardware-in-the-loop testing, particularly using supervisors (as I have no idea how to do this).

Hardware-in-the-loop testing is obviously rather opaque, which is why MPIB is a "thing". Integrating this more fully into the Orchestrator is on the cards: at the moment it is a manual affair, but it does allow you to run a (small) app locally with a Supervisor.

Link to the set of XML being used for testing of the orchestrator. I'm guessing there is a set-union of XML from various sources somewhere

The "test" XMLs we have are in the Orchestrator_examples repo. For larger XMLs for testing (e.g. heated plate), I have gone down the route of providing some small pre-built XMLs and a generator to save on space.

m8pple commented 3 years ago

I think you're using a CI-server so this must be a solved problem

We don't currently have a CI server running - it is one of the many things on the list.

Ah, ok. Sorry, I thought Mark had set that up. I remembered him talking about CI and saw various .circleci files around, so thought there was background magic.

Automated Orchestrator exit is not something that we have implemented yet either, so full automated testing of lots of XMLs in succession is awkward.

Ok. I wondered if there was some kind of undocumented command in there.

Everything up to and including the compose /app step can be done without attached POETS hardware, as long as the Orchestrator dependencies are available.

It doesn't even have to be the versions in the dependencies tar - the Orchestrator build works with the MPI bundled with Ubuntu and I have removed the Supervisor shared Object's reliance on MPI (so it can be compiled with GCC).

Yes, that's worked well - I've been able to compile locally quite successfully, which has been useful while visiting the pub with no wifi. I just haven't been able to work out whether they succeeded within a script. Maybe some combination of timeout and expect will do the job.

b. Suggestions on how to do hardware-in-the-loop testing, particularly using supervisors (as I have no idea how to do this).

Hardware-in-the-loop testing is obviously rather opaque, which is why MPIB is a "thing". Integrating this more fully into the Orchestrator is on the cards: at the moment it is a manual affair, but it does allow you to run a (small) app locally with a Supervisor.

Yup, Shane had quite a time getting hardware-in-the-loop overnight testing working with Tinsel 0.1 - I never quite understood how it worked though, so it stopped working when he left.

Link to the set of XML being used for testing of the orchestrator. I'm guessing there is a set-union of XML from various sources somewhere

The "test" XMLs we have are in the Orchestrator_examples repo. For larger XMLs for testing (e.g. heated plate), I have gone down the route of providing some small pre-built XMLs and a generator to save on space.

There is a set of XMLs of various shapes and sizes from the v4 discussion which could be used:

https://github.com/POETSII/poets_improvement_proposals/tree/master/proposed/PIP-0020/xml : The UoS ones are a mix of v3 and v4, but the IC ones are all v4.
https://imperialcollegelondon.app.box.com/v/poets-benchmarks-2019-09-06 : These are the set of pre-generated v3+v4 benchmarks created just after v4.

In the last couple of years more examples have become available. None of them have supervisor devices though.

m8pple commented 3 years ago

With the towards-ci tag it seems there is velocity here.

Reading back my comments: apologies if it seems I was berating you for the lack of CI. I just genuinely thought that you had CI set up, and was looking for pointers on how to do it, as we always found it difficult.

Assuming the towards-ci tag is basically trying to resolve this, I'm happy for this to be closed (by anyone), unless it is useful to keep it open.

heliosfa commented 3 years ago

Discussion points to enable automated exiting after application runs have completed:

We need a "wait" that waits for the current application stop flags to be set.
We need to intercept an "exit" in the batch file, set a flag and call CMExit once the batch is finished.
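
The two points above could be modelled roughly as follows. This is only a Python sketch of the control flow, not Orchestrator code: `BatchRunner`, the `stop_flags` dictionary, the `dispatch` callback and the timing parameters are hypothetical names, and `on_exit` merely stands in for the call to CMExit.

```python
import time


class BatchRunner:
    """Toy model of a batch interpreter with a blocking 'wait' command
    and a deferred 'exit'. All names here are illustrative assumptions."""

    def __init__(self, stop_flags, on_exit, poll_s=0.01, wait_timeout_s=5.0):
        self.stop_flags = stop_flags        # app name -> True once stopped
        self.on_exit = on_exit              # stand-in for calling CMExit
        self.poll_s = poll_s
        self.wait_timeout_s = wait_timeout_s
        self.exit_requested = False

    def wait_for_stop(self):
        """Block until every application's stop flag is set.
        Returns False if the timeout expires first."""
        deadline = time.monotonic() + self.wait_timeout_s
        while not all(self.stop_flags.values()):
            if time.monotonic() > deadline:
                return False
            time.sleep(self.poll_s)
        return True

    def run(self, commands, dispatch):
        """Execute batch commands in order, deferring any 'exit'."""
        for command in commands:
            name = command.strip().lower()
            if name == "exit":
                # Intercept: remember the request, but keep processing.
                self.exit_requested = True
            elif name == "wait":
                self.wait_for_stop()
            else:
                dispatch(command)
        # Only exit once the whole batch has been processed.
        if self.exit_requested:
            self.on_exit()
```

In this model an "exit" placed anywhere in the batch takes effect only after the last command, and "wait" gives up after a timeout so that a hung application cannot block the batch forever.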

m8pple commented 3 years ago

Note the suggestions in #208 and #209 as a starting point for people finding this issue who are interested in CI. Also, PR #264 has a more up-to-date version of it.

m8pple commented 2 years ago

Closing. This is just adding noise, and we have no resources to deal with it.

POETSII / Orchestrator

Examples of CI and/or unit-testing with the Orchestrator #198