taurus-org / taurus

Moved to https://gitlab.com/taurus-org/taurus
http://taurus-scada.org

Fix travis builds #1109

Closed cpascual closed 4 years ago

cpascual commented 4 years ago

(Edited and updated)

Try various approaches to fix #1042:

cpascual commented 4 years ago

Using different project names does not seem to have any effect. But replacing the tango-cs docker images with the newer ones from SKA does seem to improve the situation.

In this pipeline we can see that not a single job shows signs of a crashed Tango DS (previously this was very common). But there are still some issues:

I've managed to reproduce both issues locally using tox, but they happen a lot less often than in Travis.

Note that the py36-qt5 issue is not related to this change of docker images. It also happened with the tango-cs images.

...some more work needed

cpascual commented 4 years ago

Regarding the timeouts, maybe we could use pytest-timeout to get some more debugging info
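For reference, pytest-timeout can be enabled per run with `pytest --timeout=<seconds>` or per test with `@pytest.mark.timeout(<seconds>)`. The kind of debugging info one hopes for from a hung job is a dump of where every thread is stuck. A minimal stdlib-only sketch of that idea follows; it uses `faulthandler` and is not pytest-timeout's own implementation, and `run_with_stack_dump` and `slow_test_body` are hypothetical names used only for illustration:

```python
import faulthandler
import tempfile
import time


def run_with_stack_dump(func, timeout_s, log_file):
    """Run func(); if it is still running after timeout_s seconds,
    dump the tracebacks of all threads to log_file (func keeps running)."""
    faulthandler.dump_traceback_later(timeout_s, exit=False, file=log_file)
    try:
        return func()
    finally:
        # Disarm the watchdog once func() has returned or raised
        faulthandler.cancel_dump_traceback_later()


def slow_test_body():
    # Hypothetical stand-in for a test that blocks, e.g. waiting on a device
    time.sleep(0.5)
    return "done"


# faulthandler writes straight to the file descriptor, so a real file is needed
with tempfile.TemporaryFile(mode="w+") as log:
    result = run_with_stack_dump(slow_test_body, 0.1, log)
    log.seek(0)
    report = log.read()

print(report)  # shows the stack of each thread at the moment of the timeout
```

Unlike this sketch, pytest-timeout also terminates the offending test, which is what would stop a hung Travis job from running until the global job timeout.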

cpascual commented 4 years ago

Hi @reszelaz, can you please review this and, if appropriate, merge it so that we can properly test the other PRs?

The things that worked were:

Still missing (but can be done out of this PR):

reszelaz commented 4 years ago

But IIRC some events are pushed in TangoSchemeTest, although I don't remember whether we use them in the tests.

I think that TangoSchemeTest should not be problematic - this one runs on the host and not in the docker container.

cpascual commented 4 years ago

I assume that the docker compose implementation (Makefile and yaml) comes from the SKA project as is, right? So, no need to review it. I have tried it locally (CI_JOB_ID=local make start tangotest) and it works as expected.

I made only some minor changes to SKA's files:

BTW I think the jive.yml is not strictly necessary?

No, but as I mentioned in 4b5c81f, it does no harm because it shares the image with tangotest, and it helps a lot for debugging locally (especially on a Debian machine, where Jive is not available as an official deb package).

Just a note: I have not managed to receive events on my host from the docker containers, i.e. I receive API_EventTimeout. And I think that the test suite runs on the host on Travis, right? I imagine this is normal because we don't open the ZMQ ports (which are unknown beforehand). Or maybe I'm missing something.

AFAIK, it is not about ports being open, but about network separation. The containers use a bridge network which is isolated from the host network. The containers can see each other fully and exchange events among themselves, but not with the host (the host may reach the containers by IP, but not by name). On top of that, we link databaseds:10000 to host:10000, so that the databaseds really serves both the host network and the container network.

Therefore, if we really need to do tests involving events, there are some alternatives:

  1. We can spawn the required DS in the host network (this is what we do with TangoSchemeTest, as you already mentioned).
  2. We can spawn both the DS and the client as containers (so that both are in the container network).
  3. We can try a different network configuration in the compose setup (maybe using a single network).

For the moment, I'd say that option 1 is enough for us
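To check from the host what is actually reachable, a plain TCP probe is usually enough. The following is only a minimal sketch: `can_connect` is a hypothetical helper, and the host names in the usage note are assumptions about a local setup, not something taken from the compose files.

```python
import socket


def can_connect(host, port, timeout_s=2.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers name-resolution failures, refused connections and timeouts
        return False
```

With the setup described above, one would expect `can_connect("localhost", 10000)` to succeed on the host (the databaseds port is linked), while probing a container by its name would typically fail to resolve. Note that a successful probe of the database port says nothing about events, since those go through other, dynamically assigned ZMQ ports.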

Also, maybe as an idea to evaluate in the future: if we wanted to base the tests only on the TangoSchemeTest DS, we could think of refactoring the tests to use tango.test_context.DeviceTestContext. Then we wouldn't need the database, the tests would be much easier for developers to run, and this would automatically open up testing on Windows.

Yes, this has been mentioned before (I even thought there was an open issue, but I could not find it). I agree that this would simplify the current test suite and allow for better unit tests. But still, one thing to note is that with docker-compose we can potentially do more complex and closer-to-production functional tests than with DeviceTestContext (and it also allows testing on Windows, since the Tango server infrastructure runs in containers).

reszelaz commented 4 years ago

I should have read the individual commit messages; this would have saved some time :)

Yes, I actually meant "expose ports" when I said "open ports", sorry if it was not clear.

Regarding the testing on Windows, I was referring to CI with Appveyor. I based this on my attempt to configure Appveyor by re-using as much as possible of our already existing Docker images. I remember that it was not possible without going for Pro accounts. But this was a year ago, so things may have changed.