Travis CI Timeout - Githubissues

Soroosh129 commented 4 years ago

There is a 50-minute timeout imposed on Travis CI, which the Python branch occasionally exceeds and regularly comes very close to. As more tests are added, we are sure to exceed this limit soon.

Any suggestions how we could fix this?

cxbrooks commented 4 years ago

The Ptolemy II Travis CI is split into a bunch of different runs. See https://wiki.eecs.berkeley.edu/ptexternal/Main/Travis#Multiple_Jobs for some notes. My understanding is that travis-ci.org is going away, so I'm going to look into moving the ptII, kepler and accessors builds over to travis-ci.com.

lhstrh commented 4 years ago

Before we resort to splitting builds, we should parallelize the tests so that the build takes less time to complete. We currently just cycle through all tests sequentially. It would be desirable to exploit multiple cores not just to stay within Travis' timeout, but also to get quicker results when run-lf-tests is run locally.
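
As a rough illustration of the idea (not how run-lf-tests works today), here is a minimal Python sketch that runs independent test commands concurrently and collects the failures. The function names `run_test` and `run_all` and the placeholder commands are made up for this example; real test invocations would replace them.

```python
import concurrent.futures
import os
import subprocess
import sys


def run_test(command):
    """Run one test command and return (command, exit code)."""
    result = subprocess.run(command, capture_output=True)
    return command, result.returncode


def run_all(commands, workers=None):
    """Run the test commands concurrently; return the commands that failed."""
    # Default to one worker per logical CPU reported by the OS.
    workers = workers or os.cpu_count() or 1
    # Threads are enough here: each one just waits on a child process.
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(run_test, commands))
    return [cmd for cmd, code in results if code != 0]


if __name__ == "__main__":
    # Placeholder commands standing in for real test invocations.
    commands = [
        [sys.executable, "-c", "print('test 1')"],
        [sys.executable, "-c", "print('test 2')"],
        [sys.executable, "-c", "import sys; sys.exit(1)"],
    ]
    failed = run_all(commands)
    print(f"{len(failed)} of {len(commands)} tests failed")
```

This only helps if the tests are independent of each other (no shared output directories or ports), which would have to be checked first.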

cxbrooks commented 4 years ago

Good point about parallelization. However, I'm not sure how many cores you get in one build.

https://travis-ci.community/t/is-there-any-way-to-get-2-physical-cores-in-a-travic-ci-environment/9758 suggests "1 core with 2 hyperthreads".

https://docs.travis-ci.com/user/reference/overview/ indicates 2 cores for all environments.
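
One way to settle this empirically would be to print the count from inside a build. A minimal Python sketch (the helper name `available_cores` is made up for illustration); note that it reports logical CPUs, so one core with two hyperthreads would still show up as 2:

```python
import os


def available_cores():
    """Return the number of logical CPUs this process may use."""
    # sched_getaffinity is only available on some platforms (e.g. Linux);
    # it respects CPU affinity limits, unlike os.cpu_count().
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


print(f"usable cores: {available_cores()}")
```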

Still, parallelizing the tests would be a good thing and would push off the need for more runs for a while.

Getting the ptII build to work on multiple runs took a while to figure out. https://github.com/icyphy/ptII/blob/master/.travis.yml and https://github.com/icyphy/ptII/blob/master/bin/ptIITravisBuild.sh would be worth a read.

lhstrh commented 4 years ago

Yeah, we’ll have to see about the number of cores we actually get... Thanks for those pointers, Christopher!

cmnrd commented 4 years ago

While we are talking about CI, I also want to ask: is there a specific reason for using Travis? We should really rethink our approach to CI because we have quite heterogeneous needs. We claim to support multiple platforms, yet tests only run on Linux (see #204). Also, it would only make sense to use various compilers and compiler versions for the C++ target (and probably also for C). For Python there are also multiple versions, and the same probably holds for TypeScript. I think @cxbrooks' suggestion to split runs up is very helpful and would provide us with more fine-grained control of our tests.

I am asking why Travis because there are many other CI solutions, and I believe we should carefully consider our requirements and make a choice based on them. One option that I have good experience with is GitHub Actions (I am using it for reactor-cpp). Its main advantage is that it integrates nicely with GitHub. Also, it is free for open source projects and, from what I have seen, easy to configure.

cxbrooks commented 4 years ago

Looking at other CI solutions is never wrong.

Travis supports multiple platforms (Linux, Mac, Windows); see https://docs.travis-ci.com/user/reference/overview/. It looks like other platforms might be available; see https://docs.travis-ci.com/user/multi-cpu-architectures.

With Ptolemy II, we found a number of bugs by running tests on multiple platforms.

It looks like Travis supports multiple Python versions under Linux, see https://docs.travis-ci.com/user/languages/python/

Under macOS or Windows, one would probably need to install Python separately and then cache it. Caches are used with the ptII build.

For different versions of C compilers, see https://docs.travis-ci.com/user/languages/c/#choosing-compilers-to-test-against

One reason I went with Travis for ptII was because Travis was free. I prefer Jenkins, see https://wiki.eecs.berkeley.edu/ptexternal/Main/Travis#FAQ_2

Why can't we use Jenkins? Jenkins has much better JUnit summary facilities, but Jenkins requires that we support a machine for building, whereas Travis is free.

At the time of the move to Travis, GitHub Actions was not available. Coming up with a list of CI requirements and doing an analysis is probably a good idea.

One thing is that managing CI is a bit of a do-ocracy. A do-ocracy is the situation where the person(s) doing the work get quite a bit of input into the decisions, whereas people doing less or none of the work have less say. I went with Travis because it was COTS, it was the most common choice at the time, it reduced our costs, and I was the one supporting CI :-)

One reason not to go with GitHub is to avoid single-vendor lock-in. Having everything in GitHub might be a risk. Version control systems come and go; remember SCCS, RCS, and CVS. Vendors also come and go: Sourceforge was the choice for a while, then they went sort of evil. The Eclipse organization is starting to support GitLab in house in addition to using GitHub externally.

https://gitlab.com/gitlab-org/gitlab/-/issues/195865:

Currently, Eclipse projects can host either at GitHub or on our in-house Git/Gerrit/Bugzilla solution. Our plans are to eventually deprecate our in-house solution in favour of GitLab.

About compiler and Python versions, if the code requires a specific version, then you are probably doing it wrong :-) Or, there should be very good reasons as to why there is version-specific code and that code should be optional and easy to exclude. Beware the magpie syndrome. Writing portable code increases the likelihood of reuse and impact. Testing on multiple versions of tools is one way to help ensure portability.

Anyway, just some thoughts. The person(s) managing CI should have lots of input ...

Soroosh129 commented 4 years ago

Thank you for all the pointers. I have taken @cxbrooks' original advice of using Travis's Build Matrix system for now, because this was a more immediate concern for the Python branch (all tests started to fail due to the timeout).

It was actually quite easy to do. You can see the new .travis.yml here.

The tests for C, TS, and Python finished in about 12 minutes and Cpp took about 25 minutes. You can see a report of that build here.

Soroosh129 commented 4 years ago

I do agree that we need a more comprehensive testing mechanism that tests on other operating systems and with different compiler versions. I can already see that the Cpp and TS targets are failing on Windows. Greater robustness would mean wider adoption.

lhstrh commented 4 years ago

Closing this because the recent changes successfully addressed the timeout issue.

lf-lang / lingua-franca

Travis CI Timeout #216