NASA-Tensegrity-Robotics-Toolkit / NTRTsim

The NASA Tensegrity Robotics Toolkit Simulator, a physics based simulator to research the design and control of tensegrity robots.
Apache License 2.0

Automated Integration Tests #140

Open brtietz opened 9 years ago

brtietz commented 9 years ago

Recent commits broke the integration tests on master (I'm handling the fix off-list). To make those tests a little more useful, would it be possible for buildbot to run them at a regular interval? I understand that every build is too frequent, but maybe within 24 hours of a new build (or every 24 hours if that's easier)? Currently they take 30-45 seconds to run, depending on the machine.

One of the tests needs to be updated to the latest data before it passes; I'll take care of that this week.

PerryBhandal commented 9 years ago

Certainly can.

Let's add a nightly build which does a full build of NTRT source and its libraries (i.e. no cached boost/bullet) then run all unit tests and integration tests. Ideally we should have this run for both OS X and Linux.

I'll try to get a Linux-only nightly build that runs the integration tests set up by this weekend. Then, when I have more free time (and if we run into further OS-specific bugs like the ones we're currently seeing), I can add an OS X nightly build.

vsunspiral commented 9 years ago

thanks! That will be really helpful!

vytas



Vytas SunSpiral
Dynamic Tensegrity Robotics Lab cell- 510-847-4600 Office: 650-604-4363 N269 Rm. 100

Stinger Ghaffarian Technologies Intelligent Robotics Group NASA Ames Research Center

I will not tiptoe cautiously through life only to arrive safely at death.

PerryBhandal commented 9 years ago

I ended up adding the integration tests to the normal on-commit build. The current build takes a couple of minutes on its own, so 30-45 seconds more won't be too big of a deal.

If down the road the integration tests take too long to build/run (~5 minutes+) we can move them into a nightly build.

Brian, just to be sure, I'm assuming your fix for the integration tests hasn't been pushed to master yet, correct? Now that integration tests are included, builds are failing (http://ntrt.perryb.ca/bb/builders/master/builds/265). If your fixes have been pushed, let me know; that likely means there's a problem in the way I've configured BuildBot.

brtietz commented 9 years ago

Actually, that was merged in as a part of pull request #141. I'm guessing this is a buildbot error as you suggest, as it runs cleanly for me, and valgrind only returns a loss on the controller (as expected and discussed in #19): https://gist.github.com/brtietz/03aaa87ce8dbdf48499d


PerryBhandal commented 9 years ago

Oh, interesting. I'll try to get that sorted out today. Thanks Brian.

PerryBhandal commented 9 years ago

The specific error that was raised during the test is

*** Error in `./SpineTests/WorldConf_Spines_test': double free or corruption (fasttop): 0x00000000019e96b0 ***
Aborted

This, as the path implies, occurred while running integration tests. The specific test is WorldConf_Spines. The full log of that step can be found here:

http://ntrt.perryb.ca/bb/builders/master/builds/266/steps/Build%20and%20run%20tests/logs/stdio

The problem was resolved by re-compiling boost and bullet from scratch, then re-running the on-commit build so that it used the newly cached versions. I don't know what specifically caused it.

I doubt it's worth anyone's time to try and sort out what occurred and why. Nonetheless, I've put a copy of the cached bullet/boost that BuildBot was using at the link below in the event it's needed.

http://ntrt.perryb.ca/ss/3.24.15-double_free_or_corruption.tar.gz

To avoid problems of this nature in the future, we should do the following.

1. Have the nightly build cache its compiled boost and bullet if the build succeeds. That will let us handle any breaking builds with much less fuss. It will still require manual reconfiguration any time we change env's structure, but that doesn't happen very often.

2. When a developer makes a commit that modifies bullet and/or boost in a way that breaks the build (as confirmed by a failing on-commit test), they should log in to the BuildBot admin and force a nightly build. Once the forced nightly build completes, it should first back up the existing cached boost/bullet and then replace it with the new build. We'll want to keep a sliding window of the last x cache backups (five seems like a reasonable number) so that any unexpected cached boost/bullet failures can be researched if desired. The swap should be done as two separate atomic moves (cached -> backup, new -> cached) to reduce the risk that a build a dev attempts during the replacement fails unexpectedly.
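For anyone who wants to picture the cache swap, here is a rough Python sketch of the two moves plus the sliding window of backups. It is only an illustration, not the actual BuildBot configuration: the function name `rotate_cache`, the `cache-NNNNNN` backup naming, and the directory layout are all made up for this example. It also assumes the cache, the new build, and the backup directory live on the same filesystem, since a rename is only atomic in that case.

```python
import os
import shutil
from pathlib import Path
from typing import Optional


def _index(path: Path) -> int:
    """Extract the numeric suffix from a hypothetical 'cache-NNNNNN' backup name."""
    return int(path.name.rsplit("-", 1)[1])


def rotate_cache(cache: Path, new_build: Path, backup_root: Path,
                 keep: int = 5) -> Optional[Path]:
    """Swap new_build into cache, archiving the old cache first.

    Returns the path of the backup that was created, or None if there
    was no existing cache to back up.
    """
    backup_root.mkdir(parents=True, exist_ok=True)
    existing = sorted(backup_root.glob("cache-*"), key=_index)

    backup = None
    if cache.exists():
        # Move 1 (cached -> backup): a single rename on one filesystem.
        next_idx = _index(existing[-1]) + 1 if existing else 1
        backup = backup_root / f"cache-{next_idx:06d}"
        os.rename(cache, backup)
        existing.append(backup)

    # Move 2 (new -> cached): again a single rename.
    os.rename(new_build, cache)

    # Sliding window: keep only the newest `keep` backups.
    if len(existing) > keep:
        for old in existing[:len(existing) - keep]:
            shutil.rmtree(old)
    return backup
```

Between the two renames there is a brief moment where no cache directory exists, so a build that starts at exactly that point would still fail; the two-move approach narrows that window rather than closing it.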

I'll take care of this change later in the week. That should leave some time in case anyone has comments. This will require adding the nightly build as originally discussed. Even so, I may leave the integration tests in the on-commit build as well, given that they only take 30-45 seconds.

Perry

vsunspiral commented 9 years ago

thanks for figuring this out Perry and helping to maintain such an awesome build and test system!!!

vytas
