Closed (schnittchen closed 7 years ago)
01 cd dummy1_app_path; for i in {1..10}; do bin/dummy1 ping && break || true; sleep 1; done
01 Node 'dummy1@127.0.0.1' not responding to pings.
01 user@localhost 1.350s
02 bin/dummy1 rpcterms Elixir.Edeliver run_command '[monitor_startup_progress, "dummy1", verbose].' | tee /dev/fd/2 | grep -e 'Started\|^ok'
02 Node is not running!
cap aborted!
(in 1) test full deploy of a release (SimpleAppTest))
I really don't know what's going on here. The last failing build, https://travis-ci.org/schnittchen/carafe/builds/233257420, shows that bin/script start never produces any output, yet the subsequent ping fails rather often.
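One thing worth noting about the ping loop in the log: because of `|| true` and the trailing `sleep 1`, the loop itself exits with status 0 even when every ping fails, so the failure only surfaces in the next step. A minimal sketch of a retry helper that reports each failed attempt and returns non-zero when all attempts fail; `retry` and `RETRY_DELAY` are hypothetical names for illustration, not something carafe or edeliver provides:

```shell
# Hypothetical helper: run a command up to N times, logging each failure.
# Usage: retry <attempts> <command> [args...]
retry() {
  attempts=$1
  shift
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Succeed as soon as the command succeeds.
    "$@" && return 0
    echo "attempt $i/$attempts failed: $*" >&2
    i=$((i + 1))
    # Wait between attempts, but not after the last one.
    [ "$i" -le "$attempts" ] && sleep "${RETRY_DELAY:-1}"
  done
  # Every attempt failed: make that visible to the caller.
  return 1
}

# Example (not executed here): retry 10 bin/dummy1 ping
```

With something like this, a node that never answers would fail the ping step directly and leave a trace of each attempt on stderr.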
It looks like killing stray processes is broken.
Seen in the wild: Protocol: ~tp: the name dummy1@127.0.0.1 seems to be in use by another Erlang node
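The "seems to be in use by another Erlang node" message suggests that a node with that name is still registered with epmd. A rough sketch for checking this before starting the release, assuming `epmd` is on the PATH and prints its usual `name <node> at port <port>` lines; `node_registered` is a hypothetical helper, not part of carafe or edeliver:

```shell
# Hypothetical helper: reads `epmd -names` output on stdin and returns 0
# if the given short node name (the part before the @) is registered.
node_registered() {
  grep -q "^name $1 at port "
}

# Example (not executed here):
#   epmd -names | node_registered dummy1 && echo "stray dummy1 node still registered" >&2
```

If the check reports a registered node before the test has started one, a previous run left a process behind.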
The problem occurs when all tests except "test full deploy of a release (SimpleAppTest)" have been skipped.
It looks like 4906c3d (https://travis-ci.org/schnittchen/carafe/builds/239129001) is reproducibly passing and 15f84e8 (https://travis-ci.org/schnittchen/carafe/builds/236719232) is reproducibly failing, even though git diff 4906c3d 15f84e8 produces no output.
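An empty diff between two commits can be double-checked by comparing their tree hashes, which are equal exactly when the committed file contents are identical. A small sketch, with `same_tree` as a hypothetical helper name:

```shell
# Hypothetical helper: returns 0 if both commits point at the same tree,
# 1 if the trees differ, 2 if either revision cannot be resolved.
same_tree() {
  a=$(git rev-parse --verify --quiet "$1^{tree}") || return 2
  b=$(git rev-parse --verify --quiet "$2^{tree}") || return 2
  [ "$a" = "$b" ]
}

# Example (not executed here): same_tree 4906c3d 15f84e8 && echo "identical trees"
```

If the trees are identical, the difference between a passing and a failing build has to come from the CI environment rather than from the code.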
And suddenly 15f84e8 is green. Now rebuilding https://travis-ci.org/schnittchen/carafe/builds/233003703 as well, will it turn green too?
Dang. Sure it did.
The problems disappeared after switching to CircleCI.
Closing for now. Since this is extremely unlikely to happen on a real-life deploy, knowing that it fails under some unknown conditions in a CI no longer being used is not of value.
We need some output here to see what's going on.