buildstream-migration / bst-staging

GNU Lesser General Public License v2.1

Hanging tests due to exceptions in artifact cache #811

Open Cynical-Optimist opened 4 years ago

Cynical-Optimist commented 4 years ago

See original issue on GitLab. In GitLab by [Gitlab user @jmacarthur] on Dec 7, 2018, 10:18

Summary

This is likely to affect only developers, but it is very annoying. Exceptions in the artifact cache cause the BuildStream tests to lock up. The tests cannot be interrupted with ^C and need a kill -9 from another terminal to stop them.

Steps to reproduce

Simulate a missing file in CASCache._get_subdir, like this:

diff --git a/buildstream/_artifactcache/cascache.py b/buildstream/_artifactcache/cascache.py
index 9ca757d4..29cb84a0 100644
--- a/buildstream/_artifactcache/cascache.py
+++ b/buildstream/_artifactcache/cascache.py
@@ -797,6 +797,7 @@ class CASCache():

     def _get_subdir(self, tree, subdir):
         head, name = os.path.split(subdir)
+        raise CASError("Subdirectory {} not found".format(name))
         if head:
             tree = self._get_subdir(tree, head)

(The same exception is normally raised at the end of this function, and it can genuinely occur if a badly constructed artifact is placed in the cache.)
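For readers unfamiliar with the function, here is a minimal, self-contained sketch of the recursion the diff touches. It is not the real CASCache code: the tree is modelled as nested dicts, and `get_subdir` and the local `CASError` class are stand-ins written for illustration only.

```python
import os


class CASError(Exception):
    """Stand-in for BuildStream's CASError (assumption, for illustration)."""


def get_subdir(tree, subdir):
    """Resolve a relative path like "usr/bin" inside a nested-dict tree."""
    head, name = os.path.split(subdir)
    if head:
        # Resolve the parent directories first, then look up the last part.
        tree = get_subdir(tree, head)

    child = tree.get(name)
    if isinstance(child, dict):
        return child

    # Reached only when the requested subdirectory is missing; the diff
    # above forces this path by raising unconditionally at the top.
    raise CASError("Subdirectory {} not found".format(name))


tree = {"usr": {"bin": {}}}
found = get_subdir(tree, "usr/bin")

try:
    get_subdir(tree, "usr/lib")
except CASError as e:
    missing_message = str(e)
```

In this sketch the lookup of "usr/lib" fails with "Subdirectory lib not found", which is the kind of error the injected `raise` simulates.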

Now run the test:

./setup.py test --addopts "tests/artifactcache/pull.py::test_pull --integration -s"

The test will hang at 'pull'.

What is the current bug behavior?

Tests lock up and can only be cleared with SIGKILL.

What is the expected correct behavior?

Details of the exception being raised are visible to the tester.

Possible fixes

While we can't simply remove it, ExitStack in tests/testutils/runcli.py is likely to be relevant. In my case, I tracked down the underlying exception by temporarily removing ExitStack from Cli.run. This allows the exception text and backtrace to appear in the test output.
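As a generic illustration of how an ExitStack can hide an exception, here is a minimal sketch. It is not taken from runcli.py, and whether anything like this happens in Cli.run is an assumption that has not been verified; `Swallow`, `run_with_stack` and `run_without_stack` are hypothetical names.

```python
from contextlib import ExitStack


class Swallow:
    """Hypothetical context manager whose __exit__ suppresses exceptions."""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Returning True tells Python the exception has been handled.
        return True


def run_with_stack():
    with ExitStack() as stack:
        stack.enter_context(Swallow())
        raise RuntimeError("Subdirectory not found")
    # Reached because ExitStack propagates the suppression from Swallow.
    return "exception hidden"


def run_without_stack():
    raise RuntimeError("Subdirectory not found")


hidden = run_with_stack()

try:
    run_without_stack()
    visible = None
except RuntimeError as e:
    visible = str(e)
```

If any context manager registered on the stack returns a true value from `__exit__`, the caller never sees the exception. That would be consistent with the traceback only appearing once ExitStack is removed, but it is only one possible explanation.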


Cynical-Optimist commented 4 years ago

In GitLab by [Gitlab user @tristanvb] on Dec 16, 2018, 10:27

Note my recent comment about unhandled exceptions occurring in event callbacks: https://gitlab.com/BuildStream/buildstream/issues/420#note_125308148

Is this related? Is it possible that we are calling into the artifact cache code outside of the try/except blocks after a job completes, leading to an indefinite hang?
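A minimal sketch of the failure mode asked about here, using plain asyncio rather than BuildStream's scheduler. All names (`CacheError`, `lookup`, `unguarded_callback`, `guarded_callback`, `wait_for_job`) are hypothetical, and the timeout exists only so the example terminates instead of hanging.

```python
import asyncio


class CacheError(Exception):
    """Stand-in for an artifact cache error (assumption)."""


def lookup():
    raise CacheError("Subdirectory not found")


def unguarded_callback(fut):
    # The exception escapes the callback. The event loop only logs it,
    # so the future is never resolved and the waiter blocks.
    lookup()
    fut.set_result("ok")


def guarded_callback(fut):
    # Catching the error and forwarding it lets the waiter see it.
    try:
        lookup()
    except CacheError as e:
        fut.set_exception(e)
    else:
        fut.set_result("ok")


async def wait_for_job(callback, timeout=0.2):
    loop = asyncio.get_running_loop()
    fut = loop.create_future()
    loop.call_soon(callback, fut)
    try:
        return await asyncio.wait_for(fut, timeout)
    except asyncio.TimeoutError:
        # Without the timeout this await would never return.
        return "hung"
    except CacheError as e:
        return "error: {}".format(e)


unguarded_result = asyncio.run(wait_for_job(unguarded_callback))
guarded_result = asyncio.run(wait_for_job(guarded_callback))
```

In the unguarded case the only trace of the problem is a log message from the event loop, and the waiter would block forever without the timeout. In the guarded case the error reaches the caller.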