GaloisInc / csaf

Control Systems Analysis Framework - a framework to minimize the effort required to evaluate, implement, and verify controller design (classical and learning enabled) with respect to the system dynamics.
BSD 3-Clause "New" or "Revised" License

Race condition when calling System.unbind() #95

Closed podhrmic closed 3 years ago

podhrmic commented 3 years ago

In GitLab by @zutshi on Jan 12, 2021, 09:36

Removing my_system.unbind() from f16-falsify-bayesian-optimization.ipynb seems to fix the problem. The bug is hard to reproduce on local machines, but it shows up quite consistently in the CI/CD pipeline.

One of the traces is pasted below.

Running with gitlab-runner 12.7.1 (003fe500)
  on boron-docker A7n7_QPB
Using Docker executor with image galoisinc/csaf:latest ...
00:02
Authenticating with credentials from /root/.docker/config.json
Pulling docker image galoisinc/csaf:latest ...
Using docker image sha256:9e7da76d49b45366a03e0defa84624a28cde7bfcc74b4e5f7ffb364e42b0e717 for galoisinc/csaf:latest ...
Authenticating with credentials from /root/.docker/config.json
00:05
Running on runner-A7n7_QPB-project-802-concurrent-0 via boron...
Authenticating with credentials from /root/.docker/config.json
00:06
Fetching changes with git depth set to 50...
Reinitialized existing Git repository in /builds/assuredautonomy/csaf_architecture/.git/
Checking out 87380170 as subs/OFC...
Removing docs/notebooks/cansat-rejoin.py
Removing examples/cansat/codec/
Removing examples/cansat/components/__pycache__/
Removing examples/cansat/output/
Removing pub-sub-plot.png
Removing src/csaf/__pycache__/
Skipping Git submodules setup
Authenticating with credentials from /root/.docker/config.json
00:04
Authenticating with credentials from /root/.docker/config.json
00:06
Authenticating with credentials from /root/.docker/config.json
00:22
$ which ssh-agent || ( apt-get update -y && apt-get install openssh-client -y )
/usr/bin/ssh-agent
$ eval $(ssh-agent -s)
Agent pid 12
$ mkdir -p ~/.ssh
$ chmod 700 ~/.ssh
$ export PYTHONPATH=${PYTHONPATH}:${PWD}/src:${PWD}/examples/f16:${PWD}/examples/inverted-pendulum:${PWD}/examples/rejoin:${PWD}/examples/cansat:/csaf-system
$ echo ">>> Adjust example paths"
>>> Adjust example paths
$ ln -s ${PWD}/examples/f16 /csaf-system
$ echo ">>> Testing ./docs/notebooks/f16-falsify-bayesian-optimization.ipynb"
>>> Testing ./docs/notebooks/f16-falsify-bayesian-optimization.ipynb
$ jupyter nbconvert --to python ./docs/notebooks/f16-falsify-bayesian-optimization.ipynb
[NbConvertApp] Converting notebook ./docs/notebooks/f16-falsify-bayesian-optimization.ipynb to python
[NbConvertApp] Writing 2171 bytes to docs/notebooks/f16-falsify-bayesian-optimization.py
$ ipython ./docs/notebooks/f16-falsify-bayesian-optimization.py
12:50:07 AM: (INFO)  setting up CSAF System from TOML file '/csaf-system/f16_shield_config.toml'
12:50:07 AM: (INFO)  created output directory /csaf-system/output because it did not exist
12:50:07 AM: (INFO)  Output Dir: /builds/assuredautonomy/csaf_architecture/examples/f16/output
12:50:07 AM: (INFO)  Codec Dir: /builds/assuredautonomy/csaf_architecture/examples/f16/codec
12:50:07 AM: (INFO)  Log Level: info
 19%|#9        | 1797/9450 [00:09<00:39, 193.89it/s]
  0%|          | 4/9450 [00:00<00:10, 931.96it/s]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/builds/assuredautonomy/csaf_architecture/docs/notebooks/f16-falsify-bayesian-optimization.py in <module>
     77 delta = (h - l)/2
     78 attack_spaces = [attack_traj[0]-delta, attack_traj[0]+delta]
---> 79 unsafe_states = attack(EvaluatedFunction(F16SimulationFunction(0, gcas_simulation)), attack_spaces, n_calls=10)
     80 
     81 
/builds/assuredautonomy/csaf_architecture/examples/f16/falsify_bopt.py in attack(func, space, acq_func, n_calls, n_random_starts)
    102                 n_random_starts=n_random_starts,
    103                 noise=0,
--> 104                 n_jobs=1)
    105 
    106     return func.get_collector()
/usr/local/lib/python3.6/dist-packages/skopt/optimizer/gp.py in gp_minimize(func, dimensions, base_estimator, n_calls, n_random_starts, n_initial_points, initial_point_generator, acq_func, acq_optimizer, x0, y0, random_state, verbose, callback, n_points, n_restarts_optimizer, xi, kappa, noise, n_jobs, model_queue_size)
    266         n_restarts_optimizer=n_restarts_optimizer,
    267         x0=x0, y0=y0, random_state=rng, verbose=verbose,
--> 268         callback=callback, n_jobs=n_jobs, model_queue_size=model_queue_size)
/usr/local/lib/python3.6/dist-packages/skopt/optimizer/base.py in base_minimize(func, dimensions, base_estimator, n_calls, n_random_starts, n_initial_points, initial_point_generator, acq_func, acq_optimizer, x0, y0, random_state, verbose, callback, n_points, n_restarts_optimizer, xi, kappa, n_jobs, model_queue_size)
    299     for n in range(n_calls):
    300         next_x = optimizer.ask()
--> 301         next_y = func(next_x)
    302         result = optimizer.tell(next_x, next_y)
    303         result.specs = specs
/builds/assuredautonomy/csaf_architecture/examples/f16/falsify_bopt.py in __call__(self, initial_state)
     67 
     68     def __call__(self, initial_state):
---> 69         ret = self.simu_fn(initial_state)
     70         passed, reward = ret[0:2]
     71         if not passed:
/builds/assuredautonomy/csaf_architecture/examples/f16/falsify_bopt.py in __call__(self, initial_state)
     43 
     44     def __call__(self, initial_state):
---> 45         ret = self.core(initial_state)
     46         _, states, passed = ret[:3]
     47 
/builds/assuredautonomy/csaf_architecture/examples/f16/falsify_bopt.py in <lambda>(x0)
     11 class F16SimulationFunction:
     12     def __init__(self, initial_time, simulator):
---> 13         self.core = lambda x0: simulator(x0, initial_time)
     14         self.initial_time = initial_time
     15         self.last_ps = 0
/builds/assuredautonomy/csaf_architecture/docs/notebooks/f16-falsify-bayesian-optimization.py in gcas_simulation(initial_state, initial_time, tmax)
     56     trajs, passed = my_system.simulate_tspan(tspan, show_status=True,
     57                                 terminating_conditions=ground_collision_condition,
---> 58                                 return_passed=True)
     59     tlen = min(len(trajs["plant"].states), len(trajs["controller"].states))
     60     return trajs["plant"].times, np.hstack((trajs["plant"].states[:tlen], trajs["controller"].states[:tlen])), passed
/builds/assuredautonomy/csaf_architecture/src/csaf/system.py in simulate_tspan(self, tspan, show_status, terminating_conditions, return_passed)
    175             idx = self.names.index(cidx)
    176             self.components[idx].receive_input()
--> 177             out = self.components[idx].send_output()
    178             out["times"] = t
    179             if terminating_conditions is not None and terminating_conditions(cidx, out):
/builds/assuredautonomy/csaf_architecture/src/csaf/dynamics.py in send_output(self, overwrite_buffer)
    112 
    113         # get the input in vector form
--> 114         input_vector = self.input_as_vector()
    115 
    116         # obtain state vector
/builds/assuredautonomy/csaf_architecture/src/csaf/dynamics.py in input_as_vector(self)
    102         if len(self._topics_input) > 0:
    103             for f in self._topics_input:
--> 104                 input_vector += self._input_buffer[f]
    105         return input_vector
    106 
KeyError: 'autoairspeed-outputs'
Authenticating with credentials from /root/.docker/config.json
00:04
ERROR: Job failed: exit code 1
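
The last frame of the trace shows a lookup in `self._input_buffer` failing for the topic 'autoairspeed-outputs'. As a minimal sketch, assuming the buffer behaves like a plain dict keyed by topic name, the snippet below shows how the same KeyError appears when a subscribed topic has no entry at the time the input vector is built. The class `ComponentSketch`, its `receive` and `missing_topics` methods, and the topic name `plant-outputs` are hypothetical and are not part of csaf. Whether this is what unbind() actually triggers is not confirmed by the trace.

```python
class ComponentSketch:
    """Hypothetical stand-in for a component; NOT the real csaf class."""

    def __init__(self, topics_input):
        self._topics_input = list(topics_input)
        self._input_buffer = {}  # assumption: topic name -> list of floats

    def receive(self, topic, values):
        # Hypothetical helper: store the latest message for a topic.
        self._input_buffer[topic] = list(values)

    def input_as_vector(self):
        # Same loop shape as the failing frame in the traceback:
        # one dict lookup per subscribed topic, KeyError if one is absent.
        input_vector = []
        for topic in self._topics_input:
            input_vector += self._input_buffer[topic]
        return input_vector

    def missing_topics(self):
        # Hypothetical diagnostic: subscribed topics with no buffered message.
        return [t for t in self._topics_input if t not in self._input_buffer]


component = ComponentSketch(["autoairspeed-outputs", "plant-outputs"])
component.receive("plant-outputs", [1.0, 2.0])  # the other topic never arrives

print("missing:", component.missing_topics())
try:
    component.input_as_vector()
except KeyError as err:
    print("KeyError:", err)
```

A check like `missing_topics()` before building the vector would turn the bare KeyError into a message that names the component and the topic that had not arrived, which may help narrow down where the race occurs.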
podhrmic commented 3 years ago

In GitLab by @podhrmic on Jul 13, 2021, 14:01

@EthanJamesLew any updates here?

podhrmic commented 3 years ago

In GitLab by @EthanJamesLew on Jul 14, 2021, 18:24

mentioned in merge request !73

podhrmic commented 3 years ago

In GitLab by @podhrmic on Jul 15, 2021, 10:34

mentioned in commit 2363dc0967db5f62c34e89e1a461ef86e01b9ddf