Open craig-willis opened 6 years ago
The timeout error was due to an incorrect queue specified during the manual restart of celery_worker. I used worker, but it needed to be celery. Also, the actual error message was in an internal log file in the girder container (/root/.girder/logs).
Traceback (most recent call last):
  File "/girder/girder/api/rest.py", line 620, in endpointDecorator
    val = fun(self, args, kwargs)
  File "/girder/girder/api/rest.py", line 1204, in POST
    return self.handleRoute(method, path, params)
  File "/girder/girder/api/rest.py", line 947, in handleRoute
    val = handler(**kwargs)
  File "/girder/girder/api/access.py", line 63, in wrapped
    return fun(*args, **kwargs)
  File "/girder/girder/api/describe.py", line 702, in wrapped
    return fun(*args, **kwargs)
  File "/girder/plugins/wholetale/server/rest/instance.py", line 166, in createInstance
    save=True)
  File "/girder/plugins/wholetale/server/models/instance.py", line 147, in createInstance
    volume = volumeTask.get(timeout=TASK_TIMEOUT)
  File "/usr/local/lib/python3.5/dist-packages/celery/result.py", line 191, in get
    on_message=on_message,
  File "/usr/local/lib/python3.5/dist-packages/celery/backends/async.py", line 188, in wait_for_pending
    for _ in self._wait_for_pending(result, **kwargs):
  File "/usr/local/lib/python3.5/dist-packages/celery/backends/async.py", line 259, in _wait_for_pending
    raise TimeoutError('The operation timed out.')
celery.exceptions.TimeoutError: The operation timed out.
After restarting the workers with the correct queue, things are working.
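For anyone hitting this later, here is a rough stdlib-only sketch of why a worker listening on the wrong queue shows up as a timeout rather than a clear error. It does not use Celery at all: `make_broker`, `start_worker`, and `submit_and_wait` are hypothetical stand-ins for the broker, the worker process, and the `volumeTask.get(timeout=TASK_TIMEOUT)` call, and only the queue names `celery` and `worker` come from this thread.

```python
import queue
import threading


def make_broker(*queue_names):
    """Hypothetical in-memory broker: one FIFO per named queue."""
    return {name: queue.Queue() for name in queue_names}


def start_worker(broker, queue_name):
    """Start a daemon thread that consumes jobs from a single named queue."""
    def loop():
        while True:
            func, reply = broker[queue_name].get()
            reply.put(func())

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread


def submit_and_wait(broker, queue_name, func, timeout):
    """Publish a job to queue_name and wait up to `timeout` seconds for the result."""
    reply = queue.Queue(maxsize=1)
    broker[queue_name].put((func, reply))
    try:
        return reply.get(timeout=timeout)
    except queue.Empty:
        # Nobody consumed the job in time, so the caller only sees a timeout.
        raise TimeoutError("The operation timed out.") from None


if __name__ == "__main__":
    broker = make_broker("celery", "worker")
    start_worker(broker, "worker")  # worker restarted on the wrong queue
    try:
        submit_and_wait(broker, "celery", lambda: "volume created", timeout=0.5)
    except TimeoutError as exc:
        print("caller sees:", exc)

    start_worker(broker, "celery")  # worker restarted on the correct queue
    print(submit_and_wait(broker, "celery", lambda: "volume created", timeout=2))
```

The job published to the `celery` queue just sits there while the only worker is polling `worker`, so the caller cannot tell "task is slow" apart from "nobody is listening".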
Separate issue now after launching the tale:
docker service ps tmp-k8evq397tylo --no-trunc
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
gx9lt9w7jvch838qiw7hsedbv tmp-k8evq397tylo.1 registry.stage.wholetale.org/5964d96e1801c10001061e49 wt-stage-01 Ready Rejected 2 seconds ago "No such image: registry.stage.wholetale.org/5964d96e1801c10001061e49:latest"
Do we need to migrate the registry?
Migrate or trigger build for all images.
...or make the plugin do it if the image is not there, although that will significantly increase deployment time to staging each time.
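A minimal sketch of the "build only if the image is missing" option, to make the trade-off concrete. Everything here is hypothetical: `ensure_image`, `image_exists`, and `trigger_build` are illustrative names, not existing plugin or registry functions, and the real check would have to query the registry.

```python
def ensure_image(image, image_exists, trigger_build):
    """Trigger a build only when the image is missing.

    `image_exists` and `trigger_build` are caller-supplied callables
    (hypothetical hooks, not real plugin APIs).
    Returns True if a build was triggered, False if the image was already there.
    """
    if image_exists(image):
        return False
    trigger_build(image)
    return True


if __name__ == "__main__":
    # Fake registry contents, for illustration only.
    registry = set()
    name = "registry.stage.wholetale.org/5964d96e1801c10001061e49"

    print(ensure_image(name, registry.__contains__, registry.add))  # first launch builds
    print(ensure_image(name, registry.__contains__, registry.add))  # later launches skip
```

With this approach the build cost is paid on the first launch of each missing image instead of up front for all images.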
could not be launched ! TimeoutError: TimeoutError('The operation timed out.) errors. Need operational documentation with common errors/troubleshooting/resolution.
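As a possible starting point for that troubleshooting documentation, here is a small sketch that maps error signatures to suggested resolutions. The two entries are taken from this thread only; the `KNOWN_ERRORS` table and the `diagnose` helper are hypothetical and would need to grow as more failure modes are documented.

```python
# Signature -> suggested resolution. Both entries come from this issue;
# the table format itself is only an assumption about how to organize them.
KNOWN_ERRORS = [
    (
        "The operation timed out",
        "Check that celery_worker was restarted with the correct queue "
        "(celery, not worker), then look in /root/.girder/logs in the "
        "girder container for the full traceback.",
    ),
    (
        "No such image",
        "Migrate the registry or trigger a build for the missing image.",
    ),
]


def diagnose(log_text):
    """Return the resolution hints whose signature appears in log_text."""
    hints = []
    for signature, hint in KNOWN_ERRORS:
        if signature in log_text and hint not in hints:
            hints.append(hint)
    return hints


if __name__ == "__main__":
    sample = "celery.exceptions.TimeoutError: The operation timed out."
    for hint in diagnose(sample):
        print(hint)
```

Matching on the message text rather than the exception class means the same entry catches both the traceback in the girder log and the error shown when the tale fails to launch.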