These collisions happen because caching can run in multiple threads at the same time. Without a total redesign, can we make this more robust?
Here is a symptom:
[2019-10-03 06:37:28 +0000] [2771] [ERROR] Socket error processing request.
Traceback (most recent call last):
  File "/home/cacher/query.py", line 111, in query
    result = fetch_data(a, local_cache_item_dir)
  File "</usr/local/lib/python3.6/site-packages/decorator.py:decorator-gen-4>", line 2, in fetch_data
  File "/usr/local/lib/python3.6/site-packages/retry/api.py", line 74, in retry_decorator
    logger)
  File "/usr/local/lib/python3.6/site-packages/retry/api.py", line 33, in __retry_internal
    return f()
  File "/home/cacher/query.py", line 70, in fetch_data
    raise CacheCopyError(f'Failed to copy file {url} to {final_location}.')
query.CacheCopyError: Failed to copy file root://charm.epe.phys.washington.edu:30001//7f1a58ea7b05166884cb54f8907b760a/ANALYSIS_001.root to /cache/7f1a58ea7b05166884cb54f8907b760a/ANALYSIS_001.root.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/gunicorn/workers/sync.py", line 135, in handle
    self.handle_request(listener, req, client, addr)
  File "/usr/local/lib/python3.6/site-packages/gunicorn/workers/sync.py", line 191, in handle_request
    six.reraise(*sys.exc_info())
  File "/usr/local/lib/python3.6/site-packages/gunicorn/six.py", line 625, in reraise
    raise value
  File "/usr/local/lib/python3.6/site-packages/gunicorn/workers/sync.py", line 176, in handle_request
    respiter = self.wsgi(environ, resp.start_response)
  File "/usr/local/lib/python3.6/site-packages/hug/api.py", line 497, in api_auto_instantiate
    return module.__hug_wsgi__(*args, **kwargs)
  File "falcon/api.py", line 274, in falcon.api.API.__call__
  File "falcon/api.py", line 269, in falcon.api.API.__call__
  File "/usr/local/lib/python3.6/site-packages/hug/interface.py", line 930, in __call__
    raise exception
  File "/usr/local/lib/python3.6/site-packages/hug/interface.py", line 901, in __call__
    self.call_function(input_parameters), context, request, response, **kwargs
  File "/usr/local/lib/python3.6/site-packages/hug/interface.py", line 823, in call_function
    return self.interface(**parameters)
  File "/usr/local/lib/python3.6/site-packages/hug/interface.py", line 123, in __call__
    return __hug_internal_self._function(*args, **kwargs)
  File "/home/cacher/query.py", line 114, in query
    shutil.rmtree(local_cache_item_dir)
  File "/usr/lib64/python3.6/shutil.py", line 490, in rmtree
    onerror(os.rmdir, path, sys.exc_info())
  File "/usr/lib64/python3.6/shutil.py", line 488, in rmtree
    os.rmdir(path)
OSError: [Errno 39] Directory not empty: '/cache/7f1a58ea7b05166884cb54f8907b760a'
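
One possible direction that stops short of a redesign, sketched under assumptions: the traceback suggests that a failed copy triggers shutil.rmtree on a cache directory that another request is still writing into. If each request instead copied into its own scratch directory and only renamed the finished file into place, a failure would clean up nothing but that request's scratch space. The helper name cache_atomically and the download callback below are hypothetical and are not taken from query.py; the callback stands in for whatever copy fetch_data performs.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def cache_atomically(final_location: Path, download: Callable[[Path], None]) -> Path:
    """Populate final_location without ever exposing a partial file.

    `download` is a hypothetical callback that writes the payload to the
    path it is given.
    """
    final_location = Path(final_location)
    if final_location.exists():
        # Another request already finished this file; reuse it.
        return final_location

    final_location.parent.mkdir(parents=True, exist_ok=True)
    # Private scratch directory next to the target, so the rename below
    # stays on one filesystem.
    scratch = Path(tempfile.mkdtemp(prefix=".partial-", dir=final_location.parent))
    try:
        partial = scratch / final_location.name
        download(partial)
        # os.replace swaps the file in as a single rename, so readers see
        # either no file or the complete one.
        os.replace(partial, final_location)
    finally:
        # Only this request's scratch space is removed, never the shared
        # cache directory.
        shutil.rmtree(scratch, ignore_errors=True)
    return final_location
```

With something like this, the error path in query would no longer need to remove local_cache_item_dir at all. Two requests racing for the same file may both download it, but each rename installs a complete copy, so the cost is duplicated work rather than a corrupted or half-deleted cache entry. Because the coordination happens through the filesystem, the same approach should hold whether the concurrent callers are threads or separate worker processes.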