Closed kevinkle closed 6 years ago
https://github.com/superphy/backend/commit/8adb0648ca715844fc881c4074912fd3ad52fa47 Looks like the wrapper causes the enqueue call in `spfy.py` to try to enqueue the return value from the database upload. Making changes.
| Job | Job ID | Status |
| -- | -- | -- |
| `<?xml version="1.0"?><data modified="11355" milliseconds="2930"/>()` from `blazegraph_uploads` | 24972297-151e-45d7-bccf-f6e33b744125 | Failed 4 hours ago |
| `<?xml version="1.0"?><data modified="1493" milliseconds="1224"/>()` from `blazegraph_uploads` | 81ba31d9-fb7c-417d-8436-885e1fcd716d | Failed 4 hours ago |
| `<?xml version="1.0"?><data modified="13915" milliseconds="3280"/>()` from `blazegraph_uploads` | e6219706-5d43-4bb4-917c-18fdc7ebe579 | Failed 18 minutes ago |
| `<?xml version="1.0"?><data modified="1363" milliseconds="788"/>()` from `blazegraph_uploads` | 4232449e-ddc9-4596-8019-d2d9dd61109f | Failed 15 minutes ago |

All four jobs failed with the same traceback:

    Traceback (most recent call last):
      File "/opt/conda/envs/backend/lib/python2.7/site-packages/rq/worker.py", line 700, in perform_job
        rv = job.perform()
      File "/opt/conda/envs/backend/lib/python2.7/site-packages/rq/job.py", line 500, in perform
        self._result = self.func(*self.args, **self.kwargs)
      File "/opt/conda/envs/backend/lib/python2.7/site-packages/rq/job.py", line 206, in func
        return import_attribute(self.func_name)
      File "/opt/conda/envs/backend/lib/python2.7/site-packages/rq/utils.py", line 150, in import_attribute
        module = importlib.import_module(module_name)
      File "/opt/conda/envs/backend/lib/python2.7/importlib/__init__.py", line 37, in import_module
        __import__(name)
    ImportError: No module named <?xml version="1
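The `ImportError` above is consistent with a string (the XML response from the upload) being enqueued where a function was expected. A minimal sketch of that failure mode, assuming the worker resolves a string job name by splitting on the last dot and importing the left-hand part as a module (this matches the `import_attribute` and `import_module` frames in the traceback, but `resolve` below is a hypothetical stand-in, not RQ's code, and it runs on Python 3 rather than the 2.7 environment shown):

```python
import importlib


def resolve(name):
    """Hypothetical stand-in for a 'module.attribute' lookup by string name."""
    # Everything before the last dot is treated as the module path.
    module_name, attribute = name.rsplit(".", 1)
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


# The upload's return value, taken from the first failed job in the table.
upload_result = '<?xml version="1.0"?><data modified="11355" milliseconds="2930"/>'

# The only dot in the string is the one in "1.0", so the "module name"
# ends right after the 1.
module_part = upload_result.rsplit(".", 1)[0]


def try_resolve(name):
    """Return the ImportError message, or None if the lookup succeeded."""
    try:
        resolve(name)
    except ImportError as exc:
        return str(exc)
    return None


error = try_resolve(upload_result)
print(module_part)
print(error)
```

If this reading is right, the error message is not truncated: `<?xml version="1` is simply the part of the XML string that precedes its last dot.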
Right now, every call of `datastruct_savvy()` calls `upload_graph()` separately; with a large number of workers, this might be causing Blazegraph to hang when running in corefacility.

The way to solve this would be to merge a few of the current queues:
1. `priority` is currently used to run Blazegraph queries for the frontend.
2. `blazegraph` is currently used to reserve spfyids for uploaded files.
3. `multiples` (for RGI) and `singles` (for ECTyper) can each invoke the `upload_graph()` function and cause simultaneous uploading of result graphs.

There are a number of permutations for this, but for now I'm going to try to just group 3. into their own queue. This is because 2. is fairly valuable, since all tasks depend on it, so we want to keep it separate. Ideally, by merging 3. and only having one worker on it, we can avoid overloading Blazegraph.

A few approaches to do this:
`datastruct_savvy()` task as the end task. (Again, still waiting on multi-job deps: https://github.com/nvie/rq/pull/856)

I'm going to go with 2., as it will be fast to develop and to test this theory; we can also use the decorators to eventually build full job classes.
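A rough sketch of the idea, not the actual spfy code: a decorator hands each task's result to a dedicated upload queue, and a single worker drains that queue so only one upload runs at a time. To keep it runnable without Redis, the standard-library `queue.Queue` and a thread stand in for an RQ queue and worker. The `uploads_result` decorator, the bodies of `datastruct_savvy()` and `upload_graph()`, and their arguments are all hypothetical.

```python
import functools
import queue
import threading

upload_queue = queue.Queue()  # stand-in for a dedicated upload queue
uploaded = []                 # graphs the single worker has "uploaded"
active = 0                    # uploads currently in flight
max_active = 0                # highest concurrency observed
lock = threading.Lock()


def upload_graph(graph):
    """Hypothetical stand-in for the real upload; records concurrency."""
    global active, max_active
    with lock:
        active += 1
        max_active = max(max_active, active)
    uploaded.append(graph)
    with lock:
        active -= 1


def uploads_result(task):
    """Decorator: queue the task's result for upload instead of uploading inline."""
    @functools.wraps(task)
    def wrapper(*args, **kwargs):
        graph = task(*args, **kwargs)
        # Queue the data to upload, never the return value of the upload itself.
        upload_queue.put(graph)
        return graph
    return wrapper


@uploads_result
def datastruct_savvy(name):
    # Hypothetical body: the real task builds a result graph.
    return "graph for " + name


def upload_worker():
    """The only consumer of upload_queue, so uploads never overlap."""
    while True:
        graph = upload_queue.get()
        if graph is None:  # sentinel: no more work
            break
        upload_graph(graph)


worker = threading.Thread(target=upload_worker)
worker.start()

# Many analysis workers finish at roughly the same time.
producers = [
    threading.Thread(target=datastruct_savvy, args=("genome%d" % i,))
    for i in range(8)
]
for p in producers:
    p.start()
for p in producers:
    p.join()

upload_queue.put(None)
worker.join()
print(len(uploaded), max_active)
```

The analysis tasks still run in parallel; only the upload step is serialized, which is the part suspected of overloading Blazegraph.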