roed314 opened this issue 2 weeks ago
@edgarcosta Can we use threaded gunicorn workers as described here, or will the Sage/Pari threading bug kill us in that case?
Correct:

```
worker_class = 'gthread'
threads = 1
```

leads to
```
CRITICAL:concurrent.futures@2024-11-13 17:27:10,396: Exception in worker
Traceback (most recent call last):
  File "/home/sage/sage-9.7/local/var/lib/sage/venv-python3.10.5/lib/python3.10/concurrent/futures/thread.py", line 83, in _worker
    work_item.run()
  File "/home/sage/sage-9.7/local/var/lib/sage/venv-python3.10.5/lib/python3.10/concurrent/futures/thread.py", line 60, in run
    self.future.set_exception(exc)
  File "/home/sage/sage-9.7/local/var/lib/sage/venv-python3.10.5/lib/python3.10/concurrent/futures/_base.py", line 555, in set_exception
    self._invoke_callbacks()
  File "/home/sage/sage-9.7/local/var/lib/sage/venv-python3.10.5/lib/python3.10/concurrent/futures/_base.py", line 330, in _invoke_callbacks
    callback(self)
  File "/home/sage/sage-9.7/local/var/lib/sage/venv-python3.10.5/lib/python3.10/site-packages/gunicorn/workers/gthread.py", line 249, in finish_request
    (keepalive, conn) = fs.result()
  File "/home/sage/sage-9.7/local/var/lib/sage/venv-python3.10.5/lib/python3.10/concurrent/futures/_base.py", line 439, in result
    return self.__get_result()
  File "/home/sage/sage-9.7/local/var/lib/sage/venv-python3.10.5/lib/python3.10/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/home/sage/sage-9.7/local/var/lib/sage/venv-python3.10.5/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/sage/sage-9.7/local/var/lib/sage/venv-python3.10.5/lib/python3.10/site-packages/gunicorn/workers/gthread.py", line 282, in handle
    keepalive = self.handle_request(req, conn)
  File "/home/sage/sage-9.7/local/var/lib/sage/venv-python3.10.5/lib/python3.10/site-packages/gunicorn/workers/gthread.py", line 334, in handle_request
    respiter = self.wsgi(environ, resp.start_response)
  File "/home/sage/sage-9.7/local/var/lib/sage/venv-python3.10.5/lib/python3.10/site-packages/flask/app.py", line 2548, in __call__
    return self.wsgi_app(environ, start_response)
  File "/home/lmfdb/lmfdb-git-olive/lmfdb/app.py", line 42, in __call__
    return self.app(environ, start_response)
  File "/home/sage/sage-9.7/local/var/lib/sage/venv-python3.10.5/lib/python3.10/site-packages/flask/app.py", line 2525, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/sage/sage-9.7/local/var/lib/sage/venv-python3.10.5/lib/python3.10/site-packages/flask/app.py", line 1820, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/sage/sage-9.7/local/var/lib/sage/venv-python3.10.5/lib/python3.10/site-packages/flask/app.py", line 1796, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/home/lmfdb/lmfdb-git-olive/lmfdb/classical_modular_forms/main.py", line 537, in by_url_newform_label
    return render_newform_webpage(label)
  File "/home/lmfdb/lmfdb-git-olive/lmfdb/classical_modular_forms/main.py", line 381, in render_newform_webpage
    friends=newform.friends,
  File "/home/lmfdb/lmfdb-git-olive/lmfdb/classical_modular_forms/web_newform.py", line 383, in friends
    if self.embedding_label is None and len(self.conrey_orbit) * self.rel_dim > 50:
  File "/home/lmfdb/lmfdb-git-olive/lmfdb/classical_modular_forms/web_newform.py", line 319, in conrey_orbit
    return ConreyCharacter(self.level,self.conrey_index).galois_orbit()
  File "/home/lmfdb/lmfdb-git-olive/lmfdb/characters/TinyConrey.py", line 108, in __init__
    self.G = pari("znstar({},1)".format(modulus))
  File "cypari2/pari_instance.pyx", line 841, in cypari2.pari_instance.Pari.__call__
  File "cypari2/gen.pyx", line 4813, in cypari2.gen.objtogen
  File "cypari2/convert.pyx", line 557, in cypari2.convert.PyObject_AsGEN
cysignals.signals.SignalError: Segmentation fault
```
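If the segfault really does come from concurrent entry into cypari2, one possible (untested) mitigation would be to serialize all Pari calls behind a single process-wide lock, trading away thread parallelism for Pari-heavy requests. The decorator below is only a sketch of that idea, not code from the LMFDB tree:

```python
import functools
import threading

# Hypothetical workaround sketch: if the crash is caused by two worker
# threads entering cypari2 at the same time, forcing every Pari call to
# hold one process-wide lock would prevent that -- at the cost of making
# Pari-heavy requests effectively single-threaded.  Untested against the
# actual Sage/Pari bug.
_pari_lock = threading.Lock()

def serialized(func):
    """Run func while holding the global Pari lock."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with _pari_lock:
            return func(*args, **kwargs)
    return wrapper

# Usage sketch: wrap the call site from the traceback, e.g. in TinyConrey.py:
#     self.G = serialized(pari)("znstar({},1)".format(modulus))
```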
In #5293 we made downloads asynchronous, but they are still limited by our 30-second gunicorn timeout. We currently prohibit downloads if we estimate that they will be larger than 100MB, but downloads smaller than this can still fail (for example, https://www.lmfdb.org/NumberField/?discriminant=-1000000-1000000 will probably fail unless you have a faster internet connection than @AndrewVSutherland or me).
In PR #5702 we tried to figure out how to avoid the timeout, but did not succeed (possibly because of an interaction with the Sage/Pari threading problem; I don't remember). @edgarcosta points out that there may be some useful advice for us here.
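One approach that is often suggested for long downloads is to stream the response in chunks rather than building the whole file before replying; whether that actually avoids gunicorn's timeout depends on the worker class and how its heartbeat works, which may be exactly what #5702 ran into. A minimal Flask sketch (the route and generator are illustrative, not LMFDB's actual download code):

```python
from flask import Flask, Response, stream_with_context

app = Flask(__name__)

@app.route("/download")
def download():
    # Stand-in generator: in a real app this would iterate over query
    # results, so data starts flowing before the full file exists.
    def generate():
        for i in range(1000):
            yield f"row {i}\n"

    # stream_with_context keeps the request context alive while the
    # generator is consumed; the client receives chunks incrementally.
    return Response(
        stream_with_context(generate()),
        mimetype="text/plain",
        headers={"Content-Disposition": "attachment; filename=data.txt"},
    )
```

Note this is a sketch of the streaming pattern only; it does not by itself resolve the timeout question above.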
If we do manage to undo the technical limit on large downloads, there's still a question of whether we want to impose a policy limit (for example, because of the bandwidth cost of users downloading huge amounts of data).
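If a policy limit were adopted, it could reuse the existing estimated-size check, just with a policy cap instead of the current technical one. A sketch (the 100MB figure is the current cutoff from above; the policy cap and function name are made up for illustration):

```python
# Illustrative size-policy check.  TECHNICAL_LIMIT matches the current
# 100MB estimated-size cutoff; POLICY_LIMIT is a hypothetical
# bandwidth-motivated cap that would apply once the timeout problem
# is solved.
TECHNICAL_LIMIT = 100 * 1024**2
POLICY_LIMIT = 500 * 1024**2  # placeholder value, not a decided policy

def download_allowed(estimated_bytes, technical_limit_lifted=False):
    """Return (ok, reason) for a download of the given estimated size."""
    limit = POLICY_LIMIT if technical_limit_lifted else TECHNICAL_LIMIT
    if estimated_bytes > limit:
        return False, f"estimated size {estimated_bytes} exceeds limit {limit}"
    return True, ""
```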