MolSSI / QCFractal

A distributed compute and database platform for quantum chemistry.
https://molssi.github.io/QCFractal/
BSD 3-Clause "New" or "Revised" License

Compute managers cause SQL error if an executor has no programs #807

Closed: bennybp closed this issue 3 months ago

bennybp commented 3 months ago

Compute managers are required to activate with a non-zero number of available programs. However, if a manager has two executors and one of them has no available programs, the server raises an SQL error when that executor tries to claim tasks.

There should be a check that every executor has at least one available program, and the server should also validate the program list more strictly when tasks are claimed.
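A minimal sketch of what the per-executor check could look like, assuming each executor's programs are available as a mapping of program name to version. The function names and the data layout here are hypothetical and are not taken from the QCFractal code base:

```python
from typing import Dict, List


def executors_without_programs(executor_programs: Dict[str, Dict[str, str]]) -> List[str]:
    """Return the names of executors that report no available programs.

    `executor_programs` maps an executor name to its {program: version} mapping
    (hypothetical layout, for illustration only).
    """
    return sorted(name for name, programs in executor_programs.items() if not programs)


def validate_executor_programs(executor_programs: Dict[str, Dict[str, str]]) -> None:
    """Raise ValueError unless every executor has at least one program."""
    if not executor_programs:
        raise ValueError("No executors are configured")

    empty = executors_without_programs(executor_programs)
    if empty:
        raise ValueError("Executors with no available programs: " + ", ".join(empty))
```

Running a check like this at startup would fail the manager early, with a message naming the offending executor, instead of surfacing later as a server-side SQL error.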

Additional context

SQL error below:

Traceback (most recent call last):
  File "/home/qcarchive/.local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1970, in _exec_single_context
    self.dialect.do_execute(
  File "/home/qcarchive/.local/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 924, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.IndeterminateDatatype: cannot determine type of empty array
LINE 7: WHERE base_record.status = 'waiting' AND ARRAY[] @> task_que...
                                                 ^
HINT:  Explicitly cast to the desired type, for example ARRAY[]::integer[].

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/qcarchive/.local/lib/python3.11/site-packages/flask/app.py", line 1463, in wsgi_app
    response = self.full_dispatch_request()
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/qcarchive/.local/lib/python3.11/site-packages/flask/app.py", line 872, in full_dispatch_request
    rv = self.handle_user_exception(e)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/qcarchive/.local/lib/python3.11/site-packages/flask/app.py", line 870, in full_dispatch_request
    rv = self.dispatch_request()
         ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/qcarchive/.local/lib/python3.11/site-packages/flask/app.py", line 855, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)  # type: ignore[no-any-return]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/qcarchive/.local/lib/python3.11/site-packages/qcfractal/flask_app/api_v1/helpers.py", line 95, in wrapper
    ret = fn(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^
  File "/home/qcarchive/.local/lib/python3.11/site-packages/qcfractal/components/tasks/routes.py", line 21, in claim_tasks_v1
    return storage_socket.tasks.claim_tasks(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/qcarchive/.local/lib/python3.11/site-packages/qcfractal/components/tasks/socket.py", line 334, in claim_tasks
    new_items = session.execute(stmt).all()
                ^^^^^^^^^^^^^^^^^^^^^
  File "/home/qcarchive/.local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 2306, in execute
    return self._execute_internal(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/qcarchive/.local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 2191, in _execute_internal
    result: Result[Any] = compile_state_cls.orm_execute_statement(
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/qcarchive/.local/lib/python3.11/site-packages/sqlalchemy/orm/context.py", line 293, in orm_execute_statement
    result = conn.execute(
             ^^^^^^^^^^^^^
  File "/home/qcarchive/.local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1421, in execute
    return meth(
           ^^^^^
  File "/home/qcarchive/.local/lib/python3.11/site-packages/sqlalchemy/sql/elements.py", line 514, in _execute_on_connection
    return connection._execute_clauseelement(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/qcarchive/.local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1643, in _execute_clauseelement
    ret = self._execute_context(
          ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/qcarchive/.local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1849, in _execute_context
    return self._exec_single_context(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/qcarchive/.local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1989, in _exec_single_context
    self._handle_dbapi_exception(
  File "/home/qcarchive/.local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 2356, in _handle_dbapi_exception
    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
  File "/home/qcarchive/.local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1970, in _exec_single_context
    self.dialect.do_execute(
  File "/home/qcarchive/.local/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 924, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.IndeterminateDatatype) cannot determine type of empty array
LINE 7: WHERE base_record.status = 'waiting' AND ARRAY[] @> task_que...
                                                 ^
HINT:  Explicitly cast to the desired type, for example ARRAY[]::integer[].

[SQL: WITH anon_1 AS 
(SELECT base_record_1.id AS record_id, least(base_record_1.created_on, min(base_record_2.created_on)) AS created_on 
FROM base_record AS base_record_1 JOIN service_dependency ON service_dependency.record_id = base_record_1.id JOIN service_queue ON service_queue.id = service_dependency.service_id JOIN base_record AS base_record_2 ON base_record_2.id = service_queue.record_id 
WHERE base_record_1.status = %(status_1)s GROUP BY base_record_1.id ORDER BY created_on ASC)
 SELECT task_queue.id, task_queue.function, task_queue.function_kwargs_compressed, task_queue.required_programs, task_queue.tag, task_queue.priority, task_queue.record_id, base_record.record_type, base_record.id AS id_1, base_record.status, base_record.manager_name, base_record.modified_on 
FROM task_queue JOIN base_record ON base_record.id = task_queue.record_id LEFT OUTER JOIN anon_1 ON anon_1.record_id = base_record.id 
WHERE base_record.status = %(status_2)s AND ARRAY[] @> task_queue.required_programs ORDER BY task_queue.priority DESC, least(base_record.created_on, anon_1.created_on) ASC 
 LIMIT %(param_1)s FOR UPDATE OF base_record, task_queue SKIP LOCKED]
[parameters: {'status_1': 'waiting', 'status_2': 'waiting', 'param_1': 144}]
(Background on this error at: https://sqlalche.me/e/20/f405)
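
The statement above fails because the program list sent with the claim is empty, so it is rendered as an untyped `ARRAY[]`. One possible server-side guard, sketched here under the assumption that the claim request carries a plain list of program names, is to reject an empty list before any SQL is built. `NoProgramsError` and `normalize_claim_programs` are hypothetical names, not part of QCFractal:

```python
from typing import Iterable, List


class NoProgramsError(ValueError):
    """Hypothetical error for a claim request that lists no programs."""


def normalize_claim_programs(programs: Iterable[str]) -> List[str]:
    """Validate the program names sent with a claim request.

    Blank entries are dropped and duplicates are removed. An empty result is
    rejected so that the query never has to render an empty array.
    """
    cleaned = sorted({p.strip() for p in programs if p and p.strip()})
    if not cleaned:
        raise NoProgramsError("A claim request must list at least one program")
    return cleaned
```

With a guard of this kind the route could return a clear client error to the manager rather than a `ProgrammingError` from the database. Casting the empty array explicitly, as the PostgreSQL hint suggests, would also avoid the exception, but the claim would then only match tasks that require no programs at all.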