kasnerz / factgenie

A Toolkit for Annotating and Visualizing LLM Hallucinations
MIT License

Pausing LLM evaluation causes bug #22

Open oplatek opened 2 weeks ago

oplatek commented 2 weeks ago
2024-06-21 14:46:55 INFO llm-eval-1: 36/492 examples
2024-06-21 14:46:55 ERROR Exception on /llm_eval/run [POST]
Traceback (most recent call last):
  File "/lnet/express/work/people/oplatek/factgenie/venv/lib/python3.11/site-packages/flask/app.py", line 1473, in wsgi_app
    response = self.full_dispatch_request()
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lnet/express/work/people/oplatek/factgenie/venv/lib/python3.11/site-packages/flask/app.py", line 882, in full_dispatch_request
    rv = self.handle_user_exception(e)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lnet/express/work/people/oplatek/factgenie/venv/lib/python3.11/site-packages/flask/app.py", line 880, in full_dispatch_request
    rv = self.dispatch_request()
         ^^^^^^^^^^^^^^^^^^^^^^^
  File "/lnet/express/work/people/oplatek/factgenie/venv/lib/python3.11/site-packages/flask/app.py", line 865, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)  # type: ignore[no-any-return]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lnet/express/work/people/oplatek/factgenie/factgenie/main.py", line 494, in llm_eval_run
    return utils.run_llm_eval(app, campaign_id)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lnet/express/work/people/oplatek/factgenie/factgenie/utils.py", line 433, in run_llm_eval
    if db.status.unique() == "finished":
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
2024-06-21 14:46:55 INFO 10.10.24.254 - - [21/Jun/2024 14:46:55] "POST /llm_eval/run HTTP/1.1" 500 -
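
A possible reading of the traceback: `db.status.unique()` returns an array with one entry per distinct status, so comparing it to `"finished"` gives an array of booleans rather than a single bool. When a run is paused, the table presumably holds more than one status, and `if` cannot reduce a multi-element array to a truth value. The sketch below reproduces the error and shows one way the check could be written instead. The helper name `all_finished`, the sample tables, and the `"free"` status value are assumptions made for illustration and are not taken from the factgenie code.

```python
import pandas as pd


def all_finished(db: pd.DataFrame) -> bool:
    """Return True only if every row's status is 'finished'."""
    # Compare element-wise, then reduce explicitly to a single bool.
    return bool((db["status"] == "finished").all())


# Hypothetical campaign tables; "free" is an assumed status for unprocessed rows.
# dtype=object keeps unique() returning a plain NumPy array.
paused = pd.DataFrame({"status": pd.Series(["finished", "finished", "free"], dtype=object)})
done = pd.DataFrame({"status": pd.Series(["finished", "finished"], dtype=object)})

# The comparison from the traceback: two distinct statuses give a
# two-element boolean array, which `if` refuses to evaluate.
try:
    if paused["status"].unique() == "finished":
        pass
    reproduced = False
except ValueError:
    reproduced = True

print(reproduced, all_finished(paused), all_finished(done))
```

Note that `.all()` on an empty table returns `True`, so a campaign with no rows would count as finished under this sketch; whether that is the desired behaviour depends on how `run_llm_eval` is meant to treat empty campaigns.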