huggingface / dataset-viewer

Backend that powers the dataset viewer on Hugging Face dataset pages through a public API.
https://huggingface.co/docs/dataset-viewer
Apache License 2.0

Duckdb Con - Error Invalid Input Error: Cannot change configuration option "extension_directory" - the configuration has been locked #2682

Open AndreaFrancis opened 6 months ago

AndreaFrancis commented 6 months ago

After some time, the following error appears for some search calls:

INFO:     10.0.15.198:45176 - "OPTIONS /search?dataset=jp1924%2FVisualQuestionAnswering&config=default&split=train&offset=0&length=100&query=%EC%B4%88%EB%A1%9D%EC%83%89 HTTP/1.1" 200 OK
INFO: 2024-04-08 17:02:13,422 - root - /search dataset='jp1924/VisualQuestionAnswering' config='default' split='train' query='초록색' offset=0 length=100
ERROR: 2024-04-08 17:02:13,427 - root - Unexpected error.
Traceback (most recent call last):
  File "/src/services/search/src/search/routes/search.py", line 209, in search_endpoint
    num_rows_total, pa_table = await anyio.to_thread.run_sync(
  File "/src/services/search/.venv/lib/python3.9/site-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/src/services/search/.venv/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "/src/services/search/.venv/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "/src/services/search/src/search/routes/search.py", line 68, in full_text_search
    with duckdb_connect(extensions_directory=extensions_directory, database=index_file_location) as con:
  File "/src/services/search/src/search/duckdb_connection.py", line 12, in duckdb_connect
    con.execute(SET_EXTENSIONS_DIRECTORY_COMMAND.format(directory=extensions_directory))
duckdb.duckdb.InvalidInputException: Invalid Input Error: Cannot change configuration option "extension_directory" - the configuration has been locked
INFO:     10.0.29.97:35770 - "GET /search?dataset=jp1924%2FVisualQuestionAnswering&config=default&split=train&offset=0&length=100&query=%EC%B4%88%EB%A1%9D%EC%83%89 HTTP/1.1" 500 Internal Server Error
INFO: 2024-04-08 17:02:13,429 - root - /healthcheck
INFO: 2024-04-08 17:02:13,429 - root - /healthcheck
AndreaFrancis commented 6 months ago

This looks like an issue with the DuckDB connection itself; the same error also happened for filter.
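One possible mitigation, sketched under assumptions: DuckDB raises this error for any `SET` issued after `SET lock_configuration=true` has taken effect, so the failing `SET extension_directory` presumably reaches a database instance whose configuration was already locked by an earlier connection. The helper below is hypothetical (`set_extension_directory` and `LOCKED_MARKER` are not names from the repository). It tolerates the locked case and re-raises anything else, on the assumption that the earlier connection already set the same directory.

```python
# Hypothetical helper, not the project's actual code.
# `con` is anything exposing execute(sql), e.g. a duckdb connection.

# Substring taken from the error message in the traceback above.
LOCKED_MARKER = "the configuration has been locked"


def set_extension_directory(con, directory: str) -> bool:
    """Try to set extension_directory on `con`.

    Returns True if the option was set, False if the configuration was
    already locked. Any other error is re-raised unchanged.
    """
    # Escape single quotes so the path is a valid SQL string literal.
    escaped = directory.replace("'", "''")
    try:
        con.execute(f"SET extension_directory='{escaped}'")
    except Exception as err:
        # Assumption: a locked configuration means an earlier connection
        # already configured this instance, so skipping the SET is safe.
        if LOCKED_MARKER in str(err):
            return False
        raise
    return True
```

With a real connection this would be called right after `duckdb.connect(...)`, in place of the unconditional `con.execute(...)` shown in the traceback. Whether skipping is actually safe depends on every caller using the same extensions directory, which this sketch does not verify.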
