huggingface / dataset-viewer

Lightweight web API for visualizing and exploring any dataset - computer vision, speech, text, and tabular - stored on the Hugging Face Hub
https://huggingface.co/docs/datasets-server
Apache License 2.0

persisting CreateCommitError #2766

Open severo opened 2 months ago

severo commented 2 months ago

For the dataset https://huggingface.co/datasets/venetis/VMMRdb_make_model_test, we get the same error after 30 retries: CreateCommitError. Another affected dataset: https://huggingface.co/datasets/celsowm/stack-exchange-paired-mini-1k

Also, two other datasets have had the CreateCommitError for more than a month, so it does not seem to be the same issue:

severo commented 2 months ago

see https://github.com/huggingface/dataset-viewer/pull/2758#issuecomment-2090313327

severo commented 2 months ago

Same for the LockedDatasetTimeoutError error: 6 entries have not been retried for more than a month. I'm not sure why they are not removed during the daily backfill.

severo commented 2 months ago

For https://huggingface.co/datasets/re-align/UnifiedChat, for example, we have entries, including an error with LockedDatasetTimeoutError, for config default. But this config does not exist anymore:

[Screenshot, 2024-05-02 14:59:02]

So, the issue seems to be in the backfill process: we are not checking whether cache entries that should have been deleted still exist.

It's a different issue, so I opened https://github.com/huggingface/dataset-viewer/issues/2767
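To make the backfill gap described above concrete, here is a minimal sketch of the missing check, assuming a simplified in-memory view of the cache. `CacheEntry`, `find_stale_entries`, and the `other_config` name are hypothetical and are not the dataset-viewer API; only the `default` config and the `LockedDatasetTimeoutError` code come from the report above.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class CacheEntry:
    """Hypothetical, simplified view of one cache entry."""

    kind: str
    dataset: str
    config: Optional[str] = None
    error_code: Optional[str] = None


def find_stale_entries(
    entries: Iterable[CacheEntry], existing_configs: Iterable[str]
) -> List[CacheEntry]:
    """Return config-level entries whose config is no longer in the dataset."""
    existing = set(existing_configs)
    return [e for e in entries if e.config is not None and e.config not in existing]


# Illustrative values only: "default" is the config reported above;
# "other_config" is a made-up name for a config that still exists.
entries = [
    CacheEntry("config-parquet-and-info", "re-align/UnifiedChat", "default", "LockedDatasetTimeoutError"),
    CacheEntry("config-parquet-and-info", "re-align/UnifiedChat", "other_config"),
    CacheEntry("dataset-config-names", "re-align/UnifiedChat"),
]
stale = find_stale_entries(entries, existing_configs=["other_config"])
for entry in stale:
    print(f"should be deleted: {entry.kind} / {entry.config} ({entry.error_code})")
```

Dataset-level entries (no config) are left alone here; a real backfill would presumably need the same kind of check at the split level.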

severo commented 2 months ago

For https://huggingface.co/datasets/venetis/VMMRdb_make_model_test, the traceback is:

{
    "error": "Commit 0/1 could not be created on the Hub (after 6 attempts).",
    "cause_exception": "BadRequestError",
    "cause_message": " (Request ID: Root=1-6634b9f0-49d2d36a185cf9530aac8e1f;1742b1b1-75f5-45d7-99da-95a638b09d29)\n\nBad request for commit endpoint:\nYour push was rejected because an LFS pointer pointed to a file that does not exist. For instance, this can happen if you used git push --no-verify to push your changes. Offending file: - default/train/0000.parquet",
    "cause_traceback": [
        "Traceback (most recent call last):\n",
        ' File "/src/services/worker/.venv/lib/python3.9/site-packages/huggingface_hub/utils/_errors.py", line 304, in hf_raise_for_status\n response.raise_for_status()\n',
        ' File "/src/services/worker/.venv/lib/python3.9/site-packages/requests/models.py", line 1021, in raise_for_status\n raise HTTPError(http_error_msg, response=self)\n',
        "requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://huggingface.co/api/datasets/venetis/VMMRdb_make_model_test/commit/refs%2Fconvert%2Fparquet\n",
        "\nThe above exception was the direct cause of the following exception:\n\n",
        "Traceback (most recent call last):\n",
        ' File "/src/libs/libcommon/src/libcommon/utils.py", line 183, in decorator\n return func(*args, **kwargs)\n',
        ' File "/src/services/worker/.venv/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 119, in _inner_fn\n return fn(*args, **kwargs)\n',
        ' File "/src/services/worker/.venv/lib/python3.9/site-packages/huggingface_hub/hf_api.py", line 1230, in _inner\n return fn(self, *args, **kwargs)\n',
        ' File "/src/services/worker/.venv/lib/python3.9/site-packages/huggingface_hub/hf_api.py", line 3812, in create_commit\n hf_raise_for_status(commit_resp, endpoint_name="commit")\n',
        ' File "/src/services/worker/.venv/lib/python3.9/site-packages/huggingface_hub/utils/_errors.py", line 358, in hf_raise_for_status\n raise BadRequestError(message, response=response) from e\n',
        "huggingface_hub.utils._errors.BadRequestError: (Request ID: Root=1-6634b9f0-49d2d36a185cf9530aac8e1f;1742b1b1-75f5-45d7-99da-95a638b09d29)\n\nBad request for commit endpoint:\nYour push was rejected because an LFS pointer pointed to a file that does not exist. For instance, this can happen if you used git push --no-verify to push your changes. Offending file: - default/train/0000.parquet\n",
        "\nThe above exception was the direct cause of the following exception:\n\n",
        "Traceback (most recent call last):\n",
        ' File "/src/services/worker/src/worker/job_runners/config/parquet_and_info.py", line 1003, in create_commits\n commit_info = retry_create_commit(\n',
        ' File "/src/libs/libcommon/src/libcommon/utils.py", line 188, in decorator\n raise RuntimeError(f"Give up after {attempt} attempts. The last one raised {type(last_err)}") from last_err\n',
        "RuntimeError: Give up after 6 attempts. The last one raised <class 'huggingface_hub.utils._errors.BadRequestError'>\n",
    ],
}

Hence, the specific error is:

huggingface_hub.utils._errors.BadRequestError: (Request ID: Root=1-6634b9f0-49d2d36a185cf9530aac8e1f;1742b1b1-75f5-45d7-99da-95a638b09d29)

Bad request for commit endpoint:
Your push was rejected because an LFS pointer pointed to a file that does not exist. For instance, this can happen if you used git push --no-verify to push your changes. Offending file: - default/train/0000.parquet
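The exception chain in the traceback (a RuntimeError raised `from` the last BadRequestError) is what a retry wrapper of the following shape produces. This is a sketch reconstructed from the error message only, not the actual `libcommon.utils` code; the `retry` signature and the local `BadRequestError` class are assumptions for illustration.

```python
import functools
import time


class BadRequestError(Exception):
    """Local stand-in for huggingface_hub's BadRequestError (assumption)."""


def retry(on, sleeps):
    """Hypothetical retry decorator: one attempt per entry in `sleeps`."""

    def wrap(func):
        @functools.wraps(func)
        def decorator(*args, **kwargs):
            last_err = None
            attempt = 0
            for delay in sleeps:
                attempt += 1
                try:
                    return func(*args, **kwargs)
                except tuple(on) as err:
                    last_err = err
                    time.sleep(delay)
            # Chain the last failure so it shows up as the direct cause.
            raise RuntimeError(
                f"Give up after {attempt} attempts. The last one raised {type(last_err)}"
            ) from last_err

        return decorator

    return wrap


calls = []


@retry(on=[BadRequestError], sleeps=[0, 0, 0, 0, 0, 0])
def create_commit():
    # Fails the same way on every attempt, like the 400 response above.
    calls.append(1)
    raise BadRequestError("LFS pointer pointed to a file that does not exist")


try:
    create_commit()
except RuntimeError as err:
    final_error = err

print(final_error)
print(type(final_error.__cause__).__name__)
```

If the server rejects the request for the same reason each time, every attempt ends identically, which would be consistent with the error persisting across retries.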

The dataset only has one data file (Parquet): https://huggingface.co/datasets/venetis/VMMRdb_make_model_test/tree/main/data

The current content of the refs/convert/parquet branch is:

[Screenshot, 2024-05-03 12:21:39]
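For reference, the branch content can also be listed programmatically. The sketch below assumes an object that behaves like `huggingface_hub.HfApi` and uses its `list_repo_files` method; `list_branch_files` and `is_on_branch` are hypothetical helpers. It only lists paths and does not verify that the LFS object behind a pointer actually exists.

```python
def list_branch_files(api, repo_id, revision="refs/convert/parquet"):
    """Return the sorted file paths found on `revision` of a dataset repo.

    `api` is expected to expose `list_repo_files` like huggingface_hub.HfApi.
    """
    return sorted(api.list_repo_files(repo_id, repo_type="dataset", revision=revision))


def is_on_branch(api, repo_id, path, revision="refs/convert/parquet"):
    """Tell whether `path` is listed on the given revision."""
    return path in list_branch_files(api, repo_id, revision=revision)


# Usage (requires network access and the huggingface_hub package):
#
#   from huggingface_hub import HfApi
#   is_on_branch(HfApi(), "venetis/VMMRdb_make_model_test", "default/train/0000.parquet")
```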

Do you have any idea what could be happening, @lhoestq?