Open JohnGiorgi opened 1 year ago
Hi! Metric-related issues should be posted in the evaluate repository - happy to help from there ;)
Could you try passing download_config=DownloadConfig(use_etag=False) to datasets.load_metric() or evaluate.load()?
You might be hitting this issue because the library tries to reach the URL to fetch the file's ETag, which the cache uses.
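For context, here is a minimal sketch of why the ETag matters to a download cache: if the cached file name is derived from both the URL and the ETag, the loader cannot tell which cached file to open without either asking the server for the ETag or skipping it. The function name `cache_filename`, the URL, and the ETag value are all hypothetical; this only illustrates the idea and is not the actual `datasets` implementation.

```python
import hashlib
from typing import Optional


def cache_filename(url: str, etag: Optional[str] = None) -> str:
    """Hypothetical helper: derive a cache file name from a URL and an optional ETag."""
    name = hashlib.sha256(url.encode("utf-8")).hexdigest()
    if etag:
        # A different ETag (a different remote version) yields a different file name.
        name += "." + hashlib.sha256(etag.encode("utf-8")).hexdigest()
    return name


url = "https://example.com/model.zip"  # placeholder URL
with_etag = cache_filename(url, etag='"abc123"')  # placeholder ETag
without_etag = cache_filename(url)

# In this sketch the two names differ, so a lookup that ignores the ETag
# will not find a file stored under the ETag-qualified name.
print(with_etag != without_etag)
print(with_etag.startswith(without_etag))
```

Under this scheme, whether `use_etag=False` helps would depend on whether the file was originally cached with or without the ETag in its name.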
No dice, it seems. I tried the following, but it hung and eventually failed in offline mode.
While online:
import evaluate
from datasets import DownloadConfig
from transformers.utils import is_offline_mode
assert not is_offline_mode()
bleurt = evaluate.load("bleurt", "BLEURT-20")
While offline:
import evaluate
from datasets import DownloadConfig
from transformers.utils import is_offline_mode
assert is_offline_mode()
bleurt = evaluate.load("bleurt", "BLEURT-20", download_config=DownloadConfig(use_etag=False))
import evaluate
from datasets import DownloadConfig
from transformers.utils import is_offline_mode
import os
os.environ["HF_DATASETS_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"
assert is_offline_mode()
bleurt = evaluate.load("bleurt", "BLEURT-20", download_config=DownloadConfig(use_etag=False))
Any help would be appreciated @lhoestq 😅
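One thing worth ruling out in the last snippet above: as far as I know, `datasets` and `transformers` read their offline flags when they are first imported, so setting `os.environ` after the imports may have no effect in that process. A minimal sketch of one way around this is to set the variables in the environment of a fresh interpreter. The helper name `offline_env` is hypothetical, and the child command here only echoes the variables back rather than loading BLEURT.

```python
import os
import subprocess
import sys


def offline_env() -> dict:
    """Hypothetical helper: copy the current environment with the offline flags set."""
    env = dict(os.environ)
    env["HF_DATASETS_OFFLINE"] = "1"
    env["TRANSFORMERS_OFFLINE"] = "1"
    return env


# Stand-in for the real script: print the flags as the child process sees them
# before it imports anything else.
child_code = (
    "import os; "
    "print(os.environ['HF_DATASETS_OFFLINE'], os.environ['TRANSFORMERS_OFFLINE'])"
)
result = subprocess.run(
    [sys.executable, "-c", child_code],
    env=offline_env(),
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())
```

Exporting the two variables in the shell before launching Python should have the same effect.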
Same here.
Same here.
Describe the bug
Trying to use BLEURT in offline mode fails. The script and model weights are cached to disk fine (when in online mode). In offline mode, it loads the script from the cache fine, but when trying to load the cached model weights, it throws an error.
It looks like the bug exists somewhere in the get_from_cache function, as the error is thrown from here: https://github.com/huggingface/datasets/blob/f96547708a889c09ca8a02ed7aadd8c5690503c5/src/datasets/utils/file_utils.py#L530
Steps to reproduce the bug
Steps to reproduce the behaviour:
Gives the following error:
Expected behavior
I would expect that, after loading the metric as
bleurt = load_metric("bleurt")
with an internet connection, it would be cached locally, and that I could then load it from this cache without an internet connection. I also considered manually specifying the cached model filepath, but this doesn't work either, as the metric loading scripts expect the model checkpoint to be one of:
https://github.com/huggingface/datasets/blob/f96547708a889c09ca8a02ed7aadd8c5690503c5/metrics/bleurt/bleurt.py#L64-L75
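To illustrate why a file path gets rejected, here is a rough sketch of a name-based checkpoint lookup. The dictionary holds only an illustrative subset of names with placeholder URLs, and `resolve_checkpoint` is a hypothetical function, not the code in the linked script.

```python
# Illustrative subset only; the URLs are placeholders, not the real download links.
CHECKPOINT_URLS = {
    "bleurt-base-128": "https://example.com/bleurt-base-128.zip",
    "BLEURT-20": "https://example.com/BLEURT-20.zip",
}


def resolve_checkpoint(config_name: str) -> str:
    """Hypothetical lookup: accept only known checkpoint names, in either case."""
    for candidate in (config_name, config_name.lower(), config_name.upper()):
        if candidate in CHECKPOINT_URLS:
            return candidate
    raise KeyError(f"{config_name} is not one of {sorted(CHECKPOINT_URLS)}")


print(resolve_checkpoint("bleurt-20"))

try:
    resolve_checkpoint("/path/to/cached/BLEURT-20")  # placeholder local path
except KeyError as err:
    print("rejected:", err)
```

With a lookup like this, a local path never matches a known name, so the offline load would have to succeed through the cache rather than through a manually supplied path.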
Environment info
I installed datasets from main with
pip install git+https://github.com/huggingface/datasets.git