Open fake-name opened 5 months ago

How would you like to use vllm

I'm trying to use a specific branch of `bartowski/Yi-34B-200K-RPMerge-exl2` (https://huggingface.co/bartowski/Yi-34B-200K-RPMerge-exl2). Specifically, this repo has no content in its main branch; the various quantizations live in separate branches. I want `6_5`.

The documentation says "`--revision`: The specific model version to use. It can be a branch name, a tag name, or a commit id. If unspecified, will use the default version." That sounds like it's how I can specify a specific branch, but it doesn't work. I've also tried sticking the branch name in `--code-revision` (because why not); it had no effect there either.

Searching the existing issues for something like "huggingface branch" yields 15 pages. I went through the first 3 or so without much luck. This is, unfortunately, a nearly unsearchable set of terms.
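For reference, you can confirm which branch names actually exist on the repo by listing its refs. A minimal sketch using `huggingface_hub` (assuming a version recent enough to have `list_repo_refs`):

```python
from huggingface_hub import list_repo_refs

# List every branch on the repo; the quantization branches (e.g. 6_5)
# should show up here alongside main.
refs = list_repo_refs("bartowski/Yi-34B-200K-RPMerge-exl2")
for branch in refs.branches:
    print(branch.name, branch.target_commit)
```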
Ok, I did some more experimentation:
```
durr@learner:~/vllm$ python3 -m vllm.entrypoints.openai.api_server --model "bartowski/Yi-34B-200K-RPMerge-exl2" --revision "resolve/6_5"
/home/durr/miniconda3/envs/venv/lib/python3.9/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Traceback (most recent call last):
  File "/home/durr/miniconda3/envs/venv/lib/python3.9/site-packages/huggingface_hub/utils/_errors.py", line 304, in hf_raise_for_status
    response.raise_for_status()
  File "/home/durr/miniconda3/envs/venv/lib/python3.9/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/bartowski/Yi-34B-200K-RPMerge-exl2/resolve/resolve%2F6_5/config.json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/durr/miniconda3/envs/venv/lib/python3.9/site-packages/transformers/utils/hub.py", line 399, in cached_file
    resolved_file = hf_hub_download(
  File "/home/durr/miniconda3/envs/venv/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/durr/miniconda3/envs/venv/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1221, in hf_hub_download
    return _hf_hub_download_to_cache_dir(
  File "/home/durr/miniconda3/envs/venv/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1282, in _hf_hub_download_to_cache_dir
    (url_to_download, etag, commit_hash, expected_size, head_call_error) = _get_metadata_or_catch_error(
  File "/home/durr/miniconda3/envs/venv/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1722, in _get_metadata_or_catch_error
    metadata = get_hf_file_metadata(url=url, proxies=proxies, timeout=etag_timeout, headers=headers)
  File "/home/durr/miniconda3/envs/venv/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/durr/miniconda3/envs/venv/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1645, in get_hf_file_metadata
    r = _request_wrapper(
  File "/home/durr/miniconda3/envs/venv/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 372, in _request_wrapper
    response = _request_wrapper(
  File "/home/durr/miniconda3/envs/venv/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 396, in _request_wrapper
    hf_raise_for_status(response)
  File "/home/durr/miniconda3/envs/venv/lib/python3.9/site-packages/huggingface_hub/utils/_errors.py", line 311, in hf_raise_for_status
    raise RevisionNotFoundError(message, response) from e
huggingface_hub.utils._errors.RevisionNotFoundError: 404 Client Error. (Request ID: Root=1-6667fa9c-6e6925e966e7f2a14334fa98;0dda972e-b7af-4efb-af52-976433043987)

Revision Not Found for url: https://huggingface.co/bartowski/Yi-34B-200K-RPMerge-exl2/resolve/resolve%2F6_5/config.json.
Invalid rev id: resolve/6_5

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/durr/miniconda3/envs/venv/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/durr/miniconda3/envs/venv/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/durr/miniconda3/envs/venv/lib/python3.9/site-packages/vllm/entrypoints/openai/api_server.py", line 186, in <module>
    engine = AsyncLLMEngine.from_engine_args(
  File "/home/durr/miniconda3/envs/venv/lib/python3.9/site-packages/vllm/engine/async_llm_engine.py", line 362, in from_engine_args
    engine_config = engine_args.create_engine_config()
  File "/home/durr/miniconda3/envs/venv/lib/python3.9/site-packages/vllm/engine/arg_utils.py", line 559, in create_engine_config
    model_config = ModelConfig(
  File "/home/durr/miniconda3/envs/venv/lib/python3.9/site-packages/vllm/config.py", line 129, in __init__
    self.hf_config = get_config(self.model, trust_remote_code, revision,
  File "/home/durr/miniconda3/envs/venv/lib/python3.9/site-packages/vllm/transformers_utils/config.py", line 27, in get_config
    config = AutoConfig.from_pretrained(
  File "/home/durr/miniconda3/envs/venv/lib/python3.9/site-packages/transformers/models/auto/configuration_auto.py", line 934, in from_pretrained
    config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/home/durr/miniconda3/envs/venv/lib/python3.9/site-packages/transformers/configuration_utils.py", line 632, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/home/durr/miniconda3/envs/venv/lib/python3.9/site-packages/transformers/configuration_utils.py", line 689, in _get_config_dict
    resolved_config_file = cached_file(
  File "/home/durr/miniconda3/envs/venv/lib/python3.9/site-packages/transformers/utils/hub.py", line 429, in cached_file
    raise EnvironmentError(
OSError: resolve/6_5 is not a valid git identifier (branch name, tag name or commit id) that exists for this model name. Check the model page at 'https://huggingface.co/bartowski/Yi-34B-200K-RPMerge-exl2' for available revisions.
```
So it seems like the revision value is being used, but the current downloader assumes that a bunch of the files are also available on the main branch. It tries to fetch the `config.json` from the branch specified in `--revision`, but the `tokenizer_config.json` from the main branch.
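You can see the split behavior outside of vLLM by pinning the revision yourself. A rough sketch using `huggingface_hub`'s `hf_hub_download` (the same call the transformers cache machinery bottoms out in, per the traceback above):

```python
from huggingface_hub import hf_hub_download

repo = "bartowski/Yi-34B-200K-RPMerge-exl2"

# Pinning the revision to the 6_5 branch resolves config.json fine:
print(hf_hub_download(repo, "config.json", revision="6_5"))

# With no revision argument the hub defaults to main, which is effectively
# what a component that ignores --revision does. Since main is empty on
# this repo (per the description above), this call raises an error.
print(hf_hub_download(repo, "tokenizer_config.json"))
```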
Ok, apparently you have to specify all the `--*revision` flags. `--revision` seems to only set the branch for the actual weights:
```
python3 -m vllm.entrypoints.openai.api_server \
    --model "bartowski/Yi-34B-200K-RPMerge-exl2" \
    --revision 6_5 \
    --code-revision 6_5 \
    --tokenizer-revision 6_5
```
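For completeness, the offline API appears to accept the same three values as constructor kwargs; a sketch, with the kwarg names assumed to mirror the CLI flags via `EngineArgs` (I haven't verified this against every vLLM version):

```python
from vllm import LLM

# Offline-API equivalent of the server invocation above; kwarg names
# assumed to mirror the CLI flags (revision / code_revision /
# tokenizer_revision).
llm = LLM(
    model="bartowski/Yi-34B-200K-RPMerge-exl2",
    revision="6_5",
    code_revision="6_5",
    tokenizer_revision="6_5",
)
```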
I would have assumed that `--revision` would set all the various `*-revision` options. Maybe the unqualified `--revision` should be renamed `--weights-revision` or something.

I'd argue that `--revision` should also set `--code-revision` and `--tokenizer-revision` unless they're also specified on the command line, though that might be something of a breaking change.
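Concretely, the fallback could be as small as this; a sketch of the proposed behavior (not vLLM's actual argument parsing):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--revision", default=None)
parser.add_argument("--code-revision", default=None)
parser.add_argument("--tokenizer-revision", default=None)
args = parser.parse_args(["--revision", "6_5"])

# Let --revision act as the default for the other *-revision flags,
# while still honoring them when they are passed explicitly.
for name in ("code_revision", "tokenizer_revision"):
    if getattr(args, name) is None:
        setattr(args, name, args.revision)

print(args)  # all three revisions now resolve to 6_5
```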
I'm on this. What do you think @DarkLight1337? Should `--revision` apply to all the revisions, or is just renaming it sufficient?
Let's rename it first. Afterwards (in another PR) we can introduce a new CLI option to set the revision for all components.
Aight, Thanks!
This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!