runpod-workers / worker-vllm

The RunPod worker template for serving our large language model endpoints. Powered by vLLM.
MIT License
213 stars 82 forks

trust_remote_code not recognized #43

Closed: dannysemi closed this issue 6 months ago

dannysemi commented 6 months ago
2024-02-08T08:40:56.016480879Z engine.py           :43   2024-02-08 08:40:56,015 vLLM config: {'model': 'TheBloke/Nous-Capybara-34B-AWQ', 'download_dir': '/runpod-volume/huggingface-cache/hub', 'quantization': 'awq', 'load_format': 'auto', 'dtype': 'half', 'tokenizer': None, 'disable_log_stats': True, 'disable_log_requests': True, 'trust_remote_code': True, 'gpu_memory_utilization': 0.95, 'max_parallel_loading_workers': 48, 'max_model_len': 32000, 'tensor_parallel_size': 1}
2024-02-08T08:40:56.161618385Z Traceback (most recent call last):
2024-02-08T08:40:56.161651438Z   File "/usr/local/lib/python3.10/dist-packages/transformers/dynamic_module_utils.py", line 598, in resolve_trust_remote_code
2024-02-08T08:40:56.161703147Z The repository for TheBloke/Nous-Capybara-34B-AWQ contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co/TheBloke/Nous-Capybara-34B-AWQ.
2024-02-08T08:40:56.161725757Z You can avoid this prompt in future by passing the argument `trust_remote_code=True`.
2024-02-08T08:40:56.161730573Z 
2024-02-08T08:40:56.162176340Z     answer = input(
2024-02-08T08:40:56.162202300Z EOFError: EOF when reading a line
2024-02-08T08:40:56.162207723Z 
2024-02-08T08:40:56.162212176Z During handling of the above exception, another exception occurred:
2024-02-08T08:40:56.162219295Z 
2024-02-08T08:40:56.162223415Z Traceback (most recent call last):
2024-02-08T08:40:56.162227640Z   File "/src/handler.py", line 5, in <module>
2024-02-08T08:40:56.162232250Z     vllm_engine = vLLMEngine()
2024-02-08T08:40:56.162237292Z   File "/src/engine.py", line 44, in __init__
2024-02-08T08:40:56.162241842Z     self.tokenizer = Tokenizer(os.getenv("TOKENIZER_NAME", os.getenv("MODEL_NAME")))
2024-02-08T08:40:56.162246872Z   File "/src/engine.py", line 17, in __init__
2024-02-08T08:40:56.162251149Z     self.tokenizer = AutoTokenizer.from_pretrained(model_name)
2024-02-08T08:40:56.162256589Z   File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/tokenization_auto.py", line 788, in from_pretrained
2024-02-08T08:40:56.162569437Z     trust_remote_code = resolve_trust_remote_code(
2024-02-08T08:40:56.162595802Z   File "/usr/local/lib/python3.10/dist-packages/transformers/dynamic_module_utils.py", line 611, in resolve_trust_remote_code
2024-02-08T08:40:56.162620665Z     raise ValueError(
2024-02-08T08:40:56.162626086Z ValueError: The repository for TheBloke/Nous-Capybara-34B-AWQ contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co/TheBloke/Nous-Capybara-34B-AWQ.
2024-02-08T08:40:56.162635356Z Please pass the argument `trust_remote_code=True` to allow custom code to be run.
2024-02-08T08:40:56.903281462Z Do you wish to run the custom code? [y/N] 
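The EOFError at the top of the traceback is what happens when transformers falls back to its interactive confirmation prompt (`answer = input(...)`) inside a container that has no terminal attached. A minimal reproduction of that failure mode (this is an illustration, not the worker's actual code):

```python
import io
import sys

# A serverless worker has no interactive stdin, so the confirmation
# prompt that transformers raises for custom model code cannot be
# answered. Simulate a container with no terminal attached:
sys.stdin = io.StringIO("")

try:
    input("Do you wish to run the custom code? [y/N] ")
except EOFError:
    # This is the "EOFError: EOF when reading a line" seen in the log;
    # transformers then re-raises it as the ValueError below.
    print("EOFError: EOF when reading a line")
```

This is why `trust_remote_code=True` must be passed programmatically in non-interactive environments: there is no way to answer the prompt.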

The first line of the log clearly shows `trust_remote_code: True` in my engine args. I passed it to the worker as an environment variable on the template, with TRUST_REMOTE_CODE set to 1.
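The traceback points at the likely root cause: `engine.py` line 17 calls `AutoTokenizer.from_pretrained(model_name)` without forwarding the flag, so `trust_remote_code: True` in the engine config never reaches the tokenizer. A hedged sketch of what a fix could look like (the `env_flag` helper and `load_tokenizer` names are illustrative, not the worker's actual code):

```python
import os

def env_flag(name: str, default: bool = False) -> bool:
    """Parse a boolean worker env var such as TRUST_REMOTE_CODE=1."""
    val = os.getenv(name)
    if val is None:
        return default
    return val.strip().lower() in ("1", "true", "yes")

def load_tokenizer(model_name: str):
    # Forward the same flag the engine config already honors, so the
    # tokenizer load does not fall back to an interactive prompt.
    from transformers import AutoTokenizer  # assumed available in the worker image
    return AutoTokenizer.from_pretrained(
        model_name,
        trust_remote_code=env_flag("TRUST_REMOTE_CODE"),
    )
```

With this, setting TRUST_REMOTE_CODE=1 on the template would apply to both the vLLM engine and the tokenizer load.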

alpayariyak commented 6 months ago

Just fixed, thanks for pointing this out! If you rely on the pre-built image, you can use runpod/worker-vllm:dev.