The log output is:
INFO: Started server process [3559242]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:6006 (Press CTRL+C to quit)
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
Setting pad_token_id to eos_token_id:128000 for open-end generation.
INFO: 2.0.1.1:59967 - "POST / HTTP/1.1" 500 Internal Server Error
ERROR: Exception in ASGI application
...
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!
GPU 7 already had a model running on it, and starting llama3 on top of that raised the tensor device mismatch shown above. The fix is to change how the CUDA device parameter is set.
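GPU memory cleanup function:
def torch_gc():
    if torch.cuda.is_available():  # check whether CUDA is available
        with torch.cuda.device(CUDA_DEVICE):  # select the CUDA device
            ...  # the rest of the body is not shown in the original post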
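The snippet above stops at the `with` line, so here is a minimal sketch of what a complete version might look like. The value of `CUDA_DEVICE`, the two cleanup calls, and the boolean return value are assumptions for illustration, not taken from the original post. The guarded import is only there so the sketch also loads on a machine without PyTorch.

```python
try:
    import torch
except ImportError:  # lets the sketch load on a machine without PyTorch
    torch = None

# Assumption: the process is launched with a single visible GPU,
# so the only valid device name inside the process is "cuda:0".
CUDA_DEVICE = "cuda:0"


def torch_gc():
    """Release cached GPU memory on CUDA_DEVICE; return True if cleanup ran."""
    if torch is None or not torch.cuda.is_available():
        return False
    with torch.cuda.device(torch.device(CUDA_DEVICE)):
        torch.cuda.empty_cache()  # hand cached, unused blocks back to the driver
        torch.cuda.ipc_collect()  # release memory held by dead CUDA IPC handles
    return True
```

Calling `torch_gc()` after each request is one common place for this kind of cleanup, though whether the original service does so is not stated.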
When deploying, launch with CUDA_VISIBLE_DEVICES=7 python3 fast_api.py
After that, the service can be called normally.
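Why this launch command helps: `CUDA_VISIBLE_DEVICES` hides every GPU except the listed ones and renumbers the remaining ones from 0, so inside the process physical card 7 is addressed as `cuda:0` and there is no `cuda:1` for tensors to land on. The sketch below illustrates that renumbering with the standard library only. The helper names `visible_gpus` and `logical_device` are hypothetical, and "physical id" here means the index CUDA would use with no masking.

```python
import os


def visible_gpus(env=None):
    """Return the GPU ids exposed to this process, in logical order.

    Returns None when CUDA_VISIBLE_DEVICES is unset (all GPUs visible).
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None
    return [tok.strip() for tok in raw.split(",") if tok.strip()]


def logical_device(physical_id, env=None):
    """Map a physical GPU id to the "cuda:N" name valid inside the process."""
    gpus = visible_gpus(env)
    if gpus is None:
        return f"cuda:{physical_id}"
    # list.index raises ValueError if the card is hidden from this process
    return f"cuda:{gpus.index(str(physical_id))}"


print(logical_device(7, {"CUDA_VISIBLE_DEVICES": "7"}))    # cuda:0
print(logical_device(5, {"CUDA_VISIBLE_DEVICES": "4,5"}))  # cuda:1
```

In practice this means any hard-coded device string in `fast_api.py` should refer to the renumbered index, not to the physical card number.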
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Loading checkpoint shards: 100%|██████████| 4/4 [00:03<00:00, 1.21it/s]
INFO: Started server process [3610169]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:6006 (Press CTRL+C to quit)
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
Setting pad_token_id to eos_token_id:128000 for open-end generation.
[2024-05-21 15:30:38] ", prompt:"你好", response:"'😊 你好!我是 Chatbot,很高兴和你交流!有什么想聊的主题或问题?