[Open] Virtual1257 opened this issue 11 months ago
I'm using TGI to load the local model CodeShell-7B-Chat, but it fails during loading. The command I used is:
sudo docker run --gpus 'all' --shm-size 1g -p 9090:80 \
  -v /home/CodeShell/WisdomShell:/data \
  --env LOG_LEVEL="info,text_generation_router=debug" \
  ghcr.nju.edu.cn/huggingface/text-generation-inference:1.0.3 \
  --model-id /data/CodeShell-7B-Chat \
  --num-shard 1 \
  --max-total-tokens 5000 \
  --max-input-length 4096 \
  --max-stop-sequences 12 \
  --trust-remote-code
The output and error messages are:
2023-10-24T01:47:14.674168Z  INFO text_generation_launcher: Args { model_id: "/data/CodeShell-7B-Chat", revision: None, validation_workers: 2, sharded: None, num_shard: Some(1), quantize: None, dtype: None, trust_remote_code: true, max_concurrent_requests: 128, max_best_of: 2, max_stop_sequences: 12, max_top_n_tokens: 5, max_input_length: 4096, max_total_tokens: 5000, waiting_served_ratio: 1.2, max_batch_prefill_tokens: 4096, max_batch_total_tokens: None, max_waiting_tokens: 20, hostname: "e2df4ceac2dc", port: 80, shard_uds_path: "/tmp/text-generation-server", master_addr: "localhost", master_port: 29500, huggingface_hub_cache: Some("/data"), weights_cache_override: None, disable_custom_kernels: false, cuda_memory_fraction: 1.0, rope_scaling: None, rope_factor: None, json_output: false, otlp_endpoint: None, cors_allow_origin: [], watermark_gamma: None, watermark_delta: None, ngrok: false, ngrok_authtoken: None, ngrok_edge: None, env: false }
2023-10-24T01:47:14.674233Z  WARN text_generation_launcher: `trust_remote_code` is set. Trusting that model `/data/CodeShell-7B-Chat` do not contain malicious code.
2023-10-24T01:47:14.685067Z  INFO download: text_generation_launcher: Starting download process.
2023-10-24T01:47:21.825629Z  INFO text_generation_launcher: Files are already present on the host. Skipping download.
2023-10-24T01:47:23.136555Z  INFO download: text_generation_launcher: Successfully downloaded weights.
2023-10-24T01:47:23.137089Z  INFO shard-manager: text_generation_launcher: Starting shard rank=0
2023-10-24T01:47:30.969269Z ERROR shard-manager: text_generation_launcher: Shard complete standard error output: rank=0
2023-10-24T01:47:30.969335Z ERROR shard-manager: text_generation_launcher: Shard process was signaled to shutdown with signal 4 rank=0
Error: ShardCannotStart
2023-10-24T01:47:31.066204Z ERROR text_generation_launcher: Shard 0 failed to start
2023-10-24T01:47:31.066262Z  INFO text_generation_launcher: Shutting down shards
Error: ShardCannotStart
My environment:
GPU: NVIDIA V100
OS: Ubuntu 20.04
Python: 3.10
Docker: 24.0.5
Has this been resolved? I'm hitting the same problem. The first time, it started fine and I used it for a while before stopping it; today it won't start again, and memory is nearly full. I've seen posts saying that with --num-shard=1 the weights are first loaded into host memory and then dispatched to the GPU — is that how it works?
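Since the shard's standard error output above is empty and the reply mentions memory being nearly full, a first step could be to check whether a stale container or process from the earlier run is still holding GPU or host memory. A minimal diagnostic sketch (the container filter string is an assumption, not from the thread):

```shell
# Check GPU memory usage and any processes still holding it
# (a leftover shard process from the previous run would show up here).
nvidia-smi

# List all containers, including stopped ones, from the TGI image
# (assumption: the image name matches the one used in the command above).
docker ps -a --filter "ancestor=ghcr.nju.edu.cn/huggingface/text-generation-inference:1.0.3"

# Check host memory, since the reply reports it is nearly full.
free -h

# If a stale container is found, remove it before relaunching:
# docker rm <container-id>
```

This only narrows down whether the ShardCannotStart is caused by exhausted memory from the earlier run; it does not explain the `signal 4` (SIGILL) itself.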