predibase / lorax

Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
https://loraexchange.ai
Apache License 2.0

can't start my local llama3 model server with docker #511

Open cheney369 opened 3 months ago

cheney369 commented 3 months ago

System Info

Running LoRAX with Docker (image ghcr.io/predibase/lorax:latest).

Reproduction

I want to start the LoRAX server with Docker. This is the shell script I use:

#!/bin/bash
# Host directory that holds the model; mounted into the container at /data.
volume="$PWD/data"

docker run --gpus all --shm-size 1g \
--name lorax -h lorax --net=host \
-v "$volume":/data \
ghcr.io/predibase/lorax:latest \
--model-id llama3-awq-int4

The llama3-awq-int4 model is in my local data directory, which the script mounts at /data. When I run the script, it tries to connect to Hugging Face to download files. I have a network issue, so it raises requests.exceptions.ConnectionError. Why doesn't it load the model from the local path?

How can I start the server from the local files instead?
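For reference, here is a sketch of a variant that might be worth trying. It assumes LoRAX keeps the behavior of the text-generation-inference launcher it is forked from, where `--model-id` may also be a local directory, so the script passes the container-side path `/data/llama3-awq-int4` instead of the bare name (the log below shows the bare name being resolved with `source: "hub"`). The `run` helper and the `RUN` variable are hypothetical additions for a dry run, and the directory layout is an assumption based on the script above.

```shell
#!/bin/bash
# Assumption: the model files are in $PWD/data/llama3-awq-int4 on the host.
volume="$PWD/data"

# Path of the model directory as seen inside the container (volume is mounted at /data).
model_path="/data/llama3-awq-int4"

# Hypothetical helper: prints the command by default, executes it only when RUN=1.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "$@"
  fi
}

run docker run --gpus all --shm-size 1g \
  --name lorax -h lorax --net=host \
  -v "$volume":/data \
  ghcr.io/predibase/lorax:latest \
  --model-id "$model_path"
```

Running the script as-is only prints the command; `RUN=1 ./run_container.sh` would execute it. Whether a quantization flag is also needed for an AWQ checkpoint is not confirmed here.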

Expected behavior

The server should start and serve the local model with Docker, without downloading from the Hugging Face Hub.

(base) ray30@ROG30:~/ai$ ./run_container.sh 
2024-06-12T07:06:54.217902Z  INFO lorax_launcher: Args { model_id: "llama3-awq-int4", adapter_id: None, source: "hub", default_adapter_source: None, adapter_source: "hub", revision: None, validation_workers: 2, sharded: None, num_shard: None, quantize: None, compile: false, speculative_tokens: None, dtype: None, trust_remote_code: false, max_concurrent_requests: 128, max_best_of: 2, max_stop_sequences: 4, max_input_length: 1024, max_total_tokens: 2048, waiting_served_ratio: 1.2, max_batch_prefill_tokens: 4096, max_batch_total_tokens: None, max_waiting_tokens: 20, max_active_adapters: 1024, adapter_cycle_time_s: 2, adapter_memory_fraction: 0.1, hostname: "lorax", port: 80, shard_uds_path: "/tmp/lorax-server", master_addr: "localhost", master_port: 29500, huggingface_hub_cache: Some("/data"), weights_cache_override: None, disable_custom_kernels: false, cuda_memory_fraction: 1.0, json_output: false, otlp_endpoint: None, cors_allow_origin: [], cors_allow_header: [], cors_expose_header: [], cors_allow_method: [], cors_allow_credentials: None, watermark_gamma: None, watermark_delta: None, ngrok: false, ngrok_authtoken: None, ngrok_edge: None, env: false, download_only: false }
2024-06-12T07:06:54.217963Z  INFO download: lorax_launcher: Starting download process.
2024-06-12T07:09:11.681682Z ERROR download: lorax_launcher: Download encountered an error: