Closed Smana closed 5 months ago
Maybe it would be better to use this volume read-only. So I would just need to make the model files available in the bucket before starting the process? Could you please guide me through the procedure for provisioning the S3 bucket?
Thanks :)
OK, I managed to do what I wanted:
Clone the model
git clone https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2
Sync the model to S3
aws s3 sync Mistral-7B-Instruct-v0.2 s3://<bucket_name>/Mistral-7B-Instruct-v0.2
Use it in the pod
text-generation-launcher --model-id=/data/Mistral-7B-Instruct-v0.2 --quantize bitsandbytes-nf4
2024-04-26T12:57:37.746280Z INFO text_generation_launcher: Args { model_id: "/data/Mistral-7B-Instruct-v0.2", revision: None, validation_workers: 2, sharded: None, num_shard: None, quantize: Some(BitsandbytesNF4), speculate: None, dtype: None, trust_remote_code: false, max_concurrent_requests: 128, max_best_of: 2, max_stop_sequences: 4, max_top_n_tokens: 5, max_input_length: 1024, max_total_tokens: 2048, waiting_served_ratio: 1.2, max_batch_prefill_tokens: 4096, max_batch_total_tokens: None, max_waiting_tokens: 20, max_batch_size: None, enable_cuda_graphs: false, hostname: "text-generation-inference-58d9869995-gxzx2", port: 80, shard_uds_path: "/tmp/text-generation-server", master_addr: "localhost", master_port: 29500, huggingface_hub_cache: Some("/data"), weights_cache_override: None, disable_custom_kernels: false, cuda_memory_fraction: 1.0, rope_scaling: None, rope_factor: None, json_output: false, otlp_endpoint: None, cors_allow_origin: [], watermark_gamma: None, watermark_delta: None, ngrok: false, ngrok_authtoken: None, ngrok_edge: None, tokenizer_config_path: None, disable_grammar_support: false, env: false }
2024-04-26T12:57:37.746720Z INFO download: text_generation_launcher: Starting download process.
2024-04-26T12:57:48.114689Z INFO text_generation_launcher: Files are already present on the host. Skipping download.
2024-04-26T12:57:50.144159Z INFO download: text_generation_launcher: Successfully downloaded weights.
2024-04-26T12:57:50.144763Z INFO shard-manager: text_generation_launcher: Starting shard rank=0
2024-04-26T12:58:00.242683Z INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=0
2024-04-26T12:58:02.873865Z ERROR shard-manager: text_generation_launcher: Shard complete standard error output: rank=0
2024-04-26T12:58:02.873894Z ERROR shard-manager: text_generation_launcher: Shard process was signaled to shutdown with signal 9 rank=0
2024-04-26T12:58:02.944252Z ERROR text_generation_launcher: Shard 0 failed to start
2024-04-26T12:58:02.944282Z INFO text_generation_launcher: Shutting down shards
Error: ShardCannotStart
I have another error that might not be related. I'm going to solve that before closing this issue.
OK, my first issue was caused by insufficient memory allocation. Now I get this error:
safetensors_rust.SafetensorError: Error while deserializing header: HeaderTooLarge
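A note on the error above, as a guess rather than a confirmed diagnosis: a safetensors file starts with an 8-byte little-endian header length, so a file that actually begins with text can be read as having an absurdly large header. One common way to end up with such files is running git clone without git-lfs installed, which leaves small Git LFS pointer files in place of the real weights. Nothing in this thread confirms that this is what happened here. The sketch below checks for that case; the helper name is_lfs_pointer is hypothetical.

```shell
# Hypothetical helper: succeeds (exit 0) if the file looks like a Git LFS
# pointer (a small text file) instead of real binary weights.
is_lfs_pointer() {
  # LFS pointer files begin with the line "version https://git-lfs.github.com/spec/v1".
  # Only the first bytes are read; -a makes grep treat binary data as text.
  head -c 100 "$1" 2>/dev/null | LC_ALL=C grep -aq '^version https://git-lfs'
}

# Example usage against the directory cloned earlier (commented out so the
# snippet has no side effects when sourced):
# for f in Mistral-7B-Instruct-v0.2/*.safetensors; do
#   is_lfs_pointer "$f" && echo "pointer, not weights: $f"
# done
```

If any file is reported, the clone did not fetch the actual weights, and syncing that directory to S3 would upload the pointers as-is.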
Well, I managed to download the model the recommended way, with huggingface-cli:
huggingface-cli download mistralai/Mistral-7B-Instruct-v0.2
aws s3 sync /home/smana/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2 s3://<bucket>/models--mistralai--Mistral-7B-Instruct-v0.2
When the pod starts I still get permission errors :/
text-generation-launcher --model-id=mistralai/Mistral-7B-Instruct-v0.2 --quantize bitsandbytes-nf4
...
2024-04-26T15:37:48.725974Z INFO text_generation_launcher: Files are already present on the host. Skipping download.
...
PermissionError: [Errno 1] Operation not permitted: '/data/models--mistralai--Mistral-7B-Instruct-v0.2/tmp_7e2fd113-2af9-4a1a-bf0e-22d328d4bc8b'
It works much better with EFS storage, but I'll leave this issue open in case someone finds a solution for the S3 mountpoint.
System Info
That's strange, because it seems that the root user is allowed to do anything. I tried creating and deleting files. The only thing it can't do right now is change existing permissions. First question: does the application run as another user? It doesn't seem to. Do you see any reason for this behavior?
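To narrow down which operations the mounted volume actually rejects, the manual checks described above (create, delete, change permissions) can be scripted. This is a minimal sketch: the function name probe_mount is hypothetical, and the rename step is an extra check that was not part of the original manual test. It reports what fails on a given mount; it does not fix anything.

```shell
# Hypothetical helper: try basic file operations in a directory and report
# which ones the underlying filesystem accepts.
probe_mount() {
  dir=$1
  f="$dir/.probe_$$"   # scratch file name, unique per shell process

  if touch "$f" 2>/dev/null; then
    echo "create: ok"
  else
    echo "create: FAILED"
    return 1             # nothing else can be tested without a file
  fi

  if chmod 600 "$f" 2>/dev/null; then echo "chmod: ok"; else echo "chmod: FAILED"; fi

  if mv "$f" "$f.renamed" 2>/dev/null; then
    echo "rename: ok"
    f="$f.renamed"
  else
    echo "rename: FAILED"
  fi

  if rm -f "$f" 2>/dev/null; then echo "delete: ok"; else echo "delete: FAILED"; fi
}

# Example usage inside the pod, assuming the volume is mounted at /data:
# probe_mount /data
```

Running it on both the S3 mountpoint and an EFS volume would show side by side which operations differ between the two.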
Expected behavior
Downloading and running the application