Closed: regexboi closed this issue 5 months ago
Could you try a larger --shm-size when you launch the Docker container? For example, --shm-size 25g.
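To confirm how much shared memory the container actually has before rebuilding, here is a quick sketch (an assumption on my part: a Linux container where /dev/shm is the shared-memory mount):

```python
import os

def fs_capacity_gib(path):
    """Return the total capacity of the filesystem at `path` in GiB."""
    st = os.statvfs(path)
    return st.f_frsize * st.f_blocks / 2**30

# Inside the container, /dev/shm reflects the --shm-size setting.
# Docker's default is only 64 MiB, which is far too small for this workload.
print(f"/dev/shm capacity: {fs_capacity_gib('/dev/shm'):.2f} GiB")
```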
Amazing, that worked, thank you so much! I used --shm-size 25g as recommended above and the build succeeded.
@regexboi could you share the checkpoint command you used?
Sure, here are all the commands I used; these are the final working ones:
```shell
python3 convert_checkpoint.py --model_dir /app/model/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/ --output_dir /app/model/mistral-trt --dtype float16 --load_by_shard
trtllm-build --checkpoint_dir /app/model/mistral-trt --output_dir /app/model/mistral-trt-engine --gpt_attention_plugin float16 --gemm_plugin float16 --max_input_len 32256
```
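For reference on whether a reported allocation is plausible, a rough back-of-envelope for the FP16 weight footprint (a sketch; parameter counts are the nominal 7B/8B figures, and this ignores the activation and workspace memory trtllm-build needs on top):

```python
def fp16_weight_gib(n_params):
    """Approximate weight memory in GiB at 2 bytes per parameter (FP16)."""
    return n_params * 2 / 2**30

# Nominal parameter counts (assumption: 7B and 8B taken at face value)
for name, n in [("Mistral-7B", 7e9), ("Llama-3-8B", 8e9)]:
    print(f"{name}: ~{fp16_weight_gib(n):.1f} GiB of FP16 weights")
```

So the weights alone are roughly 13-15 GiB in FP16; anything far beyond that points at build-time workspace rather than the model itself.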
System Info
Trying to build llama3-8b-instruct and mistral-instruct-0.2; both builds fail with OOM errors, but the amount of memory being requested seems too large:
[llama build OOM log screenshot]
[Mistral build OOM log screenshot]
Who can help?
No response
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
Expected behavior
Successful build
Actual behavior
Out of memory error
Additional notes
Could Docker be introducing the issue? Or perhaps my use of all-default build options: I looked through the docs and the defaults seemed best apart from --max_input_len, but I tried setting that to 4096 for llama and it didn't change the memory allocation at all.
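If the defaults are the culprit, one thing worth trying is capping the build-time limits that drive activation/workspace sizing. A sketch, reusing the checkpoint paths from this thread; flag availability varies by TensorRT-LLM version, so treat this as an assumption to verify against `trtllm-build --help`:

```shell
# Smaller max input length and batch size shrink the activation/workspace
# memory trtllm-build reserves, on top of the fixed FP16 weight footprint.
trtllm-build --checkpoint_dir /app/model/mistral-trt \
  --output_dir /app/model/mistral-trt-engine \
  --gpt_attention_plugin float16 --gemm_plugin float16 \
  --max_input_len 4096 --max_batch_size 1
```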