deepjavalibrary / djl

An Engine-Agnostic Deep Learning Framework in Java
https://djl.ai
Apache License 2.0

Failed to deploy Mixtral 8x7B with the latest LMI djl-tensorrtllm container on SageMaker ml.g5.48xlarge #3445

Open gsjoy8888 opened 2 months ago

gsjoy8888 commented 2 months ago

Description

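import sagemaker
from sagemaker import image_uris

# Assumed setup (not shown in the original report): default session and execution role
sess = sagemaker.Session()
role = sagemaker.get_execution_role()

# Retrieve the LMI TensorRT-LLM container image for the session's region
image_uri = image_uris.retrieve(
    framework="djl-tensorrtllm",
    region=sess.boto_session.region_name,
    version="0.29.0",
)
model = sagemaker.Model(
    image_uri=image_uri,
    role=role,
    # specify all environment variable configs in this map
    env={
        "HF_MODEL_ID": "mistralai/Mixtral-8x7B-Instruct-v0.1",
        "TENSOR_PARALLEL_DEGREE": "max",
        "OPTION_MAX_NUM_TOKENS": "8192",
        "OPTION_QUANTIZE": "awq",
        "HF_TOKEN": "hf_xNBRqleBjkvQPnDqFxxxxxxxxxxxxxxxxx",
    },
)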
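
The report does not include the deploy call itself, so the following is only a sketch of how the model above might be deployed to the instance type named in the title. The helper `build_deploy_kwargs` and the 1800-second startup timeout are assumptions for illustration, not values taken from the issue; the keyword names are parameters of `sagemaker.Model.deploy`.

```python
def build_deploy_kwargs(instance_type="ml.g5.48xlarge", startup_timeout_s=1800):
    """Hypothetical helper: collect keyword arguments for sagemaker.Model.deploy."""
    # SageMaker hosting instance types carry an "ml." prefix.
    if not instance_type.startswith("ml."):
        raise ValueError("SageMaker instance types start with 'ml.'")
    return {
        "initial_instance_count": 1,
        "instance_type": instance_type,
        # Assumed value: leaves time for model download and engine build
        # before the container health check gives up.
        "container_startup_health_check_timeout": startup_timeout_s,
    }


# Usage (needs AWS credentials and the `model` object from the snippet above):
# predictor = model.deploy(**build_deploy_kwargs())
```

Keeping the arguments in one place makes it easy to state exactly which instance type and timeout were used when reporting a failed deployment.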

Expected Behavior

(what's the expected behavior?)

Error Message

error_log.txt

logs attached