ggbetz opened this issue 1 month ago (status: Open)
Check upon issue creation:
For XX in:
Parameters:
NEXT_MODEL_PATH=<org>/<model>
NEXT_MODEL_REVISION=main
NEXT_MODEL_PRECISION=float16
MAX_LENGTH=2048
GPU_MEMORY_UTILIZATION=0.8
VLLM_SWAP_SPACE=4
ToDos:
I'm hitting out-of-memory (OOM) errors here, which is really strange, since 8 H100 GPUs should suffice. I'll look into this further...
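For anyone trying to reproduce the OOM: below is a rough sketch of how the parameters above could be turned into vLLM engine arguments. The mapping is my assumption, not taken from the evaluation pipeline itself. `vllm_kwargs_from_env` and `memory_budget_gib` are hypothetical helpers, `tensor_parallel_size=8` just mirrors the 8 GPUs mentioned above, and the 80 GiB per GPU figure is an assumed H100 variant.

```python
import os


def vllm_kwargs_from_env(env=None, tensor_parallel_size=8):
    """Collect the issue's parameters into keyword arguments for vllm.LLM.

    Hypothetical helper: the env-var-to-argument mapping is an assumption.
    """
    env = os.environ if env is None else env
    return {
        "model": env["NEXT_MODEL_PATH"],  # left as given, e.g. <org>/<model>
        "revision": env.get("NEXT_MODEL_REVISION", "main"),
        "dtype": env.get("NEXT_MODEL_PRECISION", "float16"),
        "max_model_len": int(env.get("MAX_LENGTH", "2048")),
        "gpu_memory_utilization": float(env.get("GPU_MEMORY_UTILIZATION", "0.8")),
        "swap_space": float(env.get("VLLM_SWAP_SPACE", "4")),  # GiB of CPU swap per GPU
        "tensor_parallel_size": tensor_parallel_size,  # assumed: shard across all GPUs
    }


def memory_budget_gib(n_gpus, gib_per_gpu, utilization):
    """Total GPU memory the engine may claim under the utilization cap."""
    return n_gpus * gib_per_gpu * utilization


if __name__ == "__main__":
    params = {
        "NEXT_MODEL_PATH": "<org>/<model>",
        "NEXT_MODEL_REVISION": "main",
        "NEXT_MODEL_PRECISION": "float16",
        "MAX_LENGTH": "2048",
        "GPU_MEMORY_UTILIZATION": "0.8",
        "VLLM_SWAP_SPACE": "4",
    }
    print(vllm_kwargs_from_env(params))
    # Assumes 80 GiB per H100; adjust for the actual hardware.
    print(memory_budget_gib(n_gpus=8, gib_per_gpu=80, utilization=0.8))
```

If the mapping holds, the resulting dict could be passed as `LLM(**kwargs)`. Comparing the printed budget against the model's weight size plus KV cache might help narrow down where the memory goes.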