Before
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Model Descriptor                      | HuggingFace Repo                            | Context Length | Hardware Requirements      |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| meta-llama/Llama-2-7b                 | meta-llama/Llama-2-7b                       | 4K             | 1 GPU, each >= 20GB VRAM   |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| meta-llama/Llama-2-13b                | meta-llama/Llama-2-13b                      | 4K             | 1 GPU, each >= 28GB VRAM   |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| meta-llama/Llama-2-70b                | meta-llama/Llama-2-70b                      | 4K             | 8 GPUs, each >= 20GB VRAM  |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| meta-llama/Meta-Llama-3-8B            | meta-llama/Meta-Llama-3-8B                  | 8K             | 1 GPU, each >= 20GB VRAM   |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| meta-llama/Meta-Llama-3-70B           | meta-llama/Meta-Llama-3-70B                 | 8K             | 8 GPUs, each >= 20GB VRAM  |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Meta-Llama3.1-8B                      | meta-llama/Meta-Llama-3.1-8B                | 128K           | 1 GPU, each >= 20GB VRAM   |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Meta-Llama3.1-70B                     | meta-llama/Meta-Llama-3.1-70B               | 128K           | 8 GPUs, each >= 20GB VRAM  |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Meta-Llama3.1-405B:bf16-mp8           |                                             | 128K           | 8 GPUs, each >= 120GB VRAM |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Meta-Llama3.1-405B                    | meta-llama/Meta-Llama-3.1-405B-FP8          | 128K           | 8 GPUs, each >= 70GB VRAM  |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Meta-Llama3.1-405B:bf16-mp16          | meta-llama/Meta-Llama-3.1-405B              | 128K           | 16 GPUs, each >= 70GB VRAM |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| meta-llama/Llama-2-7b-chat            | meta-llama/Llama-2-7b-chat                  | 4K             | 1 GPU, each >= 14GB VRAM   |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| meta-llama/Llama-2-13b-chat           | meta-llama/Llama-2-13b-chat                 | 4K             | 1 GPU, each >= 28GB VRAM   |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| meta-llama/Llama-2-70b-chat           | meta-llama/Llama-2-70b-chat                 | 4K             | 3 GPUs, each >= 48GB VRAM  |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| meta-llama/Meta-Llama-3-8B-Instruct   | meta-llama/Meta-Llama-3-8B-Instruct         | 8K             | 1 GPU, each >= 20GB VRAM   |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| meta-llama/Meta-Llama-3-70B-Instruct  | meta-llama/Meta-Llama-3-70B-Instruct        | 8K             | 3 GPUs, each >= 48GB VRAM  |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Meta-Llama3.1-8B-Instruct             | meta-llama/Meta-Llama-3.1-8B-Instruct       | 128K           | 1 GPU, each >= 20GB VRAM   |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Meta-Llama3.1-70B-Instruct            | meta-llama/Meta-Llama-3.1-70B-Instruct      | 128K           | 8 GPUs, each >= 20GB VRAM  |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Meta-Llama3.1-405B-Instruct:bf16-mp8  |                                             | 128K           | 8 GPUs, each >= 120GB VRAM |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Meta-Llama3.1-405B-Instruct           | meta-llama/Meta-Llama-3.1-405B-Instruct-FP8 | 128K           | 8 GPUs, each >= 70GB VRAM  |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Meta-Llama3.1-405B-Instruct:bf16-mp16 | meta-llama/Meta-Llama-3.1-405B-Instruct     | 128K           | 16 GPUs, each >= 70GB VRAM |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Llama-Guard-3-8B                      | meta-llama/Llama-Guard-3-8B                 | 128K           | 1 GPU, each >= 20GB VRAM   |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Llama-Guard-3-8B:int8-mp1             | meta-llama/Llama-Guard-3-8B-INT8            | 128K           | 1 GPU, each >= 10GB VRAM   |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Prompt-Guard-86M                      | meta-llama/Prompt-Guard-86M                 | 128K           | 1 GPU, each >= 1GB VRAM    |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
After
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Model Descriptor                      | HuggingFace Repo                            | Context Length | Hardware Requirements      |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Llama-2-7b                            | meta-llama/Llama-2-7b                       | 4K             | 1 GPU, each >= 20GB VRAM   |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Llama-2-13b                           | meta-llama/Llama-2-13b                      | 4K             | 1 GPU, each >= 28GB VRAM   |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Llama-2-70b                           | meta-llama/Llama-2-70b                      | 4K             | 3 GPUs, each >= 48GB VRAM  |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Llama-2-7b-chat                       | meta-llama/Llama-2-7b-chat                  | 4K             | 1 GPU, each >= 14GB VRAM   |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Llama-2-13b-chat                      | meta-llama/Llama-2-13b-chat                 | 4K             | 1 GPU, each >= 28GB VRAM   |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Llama-2-70b-chat                      | meta-llama/Llama-2-70b-chat                 | 4K             | 3 GPUs, each >= 48GB VRAM  |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Llama-3-8B                            | meta-llama/Meta-Llama-3-8B                  | 8K             | 1 GPU, each >= 20GB VRAM   |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Llama-3-70B                           | meta-llama/Meta-Llama-3-70B                 | 8K             | 8 GPUs, each >= 20GB VRAM  |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Llama-3-8B-Instruct                   | meta-llama/Meta-Llama-3-8B-Instruct         | 8K             | 1 GPU, each >= 20GB VRAM   |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Llama-3-70B-Instruct                  | meta-llama/Meta-Llama-3-70B-Instruct        | 8K             | 3 GPUs, each >= 48GB VRAM  |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Meta-Llama3.1-8B                      | meta-llama/Meta-Llama-3.1-8B                | 128K           | 1 GPU, each >= 20GB VRAM   |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Meta-Llama3.1-70B                     | meta-llama/Meta-Llama-3.1-70B               | 128K           | 8 GPUs, each >= 20GB VRAM  |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Meta-Llama3.1-405B:bf16-mp8           |                                             | 128K           | 8 GPUs, each >= 120GB VRAM |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Meta-Llama3.1-405B                    | meta-llama/Meta-Llama-3.1-405B-FP8          | 128K           | 8 GPUs, each >= 70GB VRAM  |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Meta-Llama3.1-405B:bf16-mp16          | meta-llama/Meta-Llama-3.1-405B              | 128K           | 16 GPUs, each >= 70GB VRAM |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Meta-Llama3.1-8B-Instruct             | meta-llama/Meta-Llama-3.1-8B-Instruct       | 128K           | 1 GPU, each >= 20GB VRAM   |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Meta-Llama3.1-70B-Instruct            | meta-llama/Meta-Llama-3.1-70B-Instruct      | 128K           | 8 GPUs, each >= 20GB VRAM  |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Meta-Llama3.1-405B-Instruct:bf16-mp8  |                                             | 128K           | 8 GPUs, each >= 120GB VRAM |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Meta-Llama3.1-405B-Instruct           | meta-llama/Meta-Llama-3.1-405B-Instruct-FP8 | 128K           | 8 GPUs, each >= 70GB VRAM  |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Meta-Llama3.1-405B-Instruct:bf16-mp16 | meta-llama/Meta-Llama-3.1-405B-Instruct     | 128K           | 16 GPUs, each >= 70GB VRAM |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Llama-Guard-3-8B                      | meta-llama/Llama-Guard-3-8B                 | 128K           | 1 GPU, each >= 20GB VRAM   |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Llama-Guard-3-8B:int8-mp1             | meta-llama/Llama-Guard-3-8B-INT8            | 128K           | 1 GPU, each >= 10GB VRAM   |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
| Prompt-Guard-86M                      | meta-llama/Prompt-Guard-86M                 | 128K           | 1 GPU, each >= 1GB VRAM    |
+---------------------------------------+---------------------------------------------+----------------+----------------------------+
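For a rough sense of where the "Hardware Requirements" figures come from, here is a minimal sketch (not part of this PR; the helper name and the overhead-free arithmetic are illustrative assumptions) that estimates per-GPU weight memory from parameter count, weight dtype, and GPU count. The `:bf16-mp8` / `:bf16-mp16` descriptor suffixes correspond to the GPU counts used below (model-parallel degree 8 vs. 16).

```python
# Illustrative estimate of per-GPU weight memory for the table above.
# Weights only -- real inference also needs KV cache and activation memory,
# which is why the table's minimums sit above these raw numbers.

BYTES_PER_PARAM = {"bf16": 2, "fp8": 1, "int8": 1}

def weight_gb_per_gpu(params_billion: float, dtype: str, num_gpus: int) -> float:
    """Weight memory per GPU (decimal GB) when weights are sharded evenly."""
    total_gb = params_billion * BYTES_PER_PARAM[dtype]  # 1B params * 1 byte = 1 GB
    return total_gb / num_gpus

# A few rows from the table, as (params_B, dtype, gpus) -> table minimum:
print(weight_gb_per_gpu(8, "bf16", 1))     # ~16.0 GB  -> "1 GPU, each >= 20GB"
print(weight_gb_per_gpu(70, "bf16", 8))    # ~17.5 GB  -> "8 GPUs, each >= 20GB"
print(weight_gb_per_gpu(405, "bf16", 8))   # ~101.2 GB -> "8 GPUs, each >= 120GB"
print(weight_gb_per_gpu(405, "fp8", 8))    # ~50.6 GB  -> "8 GPUs, each >= 70GB"
print(weight_gb_per_gpu(405, "bf16", 16))  # ~50.6 GB  -> "16 GPUs, each >= 70GB"
```

At the same sharding degree, the FP8 repo roughly halves the per-GPU footprint relative to bf16, which is why the 405B FP8 row fits on 8 GPUs at >= 70GB each while bf16 needs either >= 120GB per GPU or twice the GPUs.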