**Closed** — everettVT closed this issue 3 months ago
Never mind, I found your GPU specs in the paper. Thanks again!
> **Action generator.** We use the provided hyperparameter configuration and fine-tune Llama-2-7B and 13B across 2 NVIDIA A100 80GB GPUs, and Llama-3-8B across 4 NVIDIA A100 80GB GPUs.
>
> **Code generator.** We use the provided hyperparameter configuration and fine-tune CodeTulu-7B and DeepSeekCoder-7B-Instruct-v1.5 across 2 NVIDIA A100 80GB GPUs, and Llama-3-8B across 4 NVIDIA A100 80GB GPUs.
>
> **Math reasoner.** We use the provided hyperparameter configuration and fine-tune Tulu-2-7B and DeepSeekMath-7B-Instruct across 2 NVIDIA A100 80GB GPUs, and Llama-3-8B across 4 NVIDIA A100 80GB GPUs.
>
> **Query generator.** We use the provided hyperparameter configuration and fine-tune Llama-2-7B across 2 NVIDIA A100 80GB GPUs, and Llama-3-8B across 4 NVIDIA A100 80GB GPUs.
It seems to me that Husky would be a perfect fit for my Triton Inference Server, given its model management and concurrent model execution. I am building a personal knowledge agent and would like to benchmark Husky against proprietary models on tool use and reasoning. Do you have compute recommendations for my use case?
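For anyone trying the same setup: a minimal Triton model configuration for serving one of the Husky modules could look something like the sketch below. This is just my guess at a starting point, not anything from the paper — the model name, batch size, instance count, and GPU indices are all hypothetical placeholders, and the real input/output tensor definitions depend on which backend (e.g. a Python backend wrapping the HF checkpoint) you use.

```
# config.pbtxt — hypothetical sketch for serving a Husky module on Triton
# All values below are illustrative assumptions, not from the Husky paper.
name: "husky_action_generator"   # placeholder model name
backend: "python"                # assumes a Python backend wrapping the checkpoint
max_batch_size: 8                # illustrative; tune for your GPU memory

instance_group [
  {
    count: 1         # one execution instance of this module
    kind: KIND_GPU
    gpus: [ 0, 1 ]   # mirrors the paper's 2x A100 80GB setup for the 7B models
  }
]
```

Each Husky module (action generator, code generator, math reasoner, query generator) would get its own model directory and config, and Triton's model management would let them load and serve concurrently.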
@danieljkim0118 PS: Excellent work. It's great to see Meta, UW, and AI2 working together. As a fellow Seattle-area native, it's encouraging to see this kind of development happening locally.