Open kcchu opened 1 year ago
https://github.com/facebookresearch/llama/issues/79#issuecomment-1465779961
System:
Using the above methods on a 3090 Ti 24GB:
LLaMA 13B - 30 seconds loading (with 50 GB swap), 30 seconds inference
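The need for swap follows from simple weight-size arithmetic. A rough sketch (assuming fp16 weights at 2 bytes per parameter, and ignoring activation/KV-cache overhead, which only makes things worse):

```python
# Back-of-the-envelope VRAM math: why 13B spills past a 24 GiB card
# while 7B fits. Assumes fp16 (2 bytes/param); overhead not counted.
def weight_gib(n_params_billion, bytes_per_param=2):
    return n_params_billion * 1e9 * bytes_per_param / 2**30

for n in (7, 13):
    print(f"LLaMA {n}B fp16 weights: {weight_gib(n):.1f} GiB")
# 7B  -> ~13.0 GiB (fits in 24 GiB)
# 13B -> ~24.2 GiB (weights alone exceed 24 GiB, hence the swap)
```

So on a 24 GB card the 13B weights cannot all stay resident, and loading falls back on system RAM/swap, which explains both the large swap usage and the slow load time.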