Llama 2 (and Llama-based models) time out; other chat models (Mistral and Mixtral were tested) respond fine. Below is a snippet of the Docker container log capturing the moment the request is sent from the Refact extension (VS Code) and the timeout is received at the extension.
This was installed using the `:latest` tag (note to self: never again use `:latest`). My attempt to find out which version this is:
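One way to recover which build a `:latest` container actually came from is to resolve the tag to its immutable digest, then pin future runs to that digest. This is a sketch only; the image name `smallcloud/refact_self_hosting` is an assumption and should be replaced with whatever image the container was started from:

```shell
# Resolve the locally pulled :latest tag to its content digest
# (image name is an assumption -- substitute your actual image):
docker image inspect smallcloud/refact_self_hosting:latest \
  --format '{{index .RepoDigests 0}}'

# Pin future runs to that exact digest instead of the mutable :latest tag:
docker run -d smallcloud/refact_self_hosting@sha256:<digest-from-above>
```

Pinning by digest makes the failure reproducible: anyone reading the report can pull the exact same build rather than whatever `:latest` points at today.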