Contact Details
What happened?
I made a llamafile with Phi-3.5-mini-instruct-Q4_K_L.gguf from Hugging Face, but it runs out of memory with the default 128K context window, and limiting the context window to 2K runs into a different failure. This is on Windows 11 with an AMD Radeon 890M iGPU.
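For reference, the invocation was along these lines (a sketch, not the exact command; the llamafile name here is hypothetical, and the flags are the standard llama.cpp ones that llamafile inherits: `-c` caps the context size, `-ngl` controls GPU offload):

```shell
# Default run: context defaults to the model's 128K window,
# which exhausts memory on the Radeon 890M iGPU.
./phi-3.5-mini-instruct.llamafile -ngl 999

# Limiting the context to 2K avoids the OOM but hits a
# different failure (see log output below).
./phi-3.5-mini-instruct.llamafile -c 2048 -ngl 999
```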
Version
llamafile v0.8.13
What operating system are you seeing the problem on?
Windows
Relevant log output