I've observed on macOS that while the AI is generating a reply, koboldcpp's memory sits in active memory, but as soon as the reply finishes, -all- of the active memory in use by koboldcpp is moved into swap. It is then moved back into active memory on the next reply.
As you can imagine, this is quickly going to wear out the internal storage, since every reply means another round of writes to the SSD.
I'm using the latest release, compiled on macOS with Metal support, and I'm running with -noblas and gpulayers.
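In case it helps anyone reproduce this, below is a rough sketch for logging swap usage between replies. It assumes the usual `sysctl -n vm.swapusage` output format on macOS (values such as `1640.25M`). The function names are my own and not part of koboldcpp.

```python
import re
import subprocess
import time

# Conversion factors to megabytes. The K/M/G suffixes are an assumption
# about what sysctl prints; I have only seen "M" myself.
_UNITS_TO_MB = {"K": 1 / 1024, "M": 1.0, "G": 1024.0}


def parse_swapusage(text):
    """Parse `sysctl -n vm.swapusage` output into a dict of megabyte values."""
    fields = {}
    pattern = r"(total|used|free) = ([\d.]+)([KMG])"
    for name, value, unit in re.findall(pattern, text):
        fields[name] = float(value) * _UNITS_TO_MB[unit]
    return fields


def read_swapusage():
    """Ask macOS for the current swap usage (only works on macOS)."""
    result = subprocess.run(
        ["sysctl", "-n", "vm.swapusage"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_swapusage(result.stdout)


def watch_swap(interval_seconds=5, samples=12):
    """Print used swap every few seconds so jumps between replies are visible."""
    for _ in range(samples):
        usage = read_swapusage()
        print(f"{time.strftime('%H:%M:%S')} swap used: {usage.get('used', 0.0):.1f} MB")
        time.sleep(interval_seconds)
```

Calling `watch_swap()` in a terminal while chatting should show used swap jumping after each reply finishes if the behaviour described above is happening.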