This allows for parallel inference, and I am preparing to support sequence mode using a similar method (creating a new kind of graph).
Yes, it should be perfectly safe to mix multiple ggml contexts like this. I looked into the code, and there is no requirement that graphs be created only from the current context :3