Open oldgithubman opened 3 weeks ago
This should actually be high severity
Memory limits (`rpc-server --mem`) are not working!

> Memory limits (`rpc-server --mem`) are not working!

? I know? That's what I'm saying?
There is a problem where all memory is used even if --mem is specified.

> There is a problem where all memory is used even if --mem is specified.

Awesome. /s Thanks for telling me, though.
It loads only the number of layers set with -ngl, so it crashes due to a buffer overflow.
Ideally, it would be better to change the specification so that -ngl can be set individually on the RPC-server side.

> Ideally, it would be better to change the specification so that -ngl can be set individually on the RPC-server side.

I think fixing --mem would be better. Remote servers should be as hands-off as possible, and -ngl should ideally become a --mem-style option as well. That would make far more sense than -ngl as it is now.
q.v.
I also found the way the RPC server and client deal with specifying/limiting memory on CPU/GPU resources confusing and limited, so I, too, would like to see a simple, clear means of limiting what memory (RAM/VRAM) is used on each node. IMO it would also be nicer if the model data could be loaded locally instead of uploaded over the network to the RPC servers.
Bug: [RPC] RPC apparently isn't honoring backend memory capacity et al. #8112
Feature Request: Provide a means to quantify the restriction of RAM/VRAM usage for each GPU and system RAM. #8113
Feature Request: It would be convenient and faster if users could specify that the model data used for an RPC-server instance is already available by some fast(er) means (file-system GGUF, whatever).
What happened?
I expected `backend memory: $mem MB` when I passed `--mem $mem`.
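For reference, this is the kind of invocation and startup line being described. The flag names come from this thread; the host and port values are placeholders, so check `rpc-server --help` for the actual options on your build.

```shell
# Hypothetical invocation; host/port values are placeholders.
rpc-server --host 0.0.0.0 --port 50052 --mem 8192

# Expected startup log (per this report):
#   backend memory: 8192 MB
# Observed instead: the server uses all available memory.
```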
Name and Version
What operating system are you seeing the problem on?
Windows
Relevant log output