ggerganov / llama.cpp — LLM inference in C/C++ (MIT License, 60.95k stars, 8.7k forks)
llama : NvAPI performance state change support #8116
Open

sasha0552 opened this issue 3 days ago

sasha0552 commented 3 days ago
Related: #8084
Reference implementation
TODO:
[x] Implement performance state switching functions
[ ] Call the performance state switching functions from a common code path before inference starts and after it ends
[ ] Switch only if Pascal GPU(s) are present
[x] Compile only if CUDA is enabled
[ ] Enable by default if CUDA is enabled, otherwise disable
[ ] Log performance state changes and library loading status
[ ] Synchronize pstate changes between n instances of llama.cpp on a single GPU
[ ] Clean up temporary/debug code
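One way the "common function before/after inference" item could be shaped is a reference-counted RAII guard, so that nested or concurrent inference calls within one process raise the performance state only on the first entry and restore it only on the last exit. The sketch below is not from the PR: `g_set_pstate` stands in for whatever NvAPI-backed switching function the PR implements, and the Pascal check is stubbed as a flag (a real build could test `cudaDeviceProp.major == 6` per device).

```cpp
#include <atomic>
#include <cstdio>
#include <functional>

// Hypothetical hook: a real build would call into the NvAPI-based
// switching code here. Injectable so the guard logic can be
// exercised without a GPU.
static std::function<void(int)> g_set_pstate = [](int pstate) {
    std::printf("pstate -> P%d\n", pstate);  // log pstate changes
};

// Whether any Pascal (compute capability 6.x) GPU is present.
// Stubbed as a flag here; a real implementation would query the
// CUDA device properties of each device.
static bool g_have_pascal = true;

static std::atomic<int> g_active_infers{0};

// RAII guard: force the high-performance state (P0) when the first
// inference in this process begins, and drop back to the low-power
// idle state (P8) when the last one ends.
struct pstate_guard {
    pstate_guard() {
        if (g_have_pascal && g_active_infers.fetch_add(1) == 0) {
            g_set_pstate(0);   // first active inference: go fast
        }
    }
    ~pstate_guard() {
        if (g_have_pascal && g_active_infers.fetch_sub(1) == 1) {
            g_set_pstate(8);   // last active inference: idle down
        }
    }
};
```

Inference entry points would then just start with a `pstate_guard guard;`, which also keeps the switching calls out of the hot loop itself.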
[x] I have read the contributing guidelines
Self-reported review complexity:
[ ] Low
[x] Medium
[ ] High
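For the open item about synchronizing pstate changes between n llama.cpp instances on one GPU, one cheap cross-process primitive is `flock(2)` on a per-GPU lock file: every inferring instance holds a shared lock, and an instance may lower the pstate only if it can briefly grab an exclusive lock (i.e. nobody else is inferring). This is a sketch under that assumption, not the PR's approach; the lock-file path is made up, and there is an inherent race between the exclusive-lock probe and the actual pstate call, so a real implementation might instead keep a counter guarded by the exclusive lock.

```cpp
#include <sys/file.h>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical per-GPU lock file, e.g. "/tmp/llama-pstate-gpu0.lock".
int pstate_lock_open(const char * path) {
    return open(path, O_CREAT | O_RDWR, 0666);
}

// Called when this instance starts inferring on the GPU:
// take a shared lock (shared locks never block each other).
bool infer_begin(int fd) {
    return flock(fd, LOCK_SH) == 0;
}

// Called when this instance stops inferring. Returns true if no
// other instance still holds a shared lock, i.e. it appears safe
// to drop the GPU back to a low-power pstate.
bool infer_end_may_lower(int fd) {
    if (flock(fd, LOCK_UN) != 0) return false;
    if (flock(fd, LOCK_EX | LOCK_NB) == 0) {
        flock(fd, LOCK_UN);  // nobody else active
        return true;
    }
    return false;            // another instance is still inferring
}
```

Since `flock` locks belong to the open file description, two instances (or two `open()` calls) behave as independent lock holders, which is exactly the property needed here.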