Open ngxson opened 3 months ago
Have you seen gguf-dump for printing metadata?
Or do you want something dynamic during the forward pass?
Yes, I tried gguf-py, but it does not have access to quantized types.
This could be quite fun. The web page can also generate a set of useful llama.cpp commands for that specific model (e.g. run main, server, etc.) that can be copy-pasted for convenience.
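A rough sketch of what that command generation could look like, written in Python only for illustration. The helper name `suggest_commands`, the choice of flags and the default prompt are assumptions, not anything decided in this thread; `-m`, `-p`, `-n` and `--port` are existing options of the main and server examples.

```python
import shlex


def suggest_commands(model_path: str, port: int = 8080) -> dict[str, str]:
    """Build copy-pastable llama.cpp command lines for one model file.

    Hypothetical helper: the set of commands and flags is an assumption.
    """
    model = shlex.quote(model_path)  # quote paths that contain spaces
    return {
        # -m: model file, -p: prompt, -n: number of tokens to generate
        "main": f"./main -m {model} -p 'Hello' -n 128",
        # --port: HTTP port the server listens on
        "server": f"./server -m {model} --port {port}",
    }


if __name__ == "__main__":
    for name, cmd in suggest_commands("models/my model.gguf").items():
        print(f"{name}: {cmd}")
```

The viewer could fill in `model_path` from the file it was started with, so the generated lines always match the model being inspected.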
This issue was closed because it has been inactive for 14 days since being marked as stale.
@ngxson reopen? Also, I'd like to suggest similar functionality for imatrices. Or should I open a parallel FR?
Motivation
With the recent introduction of the eval-callback example, we now have more tools for debugging when working with llama.cpp. However, one tool that I feel is missing is the ability to dump everything inside a gguf file into a human-readable (and interactive) interface.
Inspired by huggingface.js, where users can visualize the KV and list of tensors on huggingface.com, I would like to implement the same thing in llama.cpp. I find this helpful in these situations:
- [...] convert.py script when adding a new architecture
- [...] eval-callback)
The reason I can't use huggingface.js is that it is browser-based, which makes it tricky to read a huge local file. It also doesn't have access to quantized types (same for gguf-py).
Possible Implementation
Ideally, I want the implementation to be a binary named gguf-viewer that, when run, opens a web page at localhost:8080. The user can then go to the web page to explore the gguf file. It will have these sections: