SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
2.23k stars · 257 forks
Add read permission token per security requirement #1942
Closed
thuang6 closed this pull request 4 months ago
Add top-level permission: read-all
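A minimal sketch of what this change typically looks like in a GitHub Actions workflow. The file name, workflow name, and job contents below are hypothetical; only the top-level `permissions: read-all` key reflects the change described in this PR. Setting `permissions` at the workflow's top level applies to every job unless a job overrides it with its own `permissions` block.

```yaml
# .github/workflows/ci.yml  (hypothetical file name and job contents;
# only the top-level `permissions` key reflects the change in this PR)
name: CI

# Restrict the workflow's GITHUB_TOKEN to read-only access for all scopes,
# rather than relying on the broader default permissions.
permissions: read-all

on:
  pull_request:

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: echo "build and test steps go here"
```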