-
### System Info
GPU - A10
### Who can help?
@Tracin
### Information
- [X] The official example scripts
- [ ] My own modified scripts
### Tasks
- [X] An officially supported task in the `…
-
Currently, some quantized Hugging Face models save zero-points directly in the int4 datatype, like [Qwen/Qwen2-7B-Instruct-GPTQ-Int4](https://huggingface.co/Qwen/Qwen2-7B-Instruct-GPTQ-Int4) and [Qwen/Qwen2…
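
To illustrate what 4-bit zero-point storage can look like, here is a minimal sketch in plain Python. It assumes eight unsigned 4-bit values packed into each 32-bit word, lowest nibble first. That layout is an assumption for illustration only and may not match what these checkpoints actually use; `pack_int4` and `unpack_int4` are hypothetical helper names, not part of any library.

```python
def pack_int4(values, bits=4):
    """Pack unsigned `bits`-wide integers into 32-bit words, lowest bits first.

    Assumed layout (illustrative only): value i of a word occupies
    bit positions [bits * i, bits * (i + 1)).
    """
    per_word = 32 // bits
    mask = (1 << bits) - 1
    if len(values) % per_word:
        raise ValueError(f"need a multiple of {per_word} values")
    words = []
    for start in range(0, len(values), per_word):
        word = 0
        for i, v in enumerate(values[start:start + per_word]):
            if not 0 <= v <= mask:
                raise ValueError(f"value {v} does not fit in {bits} bits")
            word |= v << (bits * i)
        words.append(word)
    return words


def unpack_int4(words, bits=4):
    """Inverse of pack_int4: split each 32-bit word back into its fields."""
    per_word = 32 // bits
    mask = (1 << bits) - 1
    out = []
    for word in words:
        word &= 0xFFFFFFFF  # treat the word as unsigned 32-bit
        for i in range(per_word):
            out.append((word >> (bits * i)) & mask)
    return out


zero_points = [8, 8, 7, 9, 8, 8, 8, 8]
packed = pack_int4(zero_points)
restored = unpack_int4(packed)
```

A loader that expects a different storage convention would need to unpack and re-encode the values along these lines before using them.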
-
Weights only load failed. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution.Do it only if you get the file from a trusted so…
-
Active chatters are added to the pool, and more frequent chats increase a chatter's chance of being selected. It should favor people who are not typing only emotes.
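
A minimal sketch of how this could work, assuming each message adds to a per-user weight and emote-only messages add much less. The `ChatterPool` class, the `EMOTES` set, and the weight values are all hypothetical and only illustrate the idea:

```python
import random

# Hypothetical emote list; a real bot would get this from the chat platform.
EMOTES = {"Kappa", "PogChamp", "LUL"}


def is_emote_only(message, emotes=EMOTES):
    """Return True when every word in the message is a known emote."""
    words = message.split()
    return bool(words) and all(w in emotes for w in words)


class ChatterPool:
    """Tracks active chatters and picks one with weighted randomness."""

    def __init__(self, text_weight=1.0, emote_weight=0.1):
        self.text_weight = text_weight    # gain for a normal message
        self.emote_weight = emote_weight  # smaller gain for emote-only messages
        self.weights = {}

    def record_message(self, user, message):
        """Add the user to the pool (if new) and increase their weight."""
        gain = self.emote_weight if is_emote_only(message) else self.text_weight
        self.weights[user] = self.weights.get(user, 0.0) + gain

    def pick(self, rng=random):
        """Pick one chatter, with probability proportional to their weight."""
        if not self.weights:
            return None
        users = list(self.weights)
        return rng.choices(users, weights=[self.weights[u] for u in users], k=1)[0]


pool = ChatterPool()
pool.record_message("alice", "hello there")
pool.record_message("alice", "how is everyone")
pool.record_message("bob", "Kappa LUL")
winner = pool.pick()
```

With these example weights, `alice` is far more likely to be picked than `bob`, but emote-only chatters are not excluded entirely.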
-
https://github.com/intel/neural-compressor/blob/master/docs/source/quantization_weight_only.md#examples
How do I set `eval_func`?
https://github.com/intel/neural-compressor/blob/master/examples/3…
-
Hi, thanks for your excellent repo. I have had a lot of fun trying it.
I have a question about the attention control on the unconditional prediction only in AttentionControl (p2p_utils.py) according …
-
Please add an option to adjust GPU Weight, since my GPU only has 6 GB of VRAM.
My RTX 3060 laptop can run normal fp8 in 100-150 sec,
but it takes super long with nf4 (my GPU runs at 99% all the time an…
-
As the title says. The basic gist: if you use this option as is, you may nuke your parent bones for things like skirts, ears, or breasts.
It would be nice to have a tickbox to prevent deletion of bones with chi…
-
### System Info
- GPU Type: V100
![WhatsApp Image 2024-03-05 at 9 50 36 AM](https://github.com/NVIDIA/TensorRT-LLM/assets/24196798/e9546886-695b-482b-96d4-1d4024935d7f)
### Who can help?
@Tracin…
-
Thank you for providing the pre-trained weights B13_rn18_moco_0099_ckpt.pth.
Could you please specify the exact code entry point used to generate these weights? The codebase includes multiple data…