-
In Colab, I ran some tests quantizing some AI models.
Once quantized, the files are usually between 4 and 7 GB.
The only thing that seems to work is to move them momentarily to Google Drive and th…
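Moving a multi-GB quantized file off Colab's local disk and into a mounted Drive folder can be scripted; a minimal sketch, assuming Drive has already been mounted (e.g. via `google.colab.drive.mount('/content/drive')`) and with placeholder paths:

```python
import shutil
from pathlib import Path

def move_to_drive(src: str, dst_dir: str) -> Path:
    """Copy a large local file into a (mounted) Drive folder, then
    delete the local copy to free Colab disk space."""
    dst_dir_path = Path(dst_dir)
    dst_dir_path.mkdir(parents=True, exist_ok=True)
    dst = dst_dir_path / Path(src).name
    shutil.copy2(src, dst)   # streamed copy, does not load the file into RAM
    Path(src).unlink()       # remove the local copy once it is safely on Drive
    return dst

# Hypothetical usage in Colab (paths are placeholders):
# move_to_drive("/content/model-q4.gguf", "/content/drive/MyDrive/models")
```

`shutil.copy2` streams in chunks, so even 4–7 GB files copy without extra memory pressure; deleting the source afterwards is what frees the Colab disk.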
-
Hi, I am trying to apply torch2trt to the [FairMot model](https://github.com/ifzhang/FairMOT). It depends on an external library, DCNv2.
1) With the option fp16_mode=True, DCNv2 cannot be converted correctly and me…
-
Thank you very much for your work.
I referred to your code to modify YOLOv5. With W4A8 quantization there is nearly a 3-point accuracy loss. Have you experimented with YOLOv5?
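W4A8 means 4-bit weights and 8-bit activations, and much of a few-point accuracy drop comes from the coarse 4-bit weight rounding. A minimal sketch of symmetric per-tensor 4-bit weight quantization (names are illustrative, not from the repo) makes that rounding error visible:

```python
import numpy as np

def quantize_w4(w: np.ndarray):
    """Symmetric per-tensor 4-bit quantization: integer levels in [-8, 7]."""
    scale = np.abs(w).max() / 7.0
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((128, 128)).astype(np.float32)
q, s = quantize_w4(w)
# Mean absolute rounding error is bounded by scale/2 per element.
err = float(np.abs(w - dequantize(q, s)).mean())
```

Per-channel scales (one scale per output row instead of per tensor) usually cut this error substantially and are the first thing to check when W4A8 loses several points.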
-
Hello, I am getting an error when running the sample below.
The requested file does not exist in the original source,
so I copied and used the preprocessor_config.json file from a model in the same family.
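The workaround described, reusing `preprocessor_config.json` from a same-family checkpoint, can be scripted; a minimal sketch, with placeholder directory names:

```python
import json
from pathlib import Path

def borrow_preprocessor_config(donor_dir: str, target_dir: str) -> dict:
    """Copy preprocessor_config.json from a same-family model directory
    that has one into a checkpoint directory that is missing it."""
    cfg = json.loads((Path(donor_dir) / "preprocessor_config.json").read_text())
    (Path(target_dir) / "preprocessor_config.json").write_text(json.dumps(cfg, indent=2))
    return cfg

# Hypothetical usage (paths are placeholders):
# borrow_preprocessor_config("models/base-model", "models/my-finetune")
```

This only works when the two checkpoints genuinely share a preprocessor (same image size, normalization, tokenizer family, etc.), which is worth verifying field by field.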
…
-
Thank you for your efforts.
I'm curious to know whether there is any code or script for quantizing my own 2-bit Stable Diffusion models, rather than relying on the pre-existing model available on Goog…
-
## I fine-tuned my model on Mistral-7B-Instruct-v0.2 using QLoRA, then merged it back into the base model (I need to use vLLM). But I always get CUDA out of memory, even when I use an instance that has 48GB CPU…
-
### Your current environment
(venv-vllm-54) (base) root@I1ba088648b009018e4:/hy-tmp# nvidia-smi
Tue Aug 6 10:29:16 2024
+--------------------------------------------------------------------…
-
# Christian Mills - Training Keypoint R-CNN Models with PyTorch
Learn how to train Keypoint R-CNN models on custom datasets with PyTorch.
[https://christianjmills.com/posts/pytorch-train-keypoint-rc…
-
Hey, I'm using the MX datatypes. It seems that the aten.linear.default function has not been implemented, which causes the linear layers inside the attention layers to fail with the MX datatypes.
Can you…
-
Hey, I want to quantize my Qwen2 model, but it seems the files are not found even though it clones and installs llama.cpp correctly. When quantizing the model I get this:
```txt
python3: can't …