-
I followed the official quantization demo to train edgeai-yolov5, but the training did not converge.
Quantization training with the examples/quantization_example.py demo does converge.
Refer to quantiz…
-
We are able to download the granite model using the command below:
ilab download --repository instructlab/granite-7b-lab-GGUF --release main --filename granite-7b-lab-Q4_K_M.gguf
ilab generate is worki…
-
Hi, I tried to speed up CNN inference using TensorRT, but the following two problems occurred. Have these issues been addressed?
SuperGlue stuck with warning:
```
[W:onnxruntime:SuperGlueOnnx, …
-
Hey Guys,
This is a great library, but I have a question. Is this library able to use memory as efficiently as the Llama.cpp library? In other words, if I'm using a checkpoint that I use with Llama…
-
I noticed today that when I use `python -m mlx_lm.generate`, the output doesn't match what I get locally using `python lora.py`. For example:
Local output using lora adapters:
```
(base) Williams-MacBo…
-
Hi, I have a question about the calibration data:
In [calib_data.py](https://github1s.com/mit-han-lab/llm-awq/blob/HEAD/awq/utils/calib_data.py), you re-organize the calib data so that every batch ha…
-
So, I need to implement stepped portamento as another glide mode. The simplest implementation is to always quantize the pitch to the nearest note as the pitch changes, but I want to do a kind of "v…
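
For reference, the "simplest implementation" mentioned above could look roughly like the sketch below. It assumes pitch is expressed as a (possibly fractional) MIDI note number and that the glide is linear in note space; the function names `glide_pitch`, `stepped_glide_pitch`, and `midi_to_hz` are hypothetical and not taken from any particular synth codebase.

```python
import math


def glide_pitch(start_note: float, target_note: float, progress: float) -> float:
    """Continuous linear glide between two MIDI pitches.

    `progress` is the glide position, clamped to [0, 1] (assumed convention).
    """
    progress = min(max(progress, 0.0), 1.0)
    return start_note + (target_note - start_note) * progress


def stepped_glide_pitch(start_note: float, target_note: float, progress: float) -> float:
    """Stepped glide: the continuous glide snapped to the nearest semitone.

    floor(x + 0.5) is used instead of round() so that exact halves
    always round up rather than to the nearest even number.
    """
    continuous = glide_pitch(start_note, target_note, progress)
    return float(math.floor(continuous + 0.5))


def midi_to_hz(note: float) -> float:
    """Convert a MIDI note number to frequency, with A4 (note 69) at 440 Hz."""
    return 440.0 * 2.0 ** ((note - 69.0) / 12.0)


if __name__ == "__main__":
    # Glide from C4 (60) to E4 (64) in ten steps and print the stepped pitch.
    for i in range(11):
        note = stepped_glide_pitch(60.0, 64.0, i / 10.0)
        print(i, note, midi_to_hz(note))
```

Because the snapping is applied to the output of the continuous glide, the glide time and curve stay the same as in the normal portamento mode; only the audible pitch moves in semitone steps.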
-
Hi there! Thanks for this amazing library. I was able to run a 70B model on my M2 MacBook Pro!
I was able to get about one token every 100 seconds, which is almost good enough for my overnight task…
-
I am trying to run the code in the usage part of the README file.
`python imagenet.py --gpu 0,1,2,3 --data /home/bcrc/Datasets/imagenet --mode pre .......`
However, I encountered a 'core dump' error f…
-
Hi, I'm using `BitsAndBytesConfig` from HF's Transformers library to quantize the `facebook/opt-66B` model. But when I print the dtype of the weights of various layers, all of them turn out to be `int8`.
…