-
**Dynamically quantized ALBERT model shows poor performance**
When I quantize a fine-tuned albert-base-v2 model, the F1 score drops to 0.05, whereas the non-quantized version scores 0.8.
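For context, here is a minimal sketch of how dynamic quantization is typically applied to such a model with PyTorch; the checkpoint path and the question-answering head are assumptions, not the reporter's exact setup:
```python
import torch
from transformers import AlbertForQuestionAnswering  # assumed task head for an F1-scored task

# Placeholder path: stands in for the fine-tuned albert-base-v2 checkpoint.
model = AlbertForQuestionAnswering.from_pretrained("path/to/finetuned-albert-base-v2")
model.eval()

# Dynamic quantization: nn.Linear weights are stored as int8, while activations
# stay in float and are quantized on the fly at inference time.
quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
```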
**System informat…
-
To reproduce
```python
from torchao import quantize_
from torchao.quantization import int8_weight_only
from torch import nn
import torch

# A plain linear layer, then in-place int8 weight-only quantization via torchao.
linear = nn.Linear(1024, 1024)
quantize_(linear, int8_weight_only())
```
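As a follow-up check (not part of the original report), one might compare the quantized layer's output against a float copy to quantify the drift introduced by int8 weight-only quantization; the input shape and metric below are arbitrary choices:
```python
import copy
import torch
from torch import nn
from torchao import quantize_
from torchao.quantization import int8_weight_only

torch.manual_seed(0)
float_linear = nn.Linear(1024, 1024)
quant_linear = copy.deepcopy(float_linear)
quantize_(quant_linear, int8_weight_only())  # quantize the copy in place

x = torch.randn(4, 1024)
with torch.no_grad():
    ref = float_linear(x)
    out = quant_linear(x)

# Mean squared error between the float and int8 weight-only outputs.
print("MSE:", torch.mean((ref - out) ** 2).item())
```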
-
Hi David,
About "Quantization aware training": do you have a schedule for it?
I need this feature and would like to work on it; could you give some suggestions? Thanks.
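In case a concrete starting point helps the discussion, below is a minimal eager-mode QAT sketch using PyTorch's built-in torch.ao.quantization workflow; the toy model, backend choice, and training loop are illustrative assumptions, not a recommendation for any particular schedule:
```python
import torch
from torch import nn
import torch.ao.quantization as tq

class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = tq.QuantStub()      # marks where tensors enter the quantized region
        self.fc1 = nn.Linear(16, 32)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(32, 4)
        self.dequant = tq.DeQuantStub()  # marks where tensors leave the quantized region

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.fc1(x))
        x = self.fc2(x)
        return self.dequant(x)

model = ToyModel().train()
model.qconfig = tq.get_default_qat_qconfig("fbgemm")  # fake-quant observers for weights/activations
qat_model = tq.prepare_qat(model, inplace=False)

# Short dummy training loop so the fake-quant/observer statistics get updated.
opt = torch.optim.SGD(qat_model.parameters(), lr=1e-2)
for _ in range(10):
    x = torch.randn(8, 16)
    loss = qat_model(x).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# Convert the fake-quantized model to an actual int8 model for inference.
qat_model.eval()
int8_model = tq.convert(qat_model)
print(int8_model(torch.randn(2, 16)).shape)
```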
-
```python
from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_da…
```
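The arguments are cut off above; a hedged reconstruction of how such a call is usually completed in the Unsloth notebooks follows, with the dataset, text field, and hyperparameters being placeholder assumptions rather than the original poster's configuration:
```python
# Hedged sketch: continues the snippet above; values below are illustrative assumptions.
trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,             # assumed: a datasets.Dataset with a "text" column
    dataset_text_field = "text",
    max_seq_length = 2048,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        max_steps = 60,
        learning_rate = 2e-4,
        fp16 = not is_bfloat16_supported(),  # this is why is_bfloat16_supported was imported
        bf16 = is_bfloat16_supported(),
        logging_steps = 1,
        output_dir = "outputs",
    ),
)
trainer.train()
```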
-
Hello,
Following the issue here https://github.com/evo-design/evo/issues/11, which discusses fine-tuning code for Evo, I am specifically looking for information on which frameworks could be used to opti…
-
Investigate opportunities and costs of lossy packing and lossless compression of data.
* Packing: reducing the precision of the data to quantity-specific levels of precision (see the sketch after this list)
* lossy - by defin…
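As a concrete illustration of the packing bullet above, here is a small numpy sketch of offset/scale packing of float64 values into int16 (the CF/NetCDF-style scale_factor/add_offset convention); the sample data and the 16-bit target are assumptions for illustration only:
```python
import numpy as np

# Illustrative data: e.g. temperatures in kelvin; values and size are assumptions.
data = np.random.default_rng(0).normal(loc=280.0, scale=15.0, size=100_000)

# Choose scale/offset so that the full data range maps onto the int16 range.
dmin, dmax = data.min(), data.max()
scale = (dmax - dmin) / (2**16 - 2)
offset = (dmax + dmin) / 2.0

packed = np.round((data - offset) / scale).astype(np.int16)   # the lossy step
unpacked = packed.astype(np.float64) * scale + offset          # what a reader would recover

print("bytes:", data.nbytes, "->", packed.nbytes)              # 4x smaller before any lossless compression
print("max abs error:", np.abs(unpacked - data).max())         # bounded by roughly scale / 2
```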
-
**System information**
- TensorFlow version (you are using): 2.6.0 (TFMOT 0.7.2)
- Are you willing to contribute it (Yes/No): Potentially, with some advice on how to implement it
**Motivation**…
-
I trained for 8 epochs and finally got the last .pt file.
Refer to this documentation: [Llama3 in torchtune](https://pytorch.org/torchtune/stable/tutorials/llama3.html#)
I have succeeded in Eva…
-
When building an **AWQ_int4_group_128** engine, the code [here](https://github.com/NVIDIA/TensorRT-LLM/blob/main/tensorrt_llm/models/llama/weight.py#L1362) seems to convert weights that are already …
-
### System Info
```shell
optimum: 1.5.2
python: Python 3.8.10
docker image: nvcr.io/nvidia/tensorrt:22.07-py3
```
### Who can help?
_No response_
### Information
- [X] The official example scr…