-
I keep getting this error even though the model I'm trying to convert should have the proper context size... not sure what else to do:
The model I'm trying to convert is `Yi-1.5-9B-Chat`.
`…
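For what it's worth, llama.cpp's converter derives the context length from fields like `max_position_embeddings` in the checkpoint's `config.json`, so checking that the field is actually present is a quick first step. A minimal sketch (the config dict below is a synthetic stand-in, not the real Yi-1.5-9B-Chat file):

```python
import json

# Stand-in for the model's config.json (hypothetical values; Yi models
# use the llama architecture, so the converter reads the same fields).
config = json.loads("""
{
  "model_type": "llama",
  "max_position_embeddings": 4096
}
""")

def context_length(cfg: dict) -> int:
    """Mimic the converter's lookup: first matching context-length key wins."""
    for key in ("max_position_embeddings", "n_ctx", "n_positions"):
        if key in cfg:
            return int(cfg[key])
    raise KeyError("no context-length field in config.json")

print(context_length(config))  # 4096
```

If `context_length` raises on your local `config.json`, the conversion error is coming from a genuinely missing field rather than a converter bug.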
-
Loading Vicuna-13B with 4-bit quantization via the transformers library is possible with [load_in_4bit](https://huggingface.co/docs/transformers/main_classes/quantization). How difficult could it be for Fas…
-
## 🚀 Feature
Interpreting the BART model, i.e. obtaining the same interpretability outputs we can get with BERT. I'm especially interested in word attributions and visualization for sentence classification.
## Motiv…
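Not from the request itself, but as a sketch of what word attributions compute: integrated gradients (one method libraries like Captum use for BERT-style attributions) averages the model's gradient along a straight path from a baseline to the input, then scales by the input delta. For a linear scorer the closed form is exact, which allows a tiny self-check (toy function, not a real BART model):

```python
import numpy as np

def integrated_gradients(f_grad, x, baseline, steps=50):
    """Midpoint-rule approximation of integrated gradients:
    (x - baseline) * mean of grad f along the baseline->x path."""
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack([f_grad(baseline + a * (x - baseline)) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)

# Toy scorer f(x) = w . x: its gradient is the constant w, so the
# attributions must equal w * (x - baseline) exactly.
w = np.array([0.5, -1.0, 2.0])
x = np.array([1.0, 1.0, 1.0])
baseline = np.zeros(3)

attr = integrated_gradients(lambda v: w, x, baseline)
# attr == w * (x - baseline), i.e. [0.5, -1.0, 2.0]
```

For a real encoder-decoder like BART the gradient would be taken with respect to the input embeddings, and one attribution score is produced per token.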
-
### 🐛 Describe the bug
I'm trying to follow the instructions to efficiently load Hugging Face models from [`torchtitan`'s docs for FSDP1 -> FSDP2: Meta-Device Initialization](https://github.com/pyt…
-
[GGUF](https://huggingface.co/docs/hub/en/gguf) is becoming a preferred means of distributing FLUX fine-tunes.
Transformers recently added general GGUF support and is slowly adding support …
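For anyone wiring this up, the GGUF container is easy to sanity-check: per the GGUF spec, files begin with the magic bytes `GGUF` followed by a little-endian `uint32` version. A minimal sketch using synthetic bytes rather than a real checkpoint:

```python
import struct

def read_gguf_header(data: bytes) -> int:
    """Parse the magic and version from the start of a GGUF file."""
    magic, version = struct.unpack_from("<4sI", data, 0)
    if magic != b"GGUF":
        raise ValueError(f"not a GGUF file: magic={magic!r}")
    return version

# Synthetic 8-byte header: magic + version 3 (the current spec version).
header = b"GGUF" + struct.pack("<I", 3)
print(read_gguf_header(header))  # 3
```

The metadata key-value section and tensor info follow the header; in practice one would use the `gguf` Python package rather than parsing by hand.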
-
### Feature request
I would like to request [llama.cpp](https://github.com/ggerganov/llama.cpp) as a new model backend in the transformers library.
### Motivation
llama.cpp offers:
1) Exce…
-
### Introduction
A large language model fine-tuned to fluently speak and understand native Pidgin English for natural communication across Africa.
### Description
Turaco is the first LLM developed …
-
Hi, and thanks for this great library!
I am very new to ONNX, and I am trying to include the RoBERTa tokenizer in a RoBERTa ONNX model.
As far as I understand, one can get the ONNX graph for th…
-
Excuse me, can the training process only be run through mlora? That doesn't match my installed torch and transformers versions; is there a solution?
-
Great job! Would it be possible to upstream your transformers changes (or at least provide a diff of the changes)? Long term, it isn't sustainable to run off a transformers fork. If you could provide…