-
Hi, @Haiyang-W! You have done very interesting work. However, I encountered a problem when calculating the FLOPs of the GiT model. When I run `python tools/analysis_tools/get_flops.py`, it outputs 0 F…
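For what it's worth, one way to cross-check is a hook-based counter such as fvcore: if it reports non-zero FLOPs, the problem likely lies in how `get_flops.py` traces GiT's forward path. A minimal sketch, assuming a hypothetical model builder and input shape (neither is the repo's actual script):

```python
import torch
from fvcore.nn import FlopCountAnalysis

model = build_git_model()            # hypothetical builder; substitute the repo's config/model
model.eval()
dummy = torch.randn(1, 3, 224, 224)  # assumed input shape

# fvcore traces the forward pass and sums per-op FLOP counts
flops = FlopCountAnalysis(model, dummy)
print(f"{flops.total() / 1e9:.2f} GFLOPs")
```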
-
Issue when saving `unsloth/mistral-7b-instruct-v0.3-bnb-4bit` after training, both in Kaggle and in [gguf-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo)
I have tried converting…
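For reference, the export path I would expect to work is roughly the following sketch (the model name is from the issue; the output directory and `quantization_method` are assumptions):

```python
from unsloth import FastLanguageModel

# Load the 4-bit model named in the issue
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/mistral-7b-instruct-v0.3-bnb-4bit",
    load_in_4bit=True,
)
# ... fine-tuning happens here ...

# Export merged weights to GGUF (assumed quantization method)
model.save_pretrained_gguf("mistral-gguf", tokenizer, quantization_method="q4_k_m")
```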
-
Trying to run the image encoder model, I followed the `extract_feature.sh` example: loading the pretrained .pth checkpoint `sapiens_1b_epoch_173_clean.pth` and the config `pretrain/configs/sapiens_mae/humans_300m_test…
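A quick sanity check worth running first is inspecting the checkpoint's keys against the config's architecture; a sketch (the checkpoint path is from the issue, everything else is generic PyTorch):

```python
import torch

ckpt = torch.load("sapiens_1b_epoch_173_clean.pth", map_location="cpu")
state = ckpt.get("state_dict", ckpt)  # mmengine-style checkpoints nest weights under "state_dict"
print(len(state))                     # number of parameter tensors
print(list(state)[:5])                # first few keys; these should match the config's model
```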
-
Unsloth: You have 1 CPUs. Using `safe_serialization` is 10x slower.
We shall switch to Pytorch saving, which will take 3 minutes and not 30 minutes.
To force `safe_serialization`, set it to `None` i…
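In other words, per this message, passing `safe_serialization=None` to whichever save call produced it forces safetensors output despite the single-CPU warning. A sketch (the output directory is an assumption):

```python
# Force safetensors saving even on a single-CPU machine (slow, per the warning above)
model.save_pretrained("outputs/merged", safe_serialization=None)
tokenizer.save_pretrained("outputs/merged")
```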
-
I am using Google Colab and downloaded the "Llama3-8B-1.58-100B-tokens" model,
but when I run: !python utils/convert-hf-to-gguf-bitnet.py models/Llama3-8B-1.58-100B-tokens --outtype f32
initially it star…
-
### What happened?
Hi,
When I use llama.cpp to deploy a pruned llama3.1-8b model, an unbearable performance degradation appears:
We used a structured pruning method (LLM-Pruner) to prune llama3.1-8b, w…
-
Hi, I am interested in your work, but I have a question about your medium_model.py: it seems that in your SpecformerMedium class, you didn't apply
```
mha_eig = self.mha_norm(eig)
…
```
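For context, the pre-norm residual pattern the question seems to refer to looks roughly like this sketch (attribute names assumed from the quoted line, not copied from medium_model.py):

```python
# Pre-norm self-attention over eigenvalue encodings, with a residual connection
mha_eig = self.mha_norm(eig)                      # LayerNorm applied before attention
mha_eig, _ = self.mha(mha_eig, mha_eig, mha_eig)  # multi-head self-attention
eig = eig + self.mha_dropout(mha_eig)             # residual add back onto the input
```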
-
Once everyone has migrated to nFONLL, we should remove all the branching for aFONLL and support a single FONLL implementation.
E.g., the following should keep only the `if` branch
https://g…
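Purely as an illustration of the intended cleanup (the names here are hypothetical, not the actual code at the link):

```python
# Before: branch on the FONLL variant
if variant == "nFONLL":
    theory = build_nfonll(theory_card)
else:  # aFONLL fallback, to be deleted
    theory = build_afonll(theory_card)

# After: only the nFONLL path survives
theory = build_nfonll(theory_card)
```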
-
### System Info
- `transformers` version: 4.44.2
- Platform: macOS-15.1-arm64-arm-64bit
- Python version: 3.10.14
- Huggingface_hub version: 0.23.3
- Safetensors version: 0.4.3
- Accelerate vers…
-
In [Ludicrously Fast Neural Machine Translation](https://aclanthology.org/D19-5632.pdf), they test a variety of decoder configurations for faster models.
In #174 @eu9ene showed that a larger de…