-
### Feature request
I want to add the ability to use GGUF BERT models in transformers.
Currently the library does not support this architecture. When I try to load it, I get an error TypeError: Ar…
-
A very nice feature would be the ability to download pre-trained models. To implement this, the existing `export` CLI command can be extended. The current API is:
```
mtt export \
…
```
-
Thank you for the great work!
Why do you add a shortcut to BN in RepBN? Similar to the explanation in RepVGG, is it to construct a multi-branch architecture to make the model an implicit ensemble of nu…
-
The wrong architecture for a model shouldn't cause a hang, a segfault, or an error; it should just be slow.
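
To make the expected behavior concrete, here is a minimal sketch of a dispatch that falls back to a portable path when no tuned kernel exists for the detected architecture. Every name in it (`select_kernel`, `TUNED_KERNELS`, `"fast_arch"`, both dot-product functions) is hypothetical and chosen for illustration only; it is not this project's actual dispatch code.

```python
from typing import Callable, Dict, List

Vector = List[float]
Kernel = Callable[[Vector, Vector], float]


def dot_generic(a: Vector, b: Vector) -> float:
    """Portable reference path: works on any architecture, just slowly."""
    return sum(x * y for x, y in zip(a, b))


def dot_unrolled(a: Vector, b: Vector) -> float:
    """Stand-in for an architecture-tuned kernel (hypothetical)."""
    n = min(len(a), len(b))
    main = n - n % 4
    total = 0.0
    for i in range(0, main, 4):
        total += (a[i] * b[i] + a[i + 1] * b[i + 1]
                  + a[i + 2] * b[i + 2] + a[i + 3] * b[i + 3])
    for i in range(main, n):  # leftover elements that do not fill a group of 4
        total += a[i] * b[i]
    return total


# Hypothetical registry: architecture name -> tuned kernel.
TUNED_KERNELS: Dict[str, Kernel] = {"fast_arch": dot_unrolled}


def select_kernel(arch: str) -> Kernel:
    """Return the tuned kernel for `arch`, or the generic one if none exists."""
    return TUNED_KERNELS.get(arch, dot_generic)


a, b = [1.0, 2.0, 3.0, 4.0, 5.0], [1.0] * 5
result_known = select_kernel("fast_arch")(a, b)
result_unknown = select_kernel("some_other_arch")(a, b)
```

The point of the design is that an unrecognized architecture name takes the `dot_generic` branch and still returns the same result, rather than raising or crashing.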
-
This issue is to track how to get German working and what options one needs to consider.
# dotnet-examples
https://github.com/k2-fsa/sherpa-onnx/tree/master/dotnet-examples
- [ ] keyword-spotting-…
-
Torch has support for float8 matmul kernels, and it seems like they are faster than bf16 on Ada and newer architectures. TorchAO supports training in fp8. This has been explored in a few newer optimiz…
-
### Issue Title:
**Implement Image Transformation using Cycle GAN**
### Issue Description:
In this task, the goal is to transform images into different forms using **Cycle GAN**, a type of Ge…
-
I have been using unsloth daily for quite some time now and have tried every architecture. There seems to be a problem with Gemma-based models when saving to the Hub.
The tokenizer.model file never seems to g…
-
In the `get_candidates` function of the `zerocost` branch optimizers `Bananas` and `Npenas` there is a discrepancy between how candidates in the `next_batch` are stored.
If the acquisition function…
-
Recently, I had the pleasure of reading your paper LeLaN, and I must say it is an impressive piece of work. However, I have a few questions that I would appreciate your clarification on:
1. In Figure…