-
# Description:
Hello! I appreciate the excellent work on benchmarking Performer and Longformer against the base Transformer. I’d like to propose the implementation of additional efficient Transformer…
-
### Describe the bug
When specifying `num_layers` in `SD3ControlNetModel.from_transformer`, the function raises an unexpected key(s) error. I think this is because, in `SD3ControlNetModel`…
-
A full explanation of and a minimal case for reproducing the problem can be found here: https://github.com/Lichborne/IdrisLinearityIssue/blob/main/ErrorExamples.idr
Depends only on Data.Linear.LVect…
-
*Note*: If you have a model or program that is not supported yet but should be, please use the program coverage template.
## 🐛 Bug
Let's take stablecode-completion-alpha-3b whose sequence le…
-
### 🚀 The feature, motivation and pitch
I want to use the liger-kernel fused operations in a codebase, but I do not want to depend on transformers. However, when I import the liger_kernel.tra…
-
I believe the following line of code in the notebook is incorrect:
`transformer_input_expanded = model.transformer[0].linear[0](transformer_input)[0]`
This is taking the hidden state of the MLP ('li…
-
Hello, in your paper “**Hypformer: Exploring Efficient Hyperbolic Transformer Fully in Hyperbolic Space**,” you mentioned that you implemented operations such as the **linear transformation layer, Lay…
-
### Feature request
The `bias` of the linear layers in the `qwen2` model is hard-coded as follows:
- https://github.com/huggingface/transformers/blob/85345bb439652d3f03bb4e123cef7a440f2ba95b/src/transformers/…
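A minimal sketch of what the request might look like, assuming a config-driven flag: `Qwen2Config` below is a stand-in for the real config class, and `attention_bias` is a hypothetical field name illustrating the idea, not a confirmed part of the transformers API.

```python
from dataclasses import dataclass


@dataclass
class Qwen2Config:
    # Minimal stand-in for the real config; `attention_bias` is a
    # hypothetical field illustrating the feature request.
    hidden_size: int = 1024
    attention_bias: bool = True


def projection_bias_flags(config: Qwen2Config) -> dict:
    # Instead of hard-coding bias=True in the model code, each
    # attention projection would read the flag from the config.
    return {
        "q_proj": config.attention_bias,
        "k_proj": config.attention_bias,
        "v_proj": config.attention_bias,
        "o_proj": False,  # the output projection stays bias-free
    }


print(projection_bias_flags(Qwen2Config(attention_bias=False)))
```

With the default config this reproduces the current hard-coded behavior, while `attention_bias=False` lets users drop the biases without patching the model source.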
-
Some Flux LoRAs are working, but others refuse to load properly (even though they work fine in ComfyUI). Around 50% of the LoRAs I've tested were failing.
There is no crash, but I get lots of `[W…
-
Hi neuralmagic team!
Very nice work with AutoFP8! We were thinking of integrating AutoFP8 into transformers, so that users can run your checkpoints directly with transformers! We would simply rep…