-
**Describe the bug**
When using the Electra model for POS tagging on sequences longer than a certain number of tokens (which varies with the language), the forward method of the Electra model throws an incompatible …
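A minimal sketch of the kind of call that can trigger this, assuming a Hugging Face `ElectraForTokenClassification` checkpoint and that the failure appears once the tokenized input exceeds the model's `max_position_embeddings` (512 for the base checkpoints); the checkpoint name and label count below are placeholders, not taken from the original report:

```python
import torch
from transformers import ElectraTokenizerFast, ElectraForTokenClassification

# Placeholder checkpoint; the original report uses a language-specific POS model.
model_name = "google/electra-small-discriminator"
tokenizer = ElectraTokenizerFast.from_pretrained(model_name)
model = ElectraForTokenClassification.from_pretrained(model_name, num_labels=17)

# Deliberately longer than the 512 positions the checkpoint was trained with.
text = "word " * 600
inputs = tokenizer(text, return_tensors="pt")  # no truncation, to reproduce the issue

with torch.no_grad():
    outputs = model(**inputs)  # the forward pass is where the size mismatch is raised
```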
-
Running predict on a model containing an Attention layer causes a RuntimeError due to a dimension issue.
- Keras 3.6.0 (issue occurs with 3.5.0 too)
- Backend is Torch with GPU support (2.5.1+cu12…
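A minimal sketch of a model of this shape (layer sizes are assumptions, not taken from the report); with the Torch backend, the `predict` call below is where the reported RuntimeError would surface:

```python
import numpy as np
import keras
from keras import layers

# Tiny functional model containing an Attention layer; shapes are illustrative only.
query_in = keras.Input(shape=(8, 16))
value_in = keras.Input(shape=(8, 16))
attended = layers.Attention()([query_in, value_in])  # query/value pair
output = layers.Dense(1)(attended)
model = keras.Model(inputs=[query_in, value_in], outputs=output)

query = np.random.rand(4, 8, 16).astype("float32")
value = np.random.rand(4, 8, 16).astype("float32")
preds = model.predict([query, value])  # RuntimeError reportedly raised here on torch + GPU
```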
-
### 🐛 Describe the bug
```python
import torch
from torch import nn, Tensor
from torch.export import export_for_inference, Dim
from torch.nn.attention.flex_attention import flex_attention
class…
-
Thank you for developing this!
## Context
Due to lengthy computation time, and in order to speed things up, I thought about using `flash_attention_2` and a smaller floating-point type, `torch.float16`…
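For reference, this is how the two options are typically combined when loading a transformers model; the checkpoint name is a placeholder, and the `flash-attn` package plus a supported GPU are required for `flash_attention_2` to be available:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your/checkpoint"  # placeholder, not the model from this report
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,                # smaller floating-point type
    attn_implementation="flash_attention_2",  # needs flash-attn installed
    device_map="auto",
)
```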
-
### 🐛 Describe the bug
error: https://gist.github.com/xmfan/7374fab55bdf73ba2501de15dd9de709
```
ValueError: The following `model_kwargs` are not used by the model: ['bos_token_id', 'pad_token_id…
-
Development machine: Ubuntu 20.04, MNN 3.0.0
Model (Hugging Face): Qwen2.5-0.5B-Instruct and Qwen2.5-0.5B-Instruct-GPTQ-Int8
## Exporting the ONNX model
$ python mnn/transformers/llm/export/llmexport.py --path pretrained_model/Qwen2.5…
-
### Your current environment
```text
The output of `python collect_env.py`
```
### How would you like to use vllm
I need to extend the context length of the gemma2-9b model, along with other mo…
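As a point of reference (not from the issue itself), the serving-side knob in vLLM is `max_model_len`; extending the context past gemma-2's native window additionally requires RoPE scaling, which is a separate, model-dependent configuration step. A minimal sketch with assumed values:

```python
from vllm import LLM, SamplingParams

# Illustrative only: max_model_len caps the context length vLLM allocates for.
# Going beyond the checkpoint's native window also needs RoPE scaling support.
llm = LLM(model="google/gemma-2-9b-it", max_model_len=8192)

outputs = llm.generate(["Hello, world"], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)
```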
-
Hi,
First, thank you very much for your work. It is a huge improvement to the DETR family.
Your paper is also really well written and clearly explained.
Also thank you for publishing your code & models, i…
-
Because I want to use the individual view models in isolation, I'm trying to build a pipeline that processes SMILES molecules into embeddings through the three view models. However, running the `model…
-
Some weights of the model checkpoint were not used when initializing CLIPTextModel:
['text_model.embeddings.position_ids']
Loading pipeline components...: 100%|█████████████████████████████████████…