-
I am trying to run this code
```python
from optimum.quanto import quantize, freeze, qint8
import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super().__…
```
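For comparison, here is a minimal runnable sketch of the same idea using only PyTorch's built-in dynamic quantization rather than optimum.quanto (the `TinyModel` below is an illustrative stand-in for the truncated `Model` above, not the original code):

```python
import torch
import torch.nn as nn

class TinyModel(nn.Module):
    # Small stand-in model with one Linear layer to quantize.
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(16, 4)

    def forward(self, x):
        return self.fc(x)

model = TinyModel().eval()
# Dynamically quantize all nn.Linear weights to int8; activations stay float.
qmodel = torch.ao.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)
out = qmodel(torch.randn(2, 16))
print(out.shape)  # torch.Size([2, 4])
```

With optimum.quanto the analogous pattern is `quantize(model, weights=qint8)` followed by `freeze(model)`, which is presumably what the truncated snippet goes on to do.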
-
Hello @VainF,
Can you please check the following? Thanks.
Pruning deit_small_patch16_224 throws the following error:
```ruby
File "/home/viplab/Deepak/Torch_Pruning_v137/examples/transf…
```
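For anyone reproducing structured magnitude pruning on a single layer, here is a minimal sketch using only PyTorch's built-in `torch.nn.utils.prune` — this is not the Torch-Pruning library's dependency-aware pruner, just the same underlying idea on one illustrative `nn.Linear`:

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# One linear layer standing in for a single ViT projection.
layer = nn.Linear(8, 8)

# Structured pruning: zero out the 50% of output rows (dim=0)
# with the smallest L2 norm (n=2).
prune.ln_structured(layer, name="weight", amount=0.5, n=2, dim=0)

row_norms = layer.weight.norm(dim=1)
print((row_norms == 0).sum().item())  # 4 of 8 rows are now zero
```

Torch-Pruning itself additionally tracks layer dependencies so that pruned channels stay consistent across the whole network, which is what the ViT example above exercises.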
-
I am trying to lower the GPT2 model to linalg IR, but I am running into errors. I have built torch-mlir from source and have installed the latest version of transformers: pip install git+https://github.co…
-
```python
try:
    import transformers
except ImportError:
    pass

from ctranslate2.specs import (
    transformer_spec,
)
from ctranslate2.converters.transformers import TransformersConver…
```
-
I found this project being discussed in the LocalLLaMA subreddit.
I read the paper but had some questions.
One question that came up and is still gnawing at me: why Transformer++ as your basis of co…
-
By saving the model and reloading it, I managed to get the model working, both quantized and at full precision (it still uses at most 10 GB of GPU RAM).
However, the model generates random characters. He…
-
**Is your feature request related to a problem? Please describe.**
Your Seq2SeqSharp project already supports LSTMs. Please consider implementing the RWKV large language model "linear attention" idea into y…
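For context, the core of RWKV-style "linear attention" is an RNN-like recurrence over decayed key/value summaries instead of a quadratic attention matrix. Below is a deliberately simplified Python sketch of that recurrence (my own illustrative simplification, not RWKV's exact WKV kernel or anything from Seq2SeqSharp):

```python
import numpy as np

def linear_attention_recurrence(keys, values, decay=0.9):
    """Simplified RWKV-flavoured recurrence: each step mixes the current
    value into an exponentially decayed running summary of the past.
    keys, values: arrays of shape (T, D). Returns outputs of shape (T, D)."""
    T, D = values.shape
    num = np.zeros(D)            # running sum of exp(k_t) * v_t
    den = np.zeros(D)            # running sum of exp(k_t)
    outputs = np.zeros((T, D))
    for t in range(T):
        w = np.exp(keys[t])
        num = decay * num + w * values[t]
        den = decay * den + w
        outputs[t] = num / (den + 1e-9)  # weighted average of past values
    return outputs

rng = np.random.default_rng(0)
out = linear_attention_recurrence(rng.normal(size=(5, 4)), rng.normal(size=(5, 4)))
print(out.shape)  # (5, 4)
```

The point for an implementation is that the state per step is O(D), so sequence length no longer enters the memory cost the way it does for standard attention.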
-
# URL
- https://arxiv.org/pdf/2002.05202
# Affiliations
- Noam Shazeer, N/A
# Abstract
- Gated Linear Units (arXiv:1612.08083) consist of the component-wise product of two linear projections, one o…
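The abstract's definition can be sketched directly: GLU(x) = (xW) ⊙ σ(xV), and the paper's variants swap the sigmoid for other activations (e.g. SwiGLU uses Swish). A minimal numpy sketch, where the weight names `W` and `V` are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def swish(x):
    return x * sigmoid(x)  # a.k.a. SiLU

def glu(x, W, V, activation=sigmoid):
    """GLU family: component-wise product of two linear projections,
    one of which is passed through an activation.
    sigmoid -> GLU, swish -> SwiGLU, identity -> the bilinear variant."""
    return (x @ W) * activation(x @ V)

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 8))
W = rng.normal(size=(8, 16))
V = rng.normal(size=(8, 16))
print(glu(x, W, V).shape)          # (2, 16) -- GLU
print(glu(x, W, V, swish).shape)   # (2, 16) -- SwiGLU
```

In the paper these gated projections replace the first linear layer of the Transformer feed-forward block.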
-
self.tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False, trust_remote_code=True)
self.model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto", torch_dtype=t…
-
### Reproduction steps
`xtuner train llava_internlm2_chat_20b_clip_vit_large_p14_336_e1_gpu8_pretrain.py`
### Config file
Only the dataset and model paths were changed.
### Run log
```
Map (num_proc=32): 100%|████████████████████████████████████████…
```