-
### Version Checks (indicate both or one)
- [ ] I have confirmed this bug exists on the latest [release](https://github.com/pypsa/pypsa/releases) of PyPSA.
- [ ] I have confirmed this bug exists on…
-
### System Info
```
(zt) root@autodl-container-7071118252-7032359d:~/test/PiPPy/examples/llama# transformers-cli env
Copy-and-paste the text below in your GitHub issue and FILL OUT the two last p…
```
-
Traceback (most recent call last):
  File "train.py", line 135, in
    test_abs(args, device_id, cp, step)
  File "E:\project\PreSumm\src\train_abstractive.py", line 215, in test_abs
    model = …
-
**Is your feature request related to a problem? Please describe.**
It would be great to be able to load a LoRA into a model compiled with `torch.compile`.
**Describe the solution you'd like.**
Do `load…
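
For reference, a minimal sketch of the workaround that works today, assuming a peft adapter attached to a transformers model (the checkpoint name and adapter path below are hypothetical): attach the LoRA first, then compile.

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Hypothetical base checkpoint and adapter path, for illustration only.
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", torch_dtype=torch.bfloat16
)
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")

# Compile only after the adapter is attached; loading a LoRA into an
# already-compiled model is what this feature request asks for.
model = torch.compile(model)
```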
-
### Describe your use-case.
Flux has layers named single_transformer_blocks.* and transformer_blocks.*.
If I want to train only the **transformer_blocks.*** layers but exclude **single_transformer…
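
A minimal sketch of one way to express this today, assuming peft's regex handling of `target_modules` (a string is matched with `re.fullmatch` against each module name) and the diffusers Flux attention projection names (`to_q`, `to_k`, `to_v`, `to_out.0`); the checkpoint id is an assumption:

```python
import torch
from diffusers import FluxPipeline
from peft import LoraConfig

# Assumed checkpoint id, for illustration.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)

lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    # re.fullmatch is anchored at the start of the module name, so this pattern
    # matches transformer_blocks.* projections but never single_transformer_blocks.*.
    target_modules=r"transformer_blocks\.\d+\..*\.(to_q|to_k|to_v|to_out\.0)",
)

# Attach the adapter to the Flux transformer only.
pipe.transformer.add_adapter(lora_config)
```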
-
### 🚀 The feature, motivation and pitch
I am trying to extract hidden states from the final layer of llama3-8b (i.e., the final `(batch_size, seq_length, n_emb)` tensor _before_ computing the logits). Wo…
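
For context, a minimal sketch of how the same tensor can be obtained with Hugging Face transformers (an assumption; this issue may target a different runtime). The last entry of `hidden_states` is the `(batch_size, seq_length, hidden_size)` tensor that feeds the LM head:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "meta-llama/Meta-Llama-3-8B"  # assumed checkpoint id
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.bfloat16)

inputs = tok("Hello world", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# Final-layer hidden states, before the lm_head projection to logits.
last_hidden = out.hidden_states[-1]  # shape: (batch_size, seq_length, hidden_size)
```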
-
### Describe the workflow you want to enable
I want to be able to use multiple estimators in one pipeline. E.g.
```python
from sklearn.pipeline import Pipeline
from sklearn.linear_model impor…
```
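
For comparison, a minimal sketch of what is already expressible, assuming scikit-learn's existing ensemble API (the estimator choices are illustrative): several estimators combined via `VotingClassifier` inside a single `Pipeline`.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, random_state=0)

# Two estimators wrapped in one pipeline via a voting ensemble.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("vote", VotingClassifier([
        ("lr", LogisticRegression(max_iter=1000)),
        ("tree", DecisionTreeClassifier(random_state=0)),
    ])),
])
pipe.fit(X, y)
print(pipe.score(X, y))
```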
-
When I load the existing pretrained model, the following error is reported: "RuntimeError: Error(s) in loading state_dict for FairModel4CIKM:
Missing key(s) in state_dict: "i_embeddings.weight", "p…
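
A minimal sketch of how such a mismatch can be inspected with plain PyTorch (the tiny stand-in model below is illustrative, not the project's `FairModel4CIKM`):

```python
import torch
import torch.nn as nn

# Stand-in model; in the report above this would be the FairModel4CIKM instance.
model = nn.Linear(4, 2)

# Checkpoint saved with different key names, to mimic the mismatch.
ckpt = {"weights": torch.zeros(2, 4)}

# strict=False reports the differing keys instead of raising RuntimeError.
result = model.load_state_dict(ckpt, strict=False)
print("missing keys:", result.missing_keys)        # e.g. ['weight', 'bias']
print("unexpected keys:", result.unexpected_keys)  # e.g. ['weights']
```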
-
**Describe the bug**
There is a mismatch between the `train.yaml` configuration file and the loaded model weights (`final.pt`) when using the Wenet pretrained model `wenetspeech_u2pp_conformer_exp`. …
-
Here is what I am getting (see below):
FP8 is slower than FP16.
For FP16, multiples of 16 make things slower than multiples of 8.
Am I missing something?
Batch_size_multiple 16 // Seqlen multi…