-
https://github.com/karpathy/minGPT/blob/37baab71b9abea1b76ab957409a1cc2fbfba8a26/mingpt/model.py#L42
Why do we need an additional linear transformation after the MHA and before the MLP when the dim…
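The projection the question points at (minGPT's `c_proj`) can be motivated with a minimal sketch, not minGPT's actual code: each attention head attends independently, so after concatenation the heads' outputs sit in disjoint slices of the embedding dimension; the final linear layer is what lets information from different heads mix before the residual add and the MLP.

```python
import torch
import torch.nn as nn

class MiniMHA(nn.Module):
    """Toy multi-head attention illustrating the role of the output projection."""

    def __init__(self, n_embd=32, n_head=4):
        super().__init__()
        self.n_head, self.hd = n_head, n_embd // n_head
        self.qkv = nn.Linear(n_embd, 3 * n_embd)
        self.c_proj = nn.Linear(n_embd, n_embd)  # the projection in question

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.qkv(x).split(C, dim=2)
        # reshape to (B, n_head, T, head_dim): heads are processed independently
        shape = (B, T, self.n_head, self.hd)
        q, k, v = (t.view(shape).transpose(1, 2) for t in (q, k, v))
        att = (q @ k.transpose(-2, -1)) / self.hd ** 0.5
        y = att.softmax(dim=-1) @ v              # per-head outputs, still separate
        y = y.transpose(1, 2).reshape(B, T, C)   # concatenation alone mixes nothing
        return self.c_proj(y)                    # cross-head mixing happens here

x = torch.randn(2, 5, 32)
out = MiniMHA()(x)
```

Without `c_proj`, each slice of the residual stream would only ever see one head's output, even though the input and output dimensions happen to match.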
-
### System Info
```
root@fb9fa1e6d8d8:/mnt/nas2/users/sbchoi/transformers/examples/pytorch/object-detection# transformers-cli env
Copy-and-paste the text below in your GitHub issue and FILL OUT…
-
Apologies if this has been asked before but I couldn't find any example that demonstrates this in a simple manner.
I have a model built in Equinox. Now, I want to use the `AdamW` optimizer where:
…
-
### 🚀 The feature, motivation and pitch
Sharing a repro for @bdhirsh and @tugsbayasgalan on the gaps in torch.compile for FSDP2 fp8 all-gather.
For FSDP2 fp8 all-gather, it's critical to pre-compute ama…
-
### Describe the bug
If `set_output` is set to `"pandas"`, `TransformedTargetRegressor` warns unnecessarily.
### Steps/Code to Reproduce
```python
import numpy as np
import pandas as pd
from skl…
-
Hi, how can I cast a float/bfloat16 tensor to FP8? I want to do W8A8 (FP8) quantization, but I couldn't find an example of quantizing activations to the FP8 format.
-
Traceback (most recent call last):
File "train.py", line 135, in <module>
test_abs(args, device_id, cp, step)
File "E:\project\PreSumm\src\train_abstractive.py", line 215, in test_abs
model = …
-
Hi developers,
Thanks for developing this great tool for annotating single cells.
I wonder whether scGPT requires a GPU on CentOS 7.9. I don't have a GPU; is it possible to use a CPU to run this scG…
-
Here is what I am getting (see below):
FP8 is slower than FP16.
For FP16, multiples of 16 make things slower than multiples of 8.
Am I missing something?
Batch_size_multiple 16 // Seqlen multi…
-
Hi, TE is really great work.
How can I use FusedRMSNorm in TE?
https://github.com/NVIDIA/apex/blob/master/apex/normalization/fused_layer_norm.py#L329
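Recent Transformer Engine releases expose a fused RMSNorm as `transformer_engine.pytorch.RMSNorm(hidden_size, eps=...)`, usable as a drop-in `nn.Module` (module path hedged against your TE version; it also needs a CUDA build). Since TE cannot run on CPU, here is a plain-PyTorch reference of the computation that module fuses:

```python
import torch

class RMSNormRef(torch.nn.Module):
    """Unfused reference of RMSNorm: scale by root-mean-square, learned gain."""

    def __init__(self, hidden_size, eps=1e-5):
        super().__init__()
        self.eps = eps
        self.weight = torch.nn.Parameter(torch.ones(hidden_size))

    def forward(self, x):
        # Normalize by the RMS over the last dim; unlike LayerNorm there is
        # no mean subtraction and no bias term.
        rms = x.pow(2).mean(dim=-1, keepdim=True).add(self.eps).rsqrt()
        return x * rms * self.weight

x = torch.randn(2, 4, 16)
y = RMSNormRef(16)(x)
```

Swapping this for the TE module (or the apex `FusedRMSNorm` linked above) should be behavior-preserving up to numerics; the fused kernels just do the reduction and scaling in one pass.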