-
### 🐛 Describe the bug
# Reproduce the bug
@mstebelev found that the memory-efficient attention kernel on float32 CUDA tensors gives NaN gradients even though the inputs and the incoming gradient are reaso…
-
Hi,
I was wondering: why use WordConv (separable convolution) in the NL encoder rather than the usual feedforward NN (as in the original transformer)? Is it mainly because separable convolution is easier to train? Did…
-
### Summary
We mention the 'feedback process' in the [Building Healthy Leadership Skills](https://the-turing-way.netlify.app/collaboration/leadership/leadership-building.html) section, but we don't…
-
Some functions do almost the same thing across these two classes of models but have slightly different syntax, arguments, and/or return types. Maybe the following functions could be made abstract and p…
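As a generic sketch of the pattern being suggested (class and method names here are hypothetical, not the project's actual API), an abstract base class can pin down one shared signature and return type while each model class keeps its own implementation:

```python
from abc import ABC, abstractmethod

class BaseModel(ABC):
    """Shared interface: every model exposes the same signature."""

    @abstractmethod
    def predict(self, data: list[float]) -> list[float]:
        """Return predictions for `data` as a plain list."""

class ModelA(BaseModel):
    # Hypothetical model: previously had its own predict() variant.
    def predict(self, data: list[float]) -> list[float]:
        return [x * 2 for x in data]

class ModelB(BaseModel):
    # Hypothetical model: previously used a differently-named method
    # with extra arguments; the wrapper normalizes it.
    def predict(self, data: list[float]) -> list[float]:
        return [x + 1 for x in data]
```

Callers can then treat both classes uniformly through `BaseModel.predict`, and the divergent syntax lives only inside each subclass.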
-
## error log
```
(bk-sdm) :~/pnnx/build/install/bin$ ./pnnx ~/diffusers-ncnn/model/unet-fp16.pt inputshape=[1,4,32,32],[1],[1,77,768]
pnnxparam = ~/diffusers_ncnn/model/unet_fp16.pnnx.par…
```
-
# What makes code DRY?
"Don't repeat yourself" (DRY) is a principle of software development aimed at reducing repetition of information that is likely to change, replacing it with abstractions th…
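A minimal illustration of the principle (the report functions are invented for the example): when a piece of knowledge, here a header layout, is duplicated, a change must be made in every copy; moving it behind one abstraction means it changes in exactly one place.

```python
# Repetitive version: the header string is duplicated, so changing the
# report layout requires editing every function that embeds it.
def sales_report(rows):
    return "=== REPORT ===\n" + "\n".join(rows)

def inventory_report(rows):
    return "=== REPORT ===\n" + "\n".join(rows)

# DRY version: the shared knowledge (the header layout) lives in one
# abstraction, and both reports are thin calls to it.
def make_report(rows, header="=== REPORT ==="):
    return header + "\n" + "\n".join(rows)

def sales_report_dry(rows):
    return make_report(rows)

def inventory_report_dry(rows):
    return make_report(rows)
```

Note that DRY is about duplicated *knowledge* (the header format), not merely similar-looking lines; two functions that coincidentally look alike but encode independent decisions should stay separate.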
-
Hi,
I have trained zipformer2 (without streaming) model with my dataset.
Training command: **./zipformer/train.py --num-epochs 40 --start-epoch 1 --use-fp16 1 --enable-musan False --exp-dir zip…
-
This code is working:
```
import torch
import pdb
from xlstm import (
xLSTMBlockStack,
xLSTMBlockStackConfig,
mLSTMBlockConfig,
mLSTMLayerConfig,
sLSTMBlockConfig,
…
-
Hey, this isn't a pressing issue for me, as I'm happy to use the other optimisers, which are working fine. With some settings I occasionally get some errors from what I guess is the Tensor extension. Be…
-
In the paper, you say "Since the original BLIP-2 models do not include checkpoints for Vicuna, we perform pre-training with Vicuna using the same procedure as BLIP-2". Does this mean InstructBLIP trai…