mlp-architecture Search Results

1000+ results for mlp-architecture

UCDvision/sima #3

Replace multi head attention in decoder

Hi, may I know whether I can use sima instead of multi-head attention in the decoder to reduce complexity? Thanks!

Mareeta26 updated 2 years ago
9
PaddlePaddle/PaddleOCR #13951

code bug while training parseq in paddleOCR

### 🔎 Search before asking - [X] I have searched the PaddleOCR [Docs](https://paddlepaddle.github.io/PaddleOCR/) and found no similar bug report. - [X] I have searched the PaddleOCR [Issues](https…

SleepEarlyLiveLong updated 2 weeks ago
1
BennyTMT/LLMsForTimeSeries #4

I am confused

Thank you for the article. I have reproduced some of the results, and I plan to present this paper at the group meeting next week. However, I have some questions. After reading it, I feel a bit confus…

nightforwar updated 3 weeks ago
1
pratyushasharma/laser #4

Mistral Support

Hi, Great work on this! Is Mistral supported? Right now I only see GPT-J and Llama 2. Thank you!

fakerybakery updated 10 months ago
16
yyf17/NavigationProject #8

CVPR 2022

CVPR 2022 # Format * **Paper Title** *Author(s)* CVPR, 2022. [[Paper]](link) [[Code]](link) [[Website]](link) To fill in: 1) Paper Title 2) Author(s) 3) the three "link" fields 4) leave one blank line between papers # agent Meta Ag…

yyf17 updated 2 years ago
1
cl-challenge/clvision #7

Training technique

The paper [ResNet Strikes back: An improved training procedure in timm](https://arxiv.org/abs/2110.00476) says that even ResNet, when trained with up-to-date training techniques, achieves results that hold up against recent models. Training techniques used for ResNet to this end: 1. Data Augm…

solangii updated 2 years ago
4
chemprop/chemprop #806

[FEATURE]: Add option to make component order in multicompon…

At the MLPDS meeting someone brought up that in multicomponent the order of components currently matters because the learned representations are concatenated. Could we add an option to make the archit…

KnathanM updated 6 months ago
13
vllm-project/llm-compressor #870

How to load compressed model with vllm?

I utilized LLMCompressor to quantize our model using the FP8-dynamic recipe. The quantized model was successfully tested using the SparseAutoModelForCausalLM method. ![image](https://github.com/use…

IEI-mjx updated 1 week ago
9
karpathy/nanoGPT #299

Is there a bidirectional- gpt model available?

Does anyone have an implementation of a bidirectional gpt model? Like bi-lstm.

win10ogod updated 8 months ago
5
long8v/PTIR #187

[168] Proximal Policy Optimization Algorithms

[paper](https://arxiv.org/pdf/1707.06347) ## TL;DR - **I read this because.. :** background knowledge - **task :** RL - **problem :** Q-learning is too unstable, and TRPO is relatively complex. A data-efficient and scalable arch…

long8v updated 2 months ago
1
