mlp-architecture Search Results

1000+ results for mlp-architecture


bigscience-workshop/Megatron-DeepSpeed #138

clone HF's `GPT2` to create `GPTMeg` with a few tiny changes…

As can be seen from https://github.com/bigscience-workshop/Megatron-DeepSpeed/pull/121 we have a divergence between Meg and HF GPT2, while using the same weights under fp16. So the proposed soluti…

stas00 updated 3 years ago
6
flojoy-ai/studio #823

RFC `Nodes` and `DataContainers` extension for supporting sc…

### Context: scikit-learn's usage and specificities While the current `Nodes` and `DataContainers` this is sufficient for most library like SciPy and NumPy which can entirely be used with free func…

jjerphan updated 1 year ago
13
andreasbinder/Point-GNN-PyTorch #3

Errors I faced while attempting run

Hi, thanks for sharing your code here. I have attempted to run below code from your [answer](https://github.com/andreasbinder/Point-GNN-PyTorch/issues/1#issuecomment-1482478162) in issue #1 : ```[…

rodroadl updated 6 months ago
1
center-for-humans-and-machines/transformer-heads #4

Question regarding the MLP Regression head architecture diag…

1. In the above image for MLP regression head, it has been shown there are two linear layers inside the regression head. Is this predefined? or can we modify the number of linear layers? if the…

ArchchanaKugathasan updated 2 months ago
2
metatensor/metatrain #336

How to train a PET model with the current metatrain version?

Hello together, I am trying to train a PET model with the current version of metatrain. I have set up a new environment and followed the installing instructions. However I do not find an input f…

bananenpampe updated 2 days ago
6
greenelab/deep-review #58

MUST-CNN: A Multilayer Shift-and-Stitch Deep Convolutional A…

https://arxiv.org/abs/1605.03004

agitter updated 6 years ago
1
csguoh/MambaIR #69

Lightweight models broken as of 15 October

As of 15 october (commit 06dc6cdd2fd87df0c4462603daa6bb6d1c43e7b3 and c308c30f1e8d81153547378012e61ab86d7e2ef4), the model architecture for the lightweight models is incompatible with the one present …

umbertov updated 1 week ago
2
kozistr/pytorch_optimizer #100

Updated Shampoo uber slow performance

I just swap out Nero optimizer in my Lightning AI loop and gave the new Shampoo a try. There is something going on with it, as this card is typically able to do 2 it per second on almost anything. Old…

redknightlois updated 1 year ago
10
pytorch/pytorch #115484

torch.compile() breaks when using DeepSpeed ZeRO Level 3 sha…

### 🐛 Describe the bug torch.compile() breaks when using DeepSpeed ZeRO Level 3 sharding. I am fine-tuning Llama 2 using the Transformers codebase, and added a `torch.compile()` decorator over the ML…

rosario-purple updated 4 months ago
7
InternLM/lmdeploy #2221

[Bug] IntrenVL2-1B abnormal inference after AWQ quantization

### Checklist - [X] 1. I have searched related issues but cannot get the expected help. - [X] 2. The bug has not been fixed in the latest version. - [X] 3. Please note that if the bug-related issue y…

Jeremy-J-J updated 1 month ago
9
