-
First of all, thank you for your great creation. Is there a version of MLP-Mixer for 3D data?
lxy51 updated
1 month ago
-
I just swapped out the Nero optimizer in my Lightning AI loop and gave the new Shampoo a try. Something is going on with it, as this card can typically do 2 it/s on almost anything. Old…
-
**Describe the bug**
The estimated model size differs from the memory usage reported by nvidia-smi.
**To Reproduce**
1. Code and command line used
2. The code will run on the cuda:2 device
```
import torch
impor…
-
Hi authors,
Thanks for your excellent work. However, I have run into some trouble reproducing the results reported in the paper. I found two points that may cause data leakage:
### 1. data le…
-
Total: 8+14+9+26+21+10+3+6+6+10+3+9+11+4+88=228
Domain | Function | Base model | Support method | Owner | Status | Expanded count | OneLab owner | OneLab public project link
-- | -- | -- | -- | -- | -- | -- | -- | --
cv | classification | EfficientNet_b0| flowvis…
-
## Issues
When `replay_gain_handler` is set to "mixer":
1. The value reported by `mpc volume` changes by itself after a song plays
2. The dB value written in the replaygain tag is quite different from that ob…
-
- https://arxiv.org/abs/2110.02095
- 2021
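Recent progress in large-scale machine learning suggests that, when data, model size, and training time are scaled up appropriately, improvements in pre-training transfer favorably to most downstream tasks.
This work studies that phenomenon systematically and shows that downstream task performance saturates as upstream accuracy increases.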
…
e4exp updated
3 years ago
-
An error occurs during text recognition inference; det and cls work normally.
RuntimeError: Error(s) in loading state_dict for BaseModel:
Missing key(s) in state_dict: "backbone.blocks.0.mid_se.conv1.weight", "backbone.blocks.0.mid_se.c…
-
A paper that studies why ViT works well.
[paper](https://arxiv.org/abs/2202.06709)
Commonly assumed reasons why MSA is good
```
Which aspect of MSA benefits the model?
==> long range dependency
Does MSA behave like conv?
==> MSA general…
-
# Paper information
- [paper](https://arxiv.org/pdf/1512.03385.pdf)
- [github](https://github.com/KaimingHe/deep-residual-networks)