-
- https://arxiv.org/abs/2103.04037
- 2021
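The Transformer architecture has brought about a fundamental change in computational linguistics, a field long dominated by recurrent neural networks.
Its success has also dramatically changed cross-modal tasks spanning language and vision, and many researchers are already working on this problem.
In this paper, this field's most important mile…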
-
I'm running finetune_onevision.sh to finetune on my dataset and I get this error:
Traceback (most recent call last):
  File "/home/ubuntu/LLaVA-NeXT/llava/train/train_mem.py", line 4, in <module>
tra…
-
- [ ] [MoAI/README.md at master · ByungKwanLee/MoAI](https://github.com/ByungKwanLee/MoAI/blob/master/README.md?plain=1)
# MoAI/README.md at master · ByungKwanLee/MoAI
## Description
![MoAI: Mixture…
-
Hi!
Let's bring the documentation to the entire Korean-speaking community 🌏 (currently 9 out of 77 complete)
Would you like to translate? Please follow the 🤗 [TRANSLATING guide](https://github.com…
-
I tried to run the demo on multiple RTX 3090 but got strange errors:
```
python3.10/site-packages/transformers/cache_utils.py", line 146, in update
self.key_cache[layer_idx] = torch.cat([self.k…
```
-
# 📜 [A Survey of Transformers](https://arxiv.org/pdf/2106.04554.pdf)
### ⚡ One-line summary
A survey paper on the transformer architecture, compiled as of June 2021.
### 🏷️ Abstract
> Transformers have achieved great success in …
-
We need to convert keras.io examples to work with Keras 3.
This involves two stages:
## Stage 1: tf.keras backwards compatibility check
Keras 3 is intended as a drop-in replacement for tf.ker…
-
### Model Series
Qwen2.5
### What are the models used?
Qwen2.5-0.5B-Instruct
### What is the scenario where the problem happened?
train Qwen2.5-0.5B-Instruct in transformers library for vision la…
-
### Problem
We want to add support for this new model, which, unlike the previous ones, also supports vision. The model's README is reproduced below:
---
language:
- en
- de
- fr
- it
- pt…
-
Is this out of scope? I hope not, would be nice to have a one-stop shop for interpretability tooling.
### Proposal
It should be easy to get the most bare-bones interpretability research off the…