-
# 📜 [A Survey of Transformers](https://arxiv.org/pdf/2106.04554.pdf)
### ⚡ One-line summary
A survey paper on Transformer architectures, compiled as of June 2021.
### 🏷️ Abstract
> Transformers have achieved great success in …
-
As titled. Thank you!
-
First, thanks for your great work!
We're now trying to replace the vision encoder in LLaVA, i.e., clip-l-336, with RADIO. Under the default LLaVA 1.5 settings, we pretrain a multimodal projection MLP a…
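For context, the LLaVA 1.5 projector this refers to is a two-layer MLP with a GELU in between, mapping vision-encoder features into the LM's embedding space. A minimal numpy sketch, where the dimensions and class name are illustrative placeholders (not RADIO's actual sizes):

```python
import numpy as np

def gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

class MLPProjector:
    """Two-layer MLP projecting vision features into the LM embedding space."""
    def __init__(self, vision_dim, lm_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.standard_normal((vision_dim, lm_dim)) * 0.02
        self.b1 = np.zeros(lm_dim)
        self.w2 = rng.standard_normal((lm_dim, lm_dim)) * 0.02
        self.b2 = np.zeros(lm_dim)

    def __call__(self, x):
        return gelu(x @ self.w1 + self.b1) @ self.w2 + self.b2

# Placeholder sizes: 576 patch tokens of dim 1024 -> a 4096-dim LM space.
proj = MLPProjector(vision_dim=1024, lm_dim=4096)
tokens = np.zeros((576, 1024))
print(proj(tokens).shape)  # (576, 4096)
```

Swapping in a different vision encoder mainly changes `vision_dim` here; the projector itself is retrained in the pretraining stage.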
-
### 🚀 The feature
Support for multiple languages (i.e., VOCABS["multilingual"]) in pretrained models.
### Motivation, pitch
It would be great to use models which support multiple languages b…
-
## ❓Question
Hi there,
I recently created a [basic implementation](https://github.com/lishicheng1996/coremltools/tree/paddle_frontend) for converting a [PaddlePaddle](https://github.com/PaddlePaddle/…
-
Will start with:
1. FILIP https://arxiv.org/abs/2111.07783
2. CLOOB https://arxiv.org/abs/2110.11316
3. https://arxiv.org/abs/2110.05208
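All three works build on an image–text contrastive objective in the CLIP family (FILIP refines it to token-level matching, CLOOB replaces it with an InfoLOOB variant). A minimal numpy sketch of the shared symmetric InfoNCE baseline, with an illustrative temperature value:

```python
import numpy as np

def clip_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings."""
    # L2-normalize, then compute pairwise cosine-similarity logits.
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature
    labels = np.arange(len(logits))  # the i-th image matches the i-th text

    def xent(l):
        # Cross-entropy with the matched pairs on the diagonal.
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # Average the image->text and text->image directions.
    return 0.5 * (xent(logits) + xent(logits.T))

rng = np.random.default_rng(0)
emb = rng.standard_normal((4, 8))
loss = clip_loss(emb, emb)  # identical pairs -> a small, nonnegative loss
```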
-
# Summary
Training a VLM from scratch while keeping NLP performance at LLM level is very difficult. Research has therefore progressed toward investigating how to train a VLM starting from a frozen pretrained language model.
### Prior research directions
1. Shallow alignmen…
-
### Description
Extremely narrow layout produced in a normal body
### (Optional:) Please add any files, screenshots, or other information here.
_No response_
### (Required) What is this issue most clo…
-
I am trying to apply SmoothQuant during W8A8 quantization of `meta-llama/Llama-3.2-11B-Vision-Instruct`, where I ignore all of the modules except for language_model. However, I find that it crashes when…
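For reference, the core SmoothQuant idea is to migrate activation outliers into the weights before quantization, via per-input-channel scales s_j = max|X_j|^α / max|W_j|^(1−α). A minimal numpy sketch of that scaling step (shapes and the α value are illustrative, not the failing setup above):

```python
import numpy as np

def smoothquant_scales(act, weight, alpha=0.5, eps=1e-8):
    """Per-input-channel scales: s_j = max|X_j|^alpha / max|W_j|^(1-alpha)."""
    act_max = np.abs(act).max(axis=0) + eps    # per-channel activation range
    w_max = np.abs(weight).max(axis=1) + eps   # per-input-channel weight range
    return act_max**alpha / w_max**(1 - alpha)

rng = np.random.default_rng(0)
X = rng.standard_normal((16, 4)); X[:, 2] *= 50   # channel 2 is an outlier
W = rng.standard_normal((4, 8))                   # linear layer: y = X @ W
s = smoothquant_scales(X, W)

# (X / s) @ (s * W) is mathematically identical to X @ W, but the smoothed
# activations X / s have a much flatter per-channel range, which makes
# 8-bit activation quantization far less lossy.
y_ref = X @ W
y_smooth = (X / s) @ (W * s[:, None])
```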
-
Thanks for sharing this amazing work. I am a novice to VLN, but am still motivated by your ideas. I notice that it is inevitable for an agent to make mistakes, which mainly come from the mismatch of sub-in…