-
Thanks for sharing this great work. I am new to VLN, but I find your ideas motivating. I notice that an agent will inevitably make mistakes, which come mainly from the mismatching of sub-in…
-
### Model description
# Escaping the Big Data Paradigm with Compact Transformers
Abstract:
> With the rise of Transformers as the standard for language processing, and their advancements in …
-
Updates from:
- https://github.com/jacobhilton/deep_learning_curriculum (focus on transformers)
- Raschka book
1. Math prerequisites
Taking a derivative to find a point of minimum or maxim…
-
https://github.com/xxxnell/how-do-vits-work/blob/8752f4e330a38877c628dfa40d57fa9404bb3131/models/convit.py#L1-L6
You said it's not the same as [ConViT by d'Ascoli, Stéphane, et al](https://arxiv.…
-
### Model description
X-Decoder is a generalized decoding pipeline that can predict pixel-level segmentation and language tokens seamlessly. X-Decoder is the first work that provides a unified way to…
-
Post questions here for this week's fundamental readings:
J. Evans and B. Desikan. 2022. “Deep Learning?” and “Deep Neural network models of text”, Thinking with Deep Learning, chapters 1 and 9
Ash…
lkcao updated
6 months ago
-
Hi!
Let's bring the documentation to the whole Korean-speaking community 🌏 (currently 9 out of 77 complete)
Would you like to help translate? Please follow the 🤗 [TRANSLATING guide](https://github.com…
-
## Why
The Machine Learning reading group is a meetup that aims to raise the level of "what engineers can solve with technology" by keeping up with the latest techniques and papers.
prev. #50
## What
If you have something you'd like to talk about, comment here!
As soon as you find something interesting, at least announce that you'll talk about it!
-
### Model description
Hi! I'm the author of ["Prismatic VLMs"](https://github.com/TRI-ML/prismatic-vlms), our upcoming ICML paper that introduces and ablates design choices of visually-conditioned …
siddk updated
4 months ago
-
# BLIP
* [paper](https://arxiv.org/abs/2201.12086)
* [code](https://github.com/salesforce/BLIP)
* [blog](https://blog.salesforceairesearch.com/blip-bootstrapping-language-image-pretraining/)
* i…