-
* https://medium.com/data-science-at-microsoft/visual-question-answering-with-multimodal-transformers-d4f57950c867
* https://www.kaggle.com/datasets/bhavikardeshna/visual-question-answering-computer-…
-
### Question
It says "Dataset date: LLaVA Visual Instruct 150K was collected in April 2023, by prompting GPT-4-0314 API."
https://huggingface.co/datasets/liuhaotian/LLaVA-Instruct-150K
Did you guys…
-
How do I understand $`E_{Sam}`$ and the corresponding $`E^{T-I}_{Sam}`$ in the paper? Is it constructed using the positional embedding in the transformer like the learnable embedding $`E_{Pos}`$ etc. …
-
NusaCatalogue: https://indonlp.github.io/nusa-catalogue/card.html?id_mm_pmd
| Dataset | id_mm_pmd |
|-------------|---|
| Description | Introduced in the FLAVA paper, Public Mu…
-
### Feature request
I'm a huge fan of the current HF Datasets `webdataset` integration (especially the built-in streaming support). However, I'd love to upload some robotics and multimodal datasets I…
-
In the experimental section of your paper, the results of single-modality methods on multimodal datasets are reported. Could you please clarify how these single-modality methods were specifically impl…
-
- ~Using L1 regularization instead of L2 to induce more sparseness/interpretability~
- ~Measure causality of final linear layer~
- Explicitly limit the residual component in PCBM-h
- Test finetuned CL…
-
Hi Marcel,
I was wondering - are you planning on combining multiple datasets or just using one multimodal dataset?
-
[kgbench: A Collection of Datasets for Multimodal and Relational Learning on Heterogeneous Knowledge](https://openreview.net/forum?id=yeK_9wxRDbA) ([pdf](https://openreview.net/pdf?id=yeK_9wxRDbA), [G…
-
Flamingo model initialized with 23461888 trainable parameters
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /home/jovyan/taoheng/work/Multimodal-GPT/mmgpt/tr…