invictus717 / MiCo

Explore the Limits of Omni-modal Pretraining at Scale
https://invictus717.github.io/MiCo/
Apache License 2.0
63 stars 3 forks

Some questions about the paper #3

Open handsomelys opened 1 week ago

handsomelys commented 1 week ago

How do I understand $E_{Sam}$ and the corresponding $E^{T-I}_{Sam}$ in the paper? Is it constructed using the positional embedding in the transformer like the learnable embedding $E_{Pos}$ etc. mentioned above?

invictus717 commented 1 week ago

Exactly. They are just randomly initialized vanilla positional embeddings.

invictus717 commented 1 week ago

These paired embeddings share the same weights, so they label the corresponding text-paired datasets.

handsomelys commented 1 week ago

> These paired embeddings share the same weights to label the corresponding text paired datasets

Thanks for the reply!

handsomelys commented 1 week ago

Thank you very much for your enthusiastic reply. I still have some questions about the pretraining objectives. Why are the two terms in Formula 4 not equivalent?

What is the value of the prediction $p_v$ in Formula 5? Is $p_v$ the direct output of an MLP layer, or does it pass through an activation function such as sigmoid?

How is the conditional causal masking mentioned in Formula 6 done? Does it mask the last 60% of all tokens and then use BERT to reconstruct the masked tokens in an autoregressive manner?

invictus717 commented 1 week ago

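Because they are computed from two matrices, one of text features and one of multimodal features. The two dot-product (similarity) matrices are transposes of each other, so one term normalizes over the rows and the other over the columns. These summations differ, especially when dealing with a huge batch size.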
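To make the row-versus-column point concrete, here is a minimal sketch in plain Python. It is not taken from the MiCo code: the similarity values are made up, and `directional_loss` is a hypothetical helper that assumes a standard softmax cross-entropy contrastive term with matched pairs on the diagonal.

```python
import math

def log_softmax(row):
    # Numerically stable log-softmax over one list of scores.
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]

def directional_loss(sim):
    # Mean cross-entropy where the positive for row i is column i.
    # The softmax normalizes over each ROW of `sim`.
    n = len(sim)
    return -sum(log_softmax(sim[i])[i] for i in range(n)) / n

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

# Made-up similarity matrix (assumption): sim[i][j] = <text_i, multimodal_j>.
sim = [
    [2.0, 0.5, 0.1],
    [1.5, 1.0, 0.2],
    [0.3, 0.4, 3.0],
]

# Text-to-multimodal term: normalizes over the rows of sim.
loss_t2m = directional_loss(sim)
# Multimodal-to-text term: same scores, but normalized over the columns.
loss_m2t = directional_loss(transpose(sim))

print(loss_t2m, loss_m2t)
```

The two terms coincide only when the similarity matrix is symmetric, which is not the case in general for a batch of text and multimodal features.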

invictus717 commented 1 week ago

The match process is provided in our released code:

https://github.com/invictus717/MiCo/blob/89c91c9dac68125a18a1a966bd80f9e74e584e80/model/mico.py#L44

invictus717 commented 1 week ago

The causal pretraining process is exactly as you say, which is intuitive and simple.
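As a rough illustration of the procedure described in the question, the sketch below splits a token sequence into a visible prefix and a masked tail (the last 60%), then lists the autoregressive reconstruction steps. The function names and the toy tokens are assumptions for illustration only; the released code may organize this differently.

```python
def split_for_causal_reconstruction(tokens, mask_ratio=0.6):
    # Mask the LAST `mask_ratio` fraction of tokens; the prefix stays visible.
    num_masked = round(len(tokens) * mask_ratio)
    split = len(tokens) - num_masked
    return tokens[:split], tokens[split:]

def autoregressive_steps(tokens, mask_ratio=0.6):
    # Each masked token is predicted from the visible prefix plus the
    # masked tokens that precede it (left-to-right reconstruction).
    visible, masked = split_for_causal_reconstruction(tokens, mask_ratio)
    steps = []
    for k, target in enumerate(masked):
        context = visible + masked[:k]
        steps.append((context, target))
    return steps

# Toy sequence of 10 tokens (assumption): 4 stay visible, 6 are masked.
tokens = ["t0", "t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8", "t9"]
visible, masked = split_for_causal_reconstruction(tokens)
steps = autoregressive_steps(tokens)

for context, target in steps:
    print(len(context), "context tokens ->", target)
```

In an actual model, the context would also include the multimodal features that condition the text decoder; they are omitted here to keep the sketch small.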

invictus717 commented 1 week ago

If you have any further questions, please feel free to reach out.

handsomelys commented 1 week ago

> If you have any further questions, please feel free to reach out.

Thanks again for your reply!