allenai / unified-io-2

Apache License 2.0

model architecture #13

Closed annopackage closed 9 months ago

annopackage commented 9 months ago

Hi, thanks for your amazing work. Both unified-io and unified-io-2 use a model with an encoder and a decoder. Could you kindly share more details on why you chose this kind of architecture? I wonder whether you have tried a decoder-only architecture, and whether the UL2 training objective is compatible with a decoder-only model design.

jiasenlu commented 9 months ago

Great question! Decoder-only models are indeed more popular and are used in most recent systems. In my personal view, the difference between encoder-decoder and decoder-only models is not that significant once you take the visual backbone into account: you can always regard the visual feature extractor (ViT) as an encoder inside a decoder-only model. Part of our effort is focused on training a generalized encoder that can take many different modalities, which is why we chose the encoder-decoder design for this project.
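
One way to picture the "ViT as an encoder inside a decoder-only model" point is through attention masks. The sketch below is an illustration only, not code from the Unified-IO 2 repository; the function names `causal_mask`, `prefix_lm_mask`, and `render` are hypothetical. It assumes the visual tokens form a prefix that attends to itself bidirectionally (as ViT patches do), while the text tokens that follow attend causally.

```python
def causal_mask(n):
    """Plain decoder-only mask: position i may attend to position j only if j <= i."""
    return [[j <= i for j in range(n)] for i in range(n)]


def prefix_lm_mask(n_prefix, n_target):
    """Mask for a prefix (e.g. visual tokens) followed by generated tokens.

    Hypothetical helper for illustration:
    - prefix rows see the whole prefix (bidirectional, like an encoder),
    - target rows see the whole prefix (like cross-attention) plus
      earlier targets and themselves (causal, like a decoder).
    """
    n = n_prefix + n_target
    return [[j < n_prefix or j <= i for j in range(n)] for i in range(n)]


def render(mask):
    """Render a mask as text rows: 'x' = may attend, '.' = masked out."""
    return ["".join("x" if allowed else "." for allowed in row) for row in mask]


if __name__ == "__main__":
    # Two prefix tokens (think: image patches) and two target tokens (think: text).
    for row in render(prefix_lm_mask(2, 2)):
        print(row)
    # prints:
    # xx..
    # xx..
    # xxx.
    # xxxx
```

Read as blocks, the top-left `xx` square plays the role of encoder self-attention, the bottom-left block plays the role of cross-attention, and the bottom-right triangle is the causal decoder. Under these assumptions the two designs differ mainly in whether the prefix and the targets share parameters, not in which tokens can see which.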