@sandeepmukh I think a few things are wrong here. First, update to the main branch.
Then, I think something like this is needed in the CoCa model to replace the current vocab_size logic between the text and multimodal text towers:
```python
if getattr(text_cfg, "hf_model_name", None) is not None:
    vocab_size = getattr(self.text, "vocab_size", text_cfg.vocab_size)
else:
    vocab_size = text_cfg.vocab_size
```
Also, the context_len used by tokenzier sources from text_cfg by default, so text_cfg and multimodal_cfg should have same context_len values in config (I think) to work best but I'm not 100% sure there.
Hi! I'm trying to train CoCa using pretrained RoBERTa weights (has the causal masking issue #445 been addressed?), but I am running into an error with the attention map sizes. Any help would be greatly appreciated :).
Below is the command I'm running:
However, this fails with the following error:
Inspecting the error, I tried changing the multimodal context length to 77, which yields the following error instead: