Yuqifan1117 / CaCao

This is the official repository for the paper "Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World" (Accepted by ICCV 2023)

EPIC seq_length #19

Open zhangjingxian1998 opened 7 months ago

zhangjingxian1998 commented 7 months ago

Regarding https://github.com/Yuqifan1117/CaCao/issues/18#issuecomment-1851747094: I have a question about this statement. After encoding the images and texts with CLIP, the output shapes are [B,50,768] and [B,77,512] respectively. How can I set their seq_length to 4 and 2?
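
For anyone reading along, here is a minimal sketch of where those two sequence lengths come from and of one generic way to shrink them. The shapes are consistent with a CLIP ViT-B/32 checkpoint at 224x224 input (49 patches plus one class token, and a text context length of 77), but that is my assumption, not something stated in the thread. The pooling step is also only an assumption: `pool_segments` and `mean_pool` are hypothetical helpers, not functions from CaCao, and the linked comment may intend something different (for example, selecting specific tokens).

```python
def vit_token_count(image_size=224, patch_size=32):
    """Tokens emitted by a ViT encoder: one per patch plus one class token."""
    patches_per_side = image_size // patch_size
    return patches_per_side * patches_per_side + 1


def pool_segments(seq_len, target_len):
    """Split range(seq_len) into target_len contiguous, near-equal segments."""
    bounds = [i * seq_len // target_len for i in range(target_len + 1)]
    return [(bounds[i], bounds[i + 1]) for i in range(target_len)]


def mean_pool(tokens, target_len):
    """Average-pool a [seq_len][dim] nested list down to [target_len][dim]."""
    pooled = []
    for start, end in pool_segments(len(tokens), target_len):
        chunk = tokens[start:end]
        # zip(*chunk) walks the feature dimension; average over the tokens.
        pooled.append([sum(column) / len(chunk) for column in zip(*chunk)])
    return pooled


# Dummy features for a single sample (B = 1), filled with constants.
image_tokens = [[1.0] * 768 for _ in range(vit_token_count())]  # 50 x 768
text_tokens = [[2.0] * 512 for _ in range(77)]                  # 77 x 512

image_short = mean_pool(image_tokens, 4)  # 4 x 768
text_short = mean_pool(text_tokens, 2)    # 2 x 512
```

With real tensors, the same idea would normally be applied along the sequence dimension of the [B, L, D] output rather than on nested lists. Whether averaging is appropriate here depends on what the model expects as input.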