Open GinnyXiao opened 1 month ago
For the explanation of `offset`, take a look at #23.
Based on that explanation, you may understand that we use a for-loop for SAM inference because the items of `feat` have different shapes and can't be concatenated into a single tensor for batch inference.
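As a rough illustration of the point above, here is a minimal pure-Python sketch. It assumes `offset` holds cumulative boundaries into a flat list of per-prompt embeddings. That reading is consistent with `batch_size == len(offset) - 1`, but this thread does not confirm it; #23 has the actual explanation. The helper `split_by_offset` and all values are hypothetical.

```python
def split_by_offset(flat_items, offset):
    """Split a flat list into per-sample groups.

    Assumes (hypothetically) that `offset` is a list of cumulative
    boundaries: group i is flat_items[offset[i]:offset[i + 1]].
    """
    return [flat_items[start:end] for start, end in zip(offset[:-1], offset[1:])]


# Hypothetical values: 3 samples with 2, 3, and 1 prompt embeddings each.
offset = [0, 2, 5, 6]
batch_size = len(offset) - 1  # one more boundary than there are groups

# Stand-ins for embeddings; real code would hold tensors here.
flat = [f"emb{i}" for i in range(offset[-1])]

feat = split_by_offset(flat, offset)
group_sizes = [len(group) for group in feat]

# The groups differ in size, so they cannot be stacked into one
# rectangular batch; each sample is handled in its own iteration.
results = []
for i in range(batch_size):
    results.append(f"sample {i}: {len(feat[i])} prompt(s)")
```

Under this assumption `len(feat)` equals `batch_size`, and the unequal group sizes are what rule out a single batched call.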
Dear authors,
I wanted to express my gratitude to you again! Your work immensely inspired me.
I was wondering if you could kindly explain the relationship between the variables `offset`, `batch_size`, and `len(feat)`? What does `offset` do, and why does `batch_size == len(offset) - 1`? Does `len(feat)` equal `batch_size`?
Also, from your code I understand that BEiT3 can process a batch of image-text inputs, but SAM 2 does not support batch processing? (You used a for-loop.) For example, can SAM support parallel processing of: