Closed ericyq closed 4 months ago
is to insert the Part CLS Token between the Global CLS Token and the Visual Tokens
Can the index 1 be changed to the last position instead? Or can the index be arbitrary?
It is not possible to slice x this way, because x[:, 0] represents the CLS token.
if args.use_div: x = torch.cat([x[:, :1], self.part_class_embedding.to(x.dtype) + torch.zeros(x.shape[0], 1, x.shape[-1], dtype=x.dtype, device=x.device), x[:, 1:]], dim=1)
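
To make the ordering concrete, here is a minimal sketch in plain Python, with a list of labels standing in for the sequence dimension (dim=1) of x. The helper `insert_part_token` and the labels are hypothetical and not part of the repository. The sketch only shows which index holds which token after the concatenation; whether the rest of the model tolerates a position other than 1 is not established in this thread.

```python
def insert_part_token(tokens, part_token, position=1):
    """Return a new sequence with part_token inserted at `position`.

    Mirrors torch.cat([x[:, :position], part, x[:, position:]], dim=1)
    along the sequence dimension, for a single unbatched sequence.
    """
    return tokens[:position] + [part_token] + tokens[position:]


# Hypothetical labels standing in for the embeddings along dim=1 of x.
tokens = ["global_cls", "patch_0", "patch_1", "patch_2"]

after_cls = insert_part_token(tokens, "part_cls", position=1)
at_end = insert_part_token(tokens, "part_cls", position=len(tokens))
at_front = insert_part_token(tokens, "part_cls", position=0)

print(after_cls)  # part token at index 1, global CLS stays at index 0
print(at_end)     # part token last, global CLS stays at index 0
print(at_front)   # global CLS is pushed to index 1
```

With position=1, which matches the slices x[:, :1] and x[:, 1:] in the line above, x[:, 0] is still the CLS token, so any code that reads the CLS token from index 0 keeps working. Inserting at position 0 would break that assumption.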