An open-source project for tracking and segmenting any object in videos, either automatically or interactively. Its primary algorithms are the Segment Anything Model (SAM) for key-frame segmentation and Associating Objects with Transformers (AOT) for efficient tracking and propagation.
In which part of the code does the tracking actually happen?
    @torch.no_grad()
    def track(self, image):
        print("Inside Main Track")
        output_height, output_width = image.shape[0], image.shape[1]
        sample = {'current_img': image}
        sample = self.transform(sample)
        image = sample[0]['current_img'].unsqueeze(0).float().cuda(self.gpu_id)
        print('calling mp')
        self.engine.match_propogate_one_frame(image)  ## calls
        pred_logit = self.engine.decode_current_logits((output_height, output_width))
Is pred_logit the tracking result here? Also, the LSTT forward pass has nothing to do with tracking, right? Is it just adding the reference? Please help.
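To make the question concrete, here is a minimal sketch of how I understand the output. It assumes that pred_logit holds one score per class (background plus each tracked object) for every pixel, so that the tracked mask for the frame is the per-pixel argmax. The function name logits_to_mask, the nested-list layout, and the toy values are hypothetical and only for illustration; the real code works on CUDA tensors and may differ.

```python
def logits_to_mask(logits):
    """Turn per-class scores into a label mask by per-pixel argmax.

    Assumed layout (hypothetical): logits[c][y][x] is the score of class c
    at pixel (y, x), with class 0 standing for the background.
    """
    num_classes = len(logits)
    height = len(logits[0])
    width = len(logits[0][0])
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the score of every class at this pixel.
            scores = [logits[c][y][x] for c in range(num_classes)]
            # The predicted object id is the index of the highest score.
            row.append(scores.index(max(scores)))
        mask.append(row)
    return mask


# Toy 2x2 frame with background and two objects (made-up values).
toy_logits = [
    [[2.0, 0.1], [0.3, 0.2]],  # class 0: background
    [[0.5, 1.5], [0.1, 0.1]],  # class 1: object 1
    [[0.1, 0.2], [0.9, 0.8]],  # class 2: object 2
]
print(logits_to_mask(toy_logits))  # [[0, 1], [2, 2]]
```

If that reading is right, pred_logit is only the raw score map, and the object ids for the frame come from a step like this one applied afterwards.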