SamsungLabs / AdaCLIP

This repository contains the code for AdaCLIP, a computation and latency-aware system for pragmatic multimodal video retrieval.

frame selection and frame aggregation #2

Open Arbor334 opened 1 month ago

Arbor334 commented 1 month ago

Great project! I looked at your model.py and train.py files. Can you pinpoint the core code for frame selection and frame aggregation in those files? :happy:

angelaaye commented 1 month ago

Hi @Arbor334, frame selection is done here, which calls modeling/sampler.py to sample the frames.

Frame aggregation is done here in the similarity matrix computation step.

Arbor334 commented 4 weeks ago

thanks!

Arbor334 commented 2 weeks ago

Is the function gumbel_softmax_top_k(logits, top_k, tau, hard=False, reduce_sum=True, maskvector=True, logit_tau=1) from modeling/gumbel_softmax.py used to select the top-k frames? Also, in actions = gumbel_softmax_top_k(prob, int(self.top_k), tau, True, reduce_sum=False), what does the return value actions mean? Does it indicate which frames are selected? :smiley:
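
For context, here is a minimal sketch of how generic Gumbel top-k sampling usually works. It is not the code in modeling/gumbel_softmax.py: the function name gumbel_top_k_sketch and its behavior are assumptions inferred only from the parameter names in the question, it uses plain Python instead of PyTorch, and it covers only the hard (non-differentiable) selection, with no straight-through gradient.

```python
import math
import random


def gumbel_top_k_sketch(logits, top_k, tau=1.0, reduce_sum=True, rng=None):
    """Hypothetical illustration of hard Gumbel top-k selection.

    Not the repository's gumbel_softmax_top_k; the meaning of reduce_sum
    here is an assumption based on its name.
    """
    rng = rng or random.Random()
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise: g = -log(-log(u)), with u clamped away from 0 and 1.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noisy.append((logit - math.log(-math.log(u))) / tau)
    # Indices of the k largest perturbed logits, largest first.
    top = sorted(range(len(logits)), key=lambda i: noisy[i], reverse=True)[:top_k]
    # One one-hot row per selected frame.
    one_hots = [[1.0 if i == idx else 0.0 for i in range(len(logits))] for idx in top]
    if reduce_sum:
        # Collapse the k one-hot rows into a single k-hot mask over frames.
        return [sum(col) for col in zip(*one_hots)]
    return one_hots


# Example: pick 2 of 4 frames from per-frame scores.
rows = gumbel_top_k_sketch([0.1, 2.0, -1.0, 0.5], 2, reduce_sum=False)
mask = gumbel_top_k_sketch([0.1, 2.0, -1.0, 0.5], 2, reduce_sum=True)
```

Under this reading, reduce_sum=False would return one one-hot row per selected frame (k rows in total), while reduce_sum=True would return a single mask with a 1 at each selected frame. Whether the actions tensor in the repository follows the same convention would need to be confirmed against modeling/gumbel_softmax.py.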