mlfoundations / open_clip

An open source implementation of CLIP.

Questions regarding the implementation of SigLIP loss #944

Closed binwang777 closed 1 month ago

binwang777 commented 1 month ago

Why are `left_rank` and `right_rank` not updated during the bidirectional exchange? For example, like this (the last two lines are the update I have in mind):

        if bidir:
            text_features_to_right = text_features_to_left = text_features

            num_bidir, remainder = divmod(self.world_size - 1, 2)
            for i in range(num_bidir):

                text_features_recv = neighbour_exchange_bidir_with_grad(
                    left_rank,
                    right_rank,
                    text_features_to_left,
                    text_features_to_right,
                )

                for f in text_features_recv:
                    loss += self._loss(
                        image_features,
                        f,
                        logit_scale,
                        logit_bias,
                        negative_only=True,
                    )
                text_features_to_left, text_features_to_right = text_features_recv

                left_rank = (left_rank - 1 + self.world_size) % self.world_size
                right_rank = (right_rank + 1) % self.world_size

rwightman commented 1 month ago

Each node sends to and receives from its neighbours. It's a neighbour exchange algorithm, so you can't change who is to the left and right of you. Imagine people sitting in a circle of chairs passing potatoes to the left and right: the same person is always to your left and to your right, but the potatoes make their way around the circle.
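
The potato-passing picture can be checked with a toy, single-process simulation. This is only a sketch and not the open_clip implementation: `simulate_bidir_ring` and its `update_ranks` flag are hypothetical names, integers stand in for each rank's text features, and list indexing stands in for the point-to-point send/receive. It assumes, as in the snippet above, that whatever arrives from the right is forwarded to the left on the next step and vice versa.

```python
def simulate_bidir_ring(world_size, update_ranks=False):
    """Toy model: return, per rank, the set of origin ranks whose features it has seen.

    update_ranks=True mimics the proposed change (neighbours move one step
    further away after every exchange); False keeps the neighbours fixed.
    """
    # Each rank starts out holding its own features, labelled by origin rank.
    to_left = list(range(world_size))    # payload each rank sends to its left neighbour
    to_right = list(range(world_size))   # payload each rank sends to its right neighbour
    seen = [{r} for r in range(world_size)]
    step = 1  # distance to the "neighbour"; stays 1 in a true neighbour exchange

    num_bidir, remainder = divmod(world_size - 1, 2)
    for _ in range(num_bidir):
        # Rank r receives from its right neighbour what that neighbour sent left,
        # and from its left neighbour what that neighbour sent right.
        recv_from_right = [to_left[(r + step) % world_size] for r in range(world_size)]
        recv_from_left = [to_right[(r - step) % world_size] for r in range(world_size)]
        for r in range(world_size):
            seen[r].update((recv_from_right[r], recv_from_left[r]))
        # Keep the potatoes moving in the same direction on the next step.
        to_left, to_right = recv_from_right, recv_from_left
        if update_ranks:
            step += 1  # hypothetical: the left/right rank update from the question

    if remainder:
        # Even world_size: one final one-directional pass reaches the last rank.
        recv_from_left = [to_right[(r - step) % world_size] for r in range(world_size)]
        for r in range(world_size):
            seen[r].add(recv_from_left[r])
    return seen


if __name__ == "__main__":
    print(simulate_bidir_ring(7)[0])                     # fixed neighbours
    print(simulate_bidir_ring(7, update_ranks=True)[0])  # neighbours shifted each step
```

In this toy model, fixed neighbours let every rank see every other rank's features exactly once. Shifting the neighbours each step makes the payloads jump by growing distances, so with 7 ranks, rank 0 never receives the features of ranks 2 and 5 and receives those of ranks 1 and 6 twice.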