Cornell-RelaxML / quip-sharp

GNU General Public License v3.0

Question about the ROUND operation #69

Open CPegasus opened 1 month ago

CPegasus commented 1 month ago

Thanks for open-sourcing such awesome work! I have a question about the round operation in class E8P12RVQ4B_codebook, shown below:

    def round(self, X, grid, grid_norm):
        assert X.shape[-1] == self.codesz
        Xqidx = (2 * X @ grid.T - grid_norm).argmax(-1)
        return grid[Xqidx], Xqidx

I guess it represents a kind of similarity between X and grid that takes the norm of grid into account, but I don't know exactly what it means. Could you explain it in detail? Why isn't the norm of X considered?

tsengalb99 commented 1 month ago

This function quantizes X to the nearest element of grid by Euclidean distance. The (2 * X @ grid.T - grid_norm).argmax(-1) part takes advantage of the fact that, for a single X, the squared Euclidean distances to all grid elements share the same ||X||^2 term, so we can skip computing it and avoid materializing the pairwise differences between X and grid.
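To make the argument concrete: expanding the squared distance gives ||X - g||^2 = ||X||^2 - 2 X·g + ||g||^2. The ||X||^2 term is the same for every g, so minimizing the distance over g is the same as maximizing 2 X·g - ||g||^2. The sketch below checks this numerically against a brute-force nearest-neighbor search. It is only an illustration, not the repository's code: it uses NumPy instead of PyTorch, the helper names `round_to_grid` and `round_brute_force` are hypothetical, and it assumes `grid_norm` holds the squared norms of the grid rows.

```python
import numpy as np


def round_to_grid(X, grid, grid_norm):
    # Same expression as in the question: maximize 2 * <x, g> - ||g||^2.
    # Assumption: grid_norm[j] is the *squared* norm of grid[j].
    idx = (2 * X @ grid.T - grid_norm).argmax(-1)
    return grid[idx], idx


def round_brute_force(X, grid):
    # Reference version: builds the full (n, m, d) tensor of pairwise
    # differences and minimizes the squared Euclidean distance directly.
    diffs = X[:, None, :] - grid[None, :, :]
    idx = (diffs ** 2).sum(-1).argmin(-1)
    return grid[idx], idx


rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))        # 5 vectors to quantize
grid = rng.normal(size=(16, 8))    # 16 candidate codewords
grid_norm = (grid ** 2).sum(-1)    # squared norm of each codeword

q_fast, idx_fast = round_to_grid(X, grid, grid_norm)
q_slow, idx_slow = round_brute_force(X, grid)
print(np.array_equal(idx_fast, idx_slow))
```

The fast version needs only an (n, m) score matrix from one matrix multiply, whereas the brute-force version allocates an (n, m, d) intermediate, which is the memory cost the reply refers to.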