Centroid of Centroids using NanoPQ

I'm not sure whether such a nested PQ is useful, because a single PQ with larger parameters would usually be better. But the following nested PQ should work.

import nanopq
import numpy as np

N, D = 1000, 24
X = np.random.random((N, D)).astype(np.float32)  # 1,000 24-dim vectors

# Instantiate with M=4 subspaces and Ks=16 centroids per subspace
M, Ks = 4, 16
pq = nanopq.PQ(M=M, Ks=Ks)

# Train codewords
pq.fit(X)

# Codewords
# The shape is (4, 16, 6), which means:
# - 4 subspaces
# - 16 codewords for each subspace
# - Each codeword is a 6-dim vector
print(pq.codewords.shape)

# Given the codewords, train second-level PQ instances
# For each subspace, create a PQ instance, with M=2 and Ks=4
second_level_pqs = []
for m in range(M):
    second_level_pq = nanopq.PQ(M=2, Ks=4)
    second_level_pq.fit(pq.codewords[m])  # Train by corresponding codewords
    second_level_pqs.append(second_level_pq)

# Check
print(second_level_pqs[0].codewords.shape) # shape = (2, 4, 3)
