Jonathan-Gould opened 2 weeks ago
This is the initial rough code I used to justify the change; it might be useful to refine it:
from adaptive_latents import proSVD
import numpy as np

rng = np.random.default_rng()

if __name__ == '__main__':
    tries = 200
    hits = 0
    for i in range(tries):
        # 5-dimensional data where the last coordinate has twice the scale,
        # so the dominant direction should be axis 4
        x = rng.normal(size=(5, 100))
        x[-1, :] *= 2

        # initialize on the full block
        pro1 = proSVD(k=1)
        pro1.initialize(x)
        initialization_points_correct_direction = np.argmax(np.abs(pro1.Q[:, 0])) == 4

        # initialize on all but the last column, then update with the last column
        pro2 = proSVD(k=1)
        pro2.initialize(x[:, :-1])
        pro2.updateSVD(x[:, -1:])
        update_points_correct_direction = np.argmax(np.abs(pro2.Q[:, 0])) == 4

        # count trials where the update finds the dominant axis but the initialization does not
        if update_points_correct_direction and not initialization_points_correct_direction:
            hits += 1

    print(hits / tries)  # fraction of trials where the update recovers the direction the initialization missed
Commit c727071 changes the proSVD initialization method from QR to SVD. We need to check:

1) whether this is a costly change: does it slow down the initialization?
2) whether this is a necessary change: how much variance does the first-k subspace of a QR decomposition capture? The old initialization method could be fine, and if it's faster, we should stick with it.
3) whether it helps the interpretability of the Q's we discover.
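
For reference, a minimal numpy-only sketch of checks 1) and 2) (plus the same argmax check as the script above for 3)) on one synthetic block. It assumes the old initialization keeps the first k columns of Q from a QR decomposition of the block and the new one keeps the top-k left singular vectors from an SVD; those are assumptions about what c727071 does, and the names (n, l, k, captured_variance) are just for illustration, not adaptive_latents internals:

import timeit
import numpy as np

rng = np.random.default_rng(0)
n, l, k = 5, 100, 1
x = rng.normal(size=(n, l))
x[-1, :] *= 2  # last coordinate has twice the scale, so the dominant axis is 4

# 1) cost: time each decomposition of the initialization block
qr_time = timeit.timeit(lambda: np.linalg.qr(x), number=1000)
svd_time = timeit.timeit(lambda: np.linalg.svd(x, full_matrices=False), number=1000)
print(f"QR:  {qr_time:.4f} s per 1000 runs")
print(f"SVD: {svd_time:.4f} s per 1000 runs")

# 2) variance captured by the first-k subspace of each decomposition
def captured_variance(basis, data):
    # ||basis basis^T data||_F^2 / ||data||_F^2 for an orthonormal basis (n x k)
    projected = basis @ (basis.T @ data)
    return np.linalg.norm(projected) ** 2 / np.linalg.norm(data) ** 2

Q, _ = np.linalg.qr(x)
U, _, _ = np.linalg.svd(x, full_matrices=False)
print("QR  first-k variance:", captured_variance(Q[:, :k], x))
print("SVD first-k variance:", captured_variance(U[:, :k], x))

# 3) interpretability: does the leading basis vector point at the dominant axis?
print("QR  argmax:", np.argmax(np.abs(Q[:, 0])))
print("SVD argmax:", np.argmax(np.abs(U[:, 0])))

By construction the top-k singular vectors capture at least as much variance as any other k-dimensional basis, so the interesting numbers are how large that gap is in practice and how the two decompositions compare in cost.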