draeloslab / AdaptiveLatents

GNU General Public License v3.0

Investigate QR vs SVD initialization of proSVD #19

Open Jonathan-Gould opened 2 weeks ago

Jonathan-Gould commented 2 weeks ago

Commit c727071 changes the proSVD initialization method from QR to SVD. We need to check:

1. Cost: does the change slow down initialization?
2. Necessity: how much variance does the subspace spanned by the first k columns of Q from a QR decomposition capture? If the old initialization captures enough and is faster, we should stick with it.
3. Interpretability: does the change make the Q's we discover easier to interpret?
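
For check 2, here is a minimal NumPy-only sketch of one way to measure it. It does not call proSVD; it assumes the old initialization effectively kept the first k columns of Q from a QR decomposition of the initialization data, which should be confirmed against the code before c727071. "Variance" here is uncentered (squared Frobenius norm), and the helper names are hypothetical.

```python
import numpy as np


def captured_fraction(basis, x):
    """Fraction of x's squared Frobenius norm kept after projecting onto an orthonormal basis."""
    return np.linalg.norm(basis.T @ x) ** 2 / np.linalg.norm(x) ** 2


def compare_qr_svd(x, k):
    """Compare the first-k QR basis with the top-k left singular vectors of x (features x samples)."""
    q, _ = np.linalg.qr(x)  # reduced QR; q[:, 0] is the first sample, normalized (up to sign)
    u, _, _ = np.linalg.svd(x, full_matrices=False)
    return captured_fraction(q[:, :k], x), captured_fraction(u[:, :k], x)


rng = np.random.default_rng(0)
x = rng.normal(size=(5, 100))
x[-1, :] *= 2  # same setup as the rough test: the last feature has the largest variance

qr_frac, svd_frac = compare_qr_svd(x, k=1)
print(f"QR first-k: {qr_frac:.3f}, SVD top-k: {svd_frac:.3f}")
```

The top-k singular vectors are the optimal rank-k basis in this sense, so the SVD fraction is an upper bound; the open question is how far below it the QR fraction falls on realistic data.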

Jonathan-Gould commented 2 weeks ago

This is initial rough code I used to justify the change; it might be useful to refine:

from adaptive_latents import proSVD
import numpy as np

rng = np.random.default_rng()

if __name__ == '__main__':
    tries = 200
    hits = 0
    for i in range(tries):
        # 5 features x 100 samples; the last feature (index 4) has the largest variance
        x = rng.normal(size=(5, 100))
        x[-1, :] *= 2

        # pro1: initialize on all 100 samples at once
        pro1 = proSVD(k=1)
        pro1.initialize(x)
        initialization_points_correct_direction = np.argmax(np.abs(pro1.Q[:, 0])) == 4

        # pro2: initialize on the first 99 samples, then apply one update with the last sample
        pro2 = proSVD(k=1)
        pro2.initialize(x[:, :-1])
        pro2.updateSVD(x[:, -1:])
        update_points_correct_direction = np.argmax(np.abs(pro2.Q[:, 0])) == 4

        # count trials where the update finds the high-variance feature but initialization alone does not
        if update_points_correct_direction and not initialization_points_correct_direction:
            hits += 1

    # fraction of trials where one update fixed a misdirected initialization
    print(hits / tries)
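
The rough check above looks at direction, not cost. For check 1, a minimal timing sketch could look like the following. It times only the raw NumPy decompositions, not proSVD's own initialize, so it is a rough proxy at best; the matrix shape, the repeat count, and the function name are arbitrary assumptions.

```python
import timeit

import numpy as np


def time_decompositions(n_features=50, n_samples=500, repeats=20, seed=0):
    """Return the best-of-repeats wall time (seconds) for a reduced QR and a thin SVD of the same matrix."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n_features, n_samples))
    qr_time = min(timeit.repeat(lambda: np.linalg.qr(x), number=1, repeat=repeats))
    svd_time = min(timeit.repeat(lambda: np.linalg.svd(x, full_matrices=False), number=1, repeat=repeats))
    return qr_time, svd_time


qr_time, svd_time = time_decompositions()
print(f"QR: {qr_time * 1e3:.3f} ms, SVD: {svd_time * 1e3:.3f} ms")
```

Since initialization runs once per proSVD instance, even a several-fold slowdown here may not matter in practice; the numbers are mainly useful for shapes close to the real initialization batch.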