we should only calculate logliks for cells whose cosine similarity exceeds the threshold for at least one profile
additionally: for any given cell, we only calculate logliks for the 2-3 profiles with the highest cosine similarity
(The goal is to avoid calculating logliks for all cells * profiles, which is slow.)
Plan:
1. calc cosine for all cell x profile pairs
2. build a matrix of which cell/profile pairs will get their loglik calculated:
   a. rows (cells) with at least one cosine above the threshold
   b. within those rows, only columns (profiles) with cosines > 0.5 * threshold
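
A minimal sketch of the plan above, in plain Python so it runs without dependencies. The names (cosine, loglik_mask), the toy data and the threshold of 0.8 are assumptions made for illustration, not part of the real code. It implements the 0.5 * threshold column rule from the plan, not the "2-3 highest" rule.

```python
import math


def cosine(u, v):
    """Cosine between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def loglik_mask(cells, profiles, threshold):
    """Return (cos, mask); mask[i][j] is True if the loglik of cell i
    under profile j should be calculated."""
    # Step 1: cosine for every cell x profile pair (cheap compared to logliks).
    cos = [[cosine(c, p) for p in profiles] for c in cells]

    # Step 2: decide which pairs get a loglik.
    mask = []
    for row in cos:
        if any(value > threshold for value in row):
            # Row passes: keep only columns above half the threshold.
            mask.append([value > 0.5 * threshold for value in row])
        else:
            # No profile is close enough: skip this cell entirely.
            mask.append([False] * len(row))
    return cos, mask


# Toy data (hypothetical): 3 cells x 3 profiles, 3 genes each.
cells = [[1.0, 0.0, 0.0], [2.0, 1.0, 0.0], [1.0, 1.0, 1.0]]
profiles = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

cos, mask = loglik_mask(cells, profiles, threshold=0.8)
n_selected = sum(sum(row) for row in mask)
print(mask)
print(f"logliks to calculate: {n_selected} of {len(cells) * len(profiles)}")
```

On this toy input only 3 of the 9 cell/profile pairs are selected. The third cell is dropped because its best cosine (about 0.58) is below the threshold.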