Closed Jai2500 closed 2 years ago
Minutes of Meeting
Learnings:
Potential Problems:
sparse activation or Switch Transformers or somehow basically reduce this $N \times K$ to something else.
Potential Things:
$S$ matrix is now binary, the adjacency matrix of the clusters will also be binary and thus we can save more space through using the correct dtypes
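A rough sketch of the dtype saving, not the project's actual code. It assumes a hard (one-hot) $N \times K$ assignment matrix `S`, a binary node adjacency `A`, and that the cluster adjacency is obtained by thresholding $S^\top A S$ at zero; the sizes, the random data, and the names `labels` and `A_pooled` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 1000, 16  # hypothetical number of nodes and clusters

# Hard assignment: each node belongs to exactly one cluster (assumption).
labels = rng.integers(0, K, size=N)
S = np.zeros((N, K), dtype=bool)
S[np.arange(N), labels] = True

# Random symmetric binary adjacency, for illustration only.
A = rng.random((N, N)) < 0.01
A = A | A.T

# S^T A S counts the edges between each pair of clusters; thresholding
# at zero gives a binary cluster adjacency (assumed definition).
counts = S.T.astype(np.int32) @ A.astype(np.int32) @ S.astype(np.int32)
A_pooled = counts > 0

# Memory footprint of S under different storage choices.
S_f32 = S.astype(np.float32)          # 4 bytes per entry
S_packed = np.packbits(S, axis=1)     # 1 bit per entry, padded per row
print(S_f32.nbytes, S.nbytes, S_packed.nbytes)
```

With these sizes, `bool` storage is 4x smaller than `float32`, and bit-packing shrinks it by a further 8x, at the cost of having to unpack (`np.unpackbits`) before any matrix product.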
We are pretty sure that we can go deeper, so closing this issue and proceeding with development.
We need to fully understand what we are gaining from sampling. Broadly, it's the ability to have fewer edges while preserving crisp graph structures.