Open alyakin314 opened 4 years ago
Not sure if bug or enhancement.
After further conversation with @jagterberg , it seems that what should happen is something different:
First, SeedlessProcrustes needs to feature another parameter, let's call it q
for now. The optimal transport is performed in an unmodified way. However, the procrustes part of it should take the first q
components and perform procrustes on them, then take the last d-q
components and perform procrustes on them. Lastly, the two matrices are assembled into a single one using a direct sum. This results in a block-diagonal matrix. The rest of the algorithm is unmodified. This will be used both by directed and indefinite versions.
For the SeedlessProcrustes with directed graphs, q
just needs to be set to d / 2
. Note that d is not the dimension we embed in, but the dimension we end up with after concatenating, so it's 2x that of which we embed into.
For the Indefinite SeedlessProcrustes, what needs to happen is the following:
d
)d
) be q
.q=q
.There are other ways of implementing this: for example, one can instead be providing a vector of labels, associated with whether the corresponding eigenvalue is positive of negative: (e.g. eigenvalues [-0.5. 0.2, -0.1, 0.01]. would yield this [0, 1, 0, 1]).
@jagterberg is what i said above correct?
@bdpedigo @j1c @dwaynepryce i can't think of a good API for the indefinite case. We might have to create something that wraps embedding and aligning together.
First, SeedlessProcrustes needs to feature another parameter, let's call it
q
for now. The optimal transport is performed in an unmodified way. However, the procrustes part of it should take the firstq
components and perform procrustes on them, then take the lastd-q
components and perform procrustes on them. Lastly, the two matrices are assembled into a single one using a direct sum. This results in a block-diagonal matrix. The rest of the algorithm is unmodified. This will be used both by directed and indefinite versions.
I should've added that it basically should mirror the R implementation here: https://github.com/jagterberg/nonparGraphTesting/blob/68154f1d5891b1ac914241e99c626f274b17ac4f/R/optimal_transport.R#L56-L73
Expected Behavior
When using SeedlessProcrustes to align two directed embeddings, rather than stacking the two embeddings together and then using seedless, one should first use seedless on them independently, and then stack them. the motivation here is that otherwise the information from one part of directedness ("from") will spill into the other part ("to") and we do not want that.This will require some reordering and reorganization of the LDT code (and that's it).Actual Behavior
This totally slipped out of my mind, and we are currently stacking the embeddings and subsequently calling the appropriate aligner.upd: see two posts down for what should actually happen.