scikit-tda / DREiMac

Dimensionality Reduction with Eilenberg-MacLane Coordinates
Apache License 2.0

Circular coordinates on new points #9

Open wreise opened 1 year ago

wreise commented 1 year ago

Hello :wave:,

Thank you for this great implementation!

I would be interested in estimating the coordinates on a sample X and then evaluating them at some new points X_new that are, in principle, close to the original ones. Is that feature available? Is it meaningful?

Looking at the examples, I understood that what CircularCoordinates(X, n_landmarks).get_coordinates() returns is the evaluation of the circular coordinates on X. Treating your code as a black box, a first approach is to use nearest neighbors to map X_new to X and then retrieve the coordinates. However, I think one can certainly do better: if I understand correctly, you compute cohomology on the landmarks L and then construct the coordinates on X. Is there an obstruction to using a different set of points, as long as it is contained in the offset of L? Would it be possible to obtain the function described in the output section?
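For concreteness, the black-box nearest-neighbor baseline could look roughly like the sketch below. It uses only NumPy and does not call DREiMac: the array `theta` is a stand-in for whatever coordinates were computed on X, and the function name `nearest_neighbor_coordinates` is hypothetical.

```python
import numpy as np


def nearest_neighbor_coordinates(X, theta, X_new):
    """Give each row of X_new the coordinate of its nearest row in X.

    X:     (n_samples, dim) points on which coordinates are already known.
    theta: (n_samples,) coordinates on X (stand-in for the library output).
    X_new: (n_new, dim) points to evaluate.
    """
    X = np.asarray(X, dtype=float)
    X_new = np.asarray(X_new, dtype=float)
    theta = np.asarray(theta, dtype=float)
    # Pairwise squared Euclidean distances, shape (n_new, n_samples).
    diff = X_new[:, None, :] - X[None, :, :]
    sq_dists = np.einsum("ijk,ijk->ij", diff, diff)
    nearest = np.argmin(sq_dists, axis=1)
    return theta[nearest]


# Toy data: points on the unit circle, with the angle playing the role of
# the circular coordinate (an assumption made only for this illustration).
rng = np.random.default_rng(0)
angles = rng.uniform(0.0, 2.0 * np.pi, size=200)
X = np.column_stack([np.cos(angles), np.sin(angles)])
theta = angles
X_new = X[:5] + 0.01 * rng.normal(size=(5, 2))
theta_new = nearest_neighbor_coordinates(X, theta, X_new)
```

This baseline is piecewise constant in X_new, which is one reason an evaluation based on the landmarks themselves seems preferable.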

If such a feature is not available, but you think it makes sense and it would be a good addition, I would happily contribute. Thank you for your help.

LuisScoccola commented 1 year ago

Hi @wreise. Such a feature is not currently available, but, having thought about it a bit, I don't see a fundamental obstruction to having such a feature. I also think it would be a useful addition. DREiMac's computations only deal with the distance from all points to landmarks, so it should be possible to evaluate the coordinates on a new data point as long as we have the distance from that point to the landmarks (and, as you say, also as long as it is contained in the offset of the landmarks). This should all be fine for circular and toroidal coordinates; the case of projective coordinates is slightly more involved, but I think circular and toroidal coordinates are a good start (note that circular coordinates is just a wrapper around toroidal coordinates).
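As a rough illustration of the two conditions above (distances to the landmarks are available, and the new point lies in the offset of the landmarks), here is a minimal NumPy sketch. It assumes Euclidean distances and a single cover radius; the names `distances_to_landmarks`, `in_landmark_offset` and `radius` are hypothetical and are not taken from DREiMac.

```python
import numpy as np


def distances_to_landmarks(points, landmarks):
    """Euclidean distances, shape (n_points, n_landmarks)."""
    points = np.asarray(points, dtype=float)
    landmarks = np.asarray(landmarks, dtype=float)
    diff = points[:, None, :] - landmarks[None, :, :]
    return np.sqrt(np.einsum("ijk,ijk->ij", diff, diff))


def in_landmark_offset(points, landmarks, radius):
    """True for points within `radius` of at least one landmark."""
    dists = distances_to_landmarks(points, landmarks)
    return dists.min(axis=1) < radius


landmarks = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
X_new = np.array([[0.9, 0.1], [0.0, 0.0]])
mask = in_landmark_offset(X_new, landmarks, radius=0.8)
# The first point is close to the landmark (1, 0); the origin is at
# distance 1 from every landmark, so it falls outside the offset.
```

Points for which the mask is False would have to be rejected or handled separately, since no coordinate can be evaluated there.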

If you are interested in working on this, that would be great! Feel free to keep the conversation going here or to contact me directly for more technical questions or details.

If you want a pointer, you can start by looking at the toroidal coordinates module. The first part of the computation is just working with landmarks and cocycles on landmarks. The coordinates on the dataset are computed with the sparse integrate function, which will work on any data point as long as you can evaluate the partition of unity on it (which in turn only depends on the distance to the landmarks).
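To make the last point concrete, here is a minimal sketch of evaluating a partition of unity on new points from their distances to the landmarks alone. The linear bump `max(radius - d, 0)` is an assumption chosen for simplicity and may differ from what DREiMac uses internally; `partition_of_unity` and `radius` are hypothetical names.

```python
import numpy as np


def partition_of_unity(dists, radius):
    """Evaluate a partition of unity subordinate to balls around landmarks.

    dists:  (n_points, n_landmarks) distances from points to landmarks.
    radius: common radius of the balls centered at the landmarks.
    Returns nonnegative weights of the same shape whose rows sum to 1.
    """
    dists = np.asarray(dists, dtype=float)
    # Linear bump supported on the ball of the given radius (an assumption).
    bumps = np.maximum(radius - dists, 0.0)
    totals = bumps.sum(axis=1, keepdims=True)
    if np.any(totals == 0.0):
        raise ValueError("some points lie outside the offset of the landmarks")
    return bumps / totals


landmarks = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
X_new = np.array([[0.9, 0.1], [0.7, 0.7]])
# Only the distances to the landmarks are needed from here on.
dists = np.linalg.norm(X_new[:, None, :] - landmarks[None, :, :], axis=2)
weights = partition_of_unity(dists, radius=1.0)
```

The weights for a new point are nonzero only on the landmarks whose ball contains it, which is the same information the integration step needs for the original data points.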

wreise commented 1 year ago

Hi @LuisScoccola! Thank you for your positive reply and the pointers.

> DREiMac's computations only deal with the distance from all points to landmarks, so it should be possible to evaluate the coordinates on a new data point as long as we have the distance from that point to the landmarks (...) The first part of the computation is just working with landmarks and cocycles on landmarks. The coordinates on the dataset are computed with the sparse integrate function, which will work on any data point as long as you can evaluate the partition of unity on it (which in turn only depends on the distance to the landmarks).

Yes, that was roughly what I understood.

If you don't mind, this can stay open and I will post here or open a PR when I take it up.