Open bionicles opened 4 years ago
hi @bionicles , regarding the FGW
function, the first parameters M, C1, C2
refer respectively to distances between attributes of your 'atom' (assuming X ?), and inner distances in the two molecules graphs. ns
and nt
are the probability weights associated to the 'atoms' (I guess that without prior knowledge about this, a uniform distribution could be used).
Regarding the second part of your message, I guess it is less related to POT and more to open research issues. We, as a research team, could be interested by this kind of application. We have some codes (not in POT) running with Keops, gradient descent over partial GW/FGW and fast approximations of GW/FGW under tests right now. Feel free to send me a private email so we can eventually discuss how we can assist you with your problem.
right, M can be feature distance, C1, C2 could be pairwise distances between atoms
What I'm wondering about is, can we maintain a last dimension of the data?
instead of (ns, nt) (ns, ns) (nt, nt) I wonder if it's better to use (ns, nt, 8) (ns, ns, 8) (nt, nt, 8) to preserve information about specific components of "Atom vectors"
anyway here's a stab at the keops, but I'm not sure how to initialize and optimize pi:
inputs: a = vi(0, 8) b = vi(1, 8)
optimize: pi = ??? soft assignment matrix (a.shape[0], b.shape[0])
fgw = lambda a, b: sum( (a[i] | b[k]) + ( (a[i] | a[j]) | (b[k] | b[l] ) pi[i, k] pi[j, l] )
The main issue for our use case we run out of GPU memory. KeOps exists to solve that problem by only doing part of the problem on the GPU at a time and adding up the results (blockwise) but KeOps doesn’t work for FGW in a straightforward way because it doesn’t have a way to do Vij indexing of the pi soft assignment matrix between atoms in two different molecules
If you add block wise GPU calculations here, with NVIDIA unified memory sharing between CPU/GPU, then we could scale POT to arbitrary transport problems
One way to do that might be to use CuPy
To develop an “Atom Mover Distance” between atomic configurations I look to make a FGW between two tensors of atoms position/features, but the shape of the data doesn’t match that of the signature for the FGW function:
signature: (ns, ) (nt, )
ours: (ns, X) and (nt, X) X is xyz mass charge element valence etc etc
Further, we can only move the first 3 axis of the data (position) so we would look for
partial derivative of GW/FGW wrt change in xyz positions of atoms
Finally, we’d like to compare differently shaped molecules, is this GW/FGW capable of unbalanced problems?
How can we tinker with these functions to allow us to calculate a scalable, partial, unbalanced GW/FGW distance between molecules?
Since many interesting systems are ~50,000 atoms++ (antibodies, virus particles) it would be clutch if we integrated Keops library because this would prevent memory overflows on GPU
I could hack on a proof of concept with KeOps but seek your mentorship/advice before I begin