Partial Derivatives of Higher Dimensional Unbalanced GW/FGW

PythonOT / POT

POT : Python Optimal Transport

https://PythonOT.github.io/

MIT License

2.42k stars 499 forks source link

Partial Derivatives of Higher Dimensional Unbalanced GW/FGW #152

Open bionicles opened 4 years ago

bionicles commented 4 years ago

To develop an “Atom Mover Distance” between atomic configurations I look to make a FGW between two tensors of atoms position/features, but the shape of the data doesn’t match that of the signature for the FGW function:

signature: (ns, ) (nt, )

ours: (ns, X) and (nt, X) X is xyz mass charge element valence etc etc

Further, we can only move the first 3 axis of the data (position) so we would look for

partial derivative of GW/FGW wrt change in xyz positions of atoms

Finally, we’d like to compare differently shaped molecules, is this GW/FGW capable of unbalanced problems?

How can we tinker with these functions to allow us to calculate a scalable, partial, unbalanced GW/FGW distance between molecules?

Since many interesting systems are ~50,000 atoms++ (antibodies, virus particles) it would be clutch if we integrated Keops library because this would prevent memory overflows on GPU

I could hack on a proof of concept with KeOps but seek your mentorship/advice before I begin

ncourty commented 4 years ago

hi @bionicles , regarding the FGW function, the first parameters M, C1, C2 refer respectively to distances between attributes of your 'atom' (assuming X ?), and inner distances in the two molecules graphs. ns and nt are the probability weights associated to the 'atoms' (I guess that without prior knowledge about this, a uniform distribution could be used).

Regarding the second part of your message, I guess it is less related to POT and more to open research issues. We, as a research team, could be interested by this kind of application. We have some codes (not in POT) running with Keops, gradient descent over partial GW/FGW and fast approximations of GW/FGW under tests right now. Feel free to send me a private email so we can eventually discuss how we can assist you with your problem.

bionicles commented 4 years ago

right, M can be feature distance, C1, C2 could be pairwise distances between atoms

What I'm wondering about is, can we maintain a last dimension of the data?

instead of (ns, nt) (ns, ns) (nt, nt) I wonder if it's better to use (ns, nt, 8) (ns, ns, 8) (nt, nt, 8) to preserve information about specific components of "Atom vectors"

anyway here's a stab at the keops, but I'm not sure how to initialize and optimize pi:

inputs: a = vi(0, 8) b = vi(1, 8)

optimize: pi = ??? soft assignment matrix (a.shape[0], b.shape[0])

fgw = lambda a, b: sum( (a[i] | b[k]) + ( (a[i] | a[j]) | (b[k] | b[l] ) pi[i, k] pi[j, l] )

bionicles commented 4 years ago

The main issue for our use case we run out of GPU memory. KeOps exists to solve that problem by only doing part of the problem on the GPU at a time and adding up the results (blockwise) but KeOps doesn’t work for FGW in a straightforward way because it doesn’t have a way to do Vij indexing of the pi soft assignment matrix between atoms in two different molecules

If you add block wise GPU calculations here, with NVIDIA unified memory sharing between CPU/GPU, then we could scale POT to arbitrary transport problems

One way to do that might be to use CuPy