Chemellia / ChemistryFeaturization.jl

Interface package for featurizing atomic structures
https://chemistryfeaturization.chemellia.org/dev/
MIT License

Autodiff the graph building for "forces" #10

Open rkurchin opened 3 years ago

rkurchin commented 3 years ago

Came up in a group meeting discussion. Could be a neat idea to try to differentiate with respect to graph weights and see if you can get something like a force by propagating that through a pretrained model...

DhairyaLGandhi commented 3 years ago

What would we need for that? Which functions would need differentiating?

rkurchin commented 3 years ago

I haven't thought it out in a ton of detail yet, but loosely here's the concept:

Interatomic forces come from derivatives of potential energy with respect to atomic separation. Those same atomic separations are what we use to build the adjacency matrices of the graphs (edge weights set by some decaying function of the distance). So if we had a model that took in a graph and outputted a total energy and you could differentiate all the way through, in principle you could get forces (whether they'd be any good without forces as part of the training is a separate, though interesting, question).
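Written out, the idea is a chain rule through the edge weights. The following is only a sketch under stated assumptions: the model energy $E$ depends on the atomic positions $\mathbf{r}_k$ solely through edge weights $w_{ij} = f(d_{ij})$, with $d_{ij} = \lVert \mathbf{r}_i - \mathbf{r}_j \rVert$, each pair is counted once, and periodic images are ignored. The symbols $E$, $f$, $w_{ij}$ and $d_{ij}$ are notation introduced here for illustration, not names from the package.

```latex
\mathbf{F}_k
  = -\frac{\partial E}{\partial \mathbf{r}_k}
  = -\sum_{i<j} \frac{\partial E}{\partial w_{ij}}\, f'(d_{ij})\,
     \frac{\partial d_{ij}}{\partial \mathbf{r}_k},
\qquad
\frac{\partial d_{ij}}{\partial \mathbf{r}_k}
  = (\delta_{ik} - \delta_{jk})\,\frac{\mathbf{r}_i - \mathbf{r}_j}{d_{ij}}
```

The first factor comes from differentiating the model with respect to its input graph, the second from the weight function, and the third is pure geometry.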

To do this, the main thing we'd need would be to differentiate through the different weight computation functions (weights_cutoff and weights_voronoi here), in particular with respect to the dists argument.
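As a rough illustration of what "differentiating with respect to dists" means, here is a minimal sketch with a hypothetical inverse-square weight and a hard cutoff. The function names and the cutoff value are assumptions made for this example and are not the package's `weights_cutoff`. The analytic derivative is checked against a central finite difference.

```julia
# Hypothetical stand-in for a cutoff-style weight function; NOT the package's
# weights_cutoff. The weight decays as 1/d^2 and is zero beyond the cutoff.
inverse_square_weight(d; cutoff=8.0) = d < cutoff ? 1 / d^2 : 0.0

# Analytic derivative of the weight with respect to the distance d.
d_inverse_square_weight(d; cutoff=8.0) = d < cutoff ? -2 / d^3 : 0.0

# Central finite difference, used only to sanity-check the analytic derivative.
central_diff(f, x; h=1e-6) = (f(x + h) - f(x - h)) / (2h)

dists = [1.5, 2.0, 3.7, 9.0]   # the last entry lies beyond the cutoff
analytic = d_inverse_square_weight.(dists)
numeric = central_diff.(inverse_square_weight, dists)
max_err = maximum(abs.(analytic .- numeric))
```

One caveat with this kind of weight: a hard cutoff makes the weight jump at `d == cutoff`, so the derivative is undefined there and any force derived from it would be discontinuous as atoms cross the cutoff radius.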

Then, those adjacency matrices go into building the graph laplacians, which influence how the convolutional layers work, but since those are already inside Flux, I assume that differentiation shouldn't be too hard? Though the laplacians aren't trainable parameters, so IDK if that throws any kind of wrench in the works syntactically.
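For reference, a Laplacian is just a composition of sums, square roots and matrix products applied to the adjacency matrix, so it is differentiable with respect to the weights even though it holds no trainable parameters. The sketch below uses the symmetric normalized Laplacian as one common choice; which Laplacian the package actually builds is not stated in this thread, and `normalized_laplacian` is a name made up for the example.

```julia
using LinearAlgebra

# Symmetric normalized graph Laplacian L = I - D^(-1/2) * A * D^(-1/2),
# assuming every node has nonzero degree.
function normalized_laplacian(A::AbstractMatrix)
    deg = vec(sum(A, dims=2))
    Dinvsqrt = Diagonal(1 ./ sqrt.(deg))
    return I - Dinvsqrt * A * Dinvsqrt
end

# Small weighted adjacency matrix (symmetric, zero diagonal).
A = [0.0 1.0  0.5;
     1.0 0.0  0.25;
     0.5 0.25 0.0]
L = normalized_laplacian(A)
```

On the syntax question: as far as I know, Zygote's `gradient` differentiates with respect to whatever explicit arguments it is given, so passing the adjacency matrix (or the distances) as an argument should be enough; it does not need to be registered as a parameter.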

If you'd be interested in diving into this in more detail, I think it could be really cool, and potentially super impactful if it turns out to work well! We'd have to figure out what a good first test case would be (presumably some snapshots from DFT calculation relaxation trajectories), but @venkvis might have some good ideas...

DhairyaLGandhi commented 3 years ago

Absolutely! Sounds like a good idea. I would imagine that the forces calculated would still need to be part of a learning scheme with some trainable parameters.

rkurchin commented 3 years ago

Have been staring at this and thinking for a bit before I make my new branch too messy...I think the shortest set of steps to get a prototype of this working is:

  1. "break out" function to go from filepath => Xtals.jl representation (instead of one big function that reads in file and completely builds AtomGraph, as exists currently)
  2. write another function to build AtomGraph from that
  3. chain fcn from (2) into a model, Dhairya does whatever magic is needed (hopefully none?) to differentiate that with respect to atom positions that are the output of the function from (1)
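The three steps above can be sketched end to end with toy stand-ins. Everything here is hypothetical: `load_positions` replaces the file => Xtals.jl step with hard-coded coordinates, `build_adjacency` replaces the AtomGraph constructor, and `toy_energy` replaces a pretrained model. Finite differences stand in for autodiff so the example needs only the standard library.

```julia
using LinearAlgebra

# (1) Stand-in for "filepath => structure": hard-coded positions, one column
# per atom. A real version would parse a file via Xtals.jl.
load_positions() = [0.0 1.2 0.0;
                    0.0 0.0 1.5;
                    0.0 0.0 0.0]

# (2) Build a weighted adjacency matrix from positions (no periodic images).
function build_adjacency(pos; cutoff=8.0)
    n = size(pos, 2)
    A = zeros(n, n)
    for i in 1:n, j in 1:n
        i == j && continue
        d = norm(pos[:, i] - pos[:, j])
        A[i, j] = d < cutoff ? 1 / d^2 : 0.0
    end
    return A
end

# (3) Toy "model": total energy as a fixed function of the graph.
toy_energy(A) = -sum(A) / 2

# Chain (1) -> (2) -> (3) into a single function of the positions.
energy_from_positions(pos) = toy_energy(build_adjacency(pos))

# Forces as the negative gradient, via central finite differences.
function finite_difference_forces(E, pos; h=1e-5)
    F = zeros(size(pos))
    for k in eachindex(pos)
        plus = copy(pos);  plus[k] += h
        minus = copy(pos); minus[k] -= h
        F[k] = -(E(plus) - E(minus)) / (2h)
    end
    return F
end

pos = load_positions()
forces = finite_difference_forces(energy_from_positions, pos)
```

Because the toy energy depends only on interatomic distances, the forces sum to zero over all atoms, which is a cheap sanity check for any real implementation too. Note that the in-place writes in `build_adjacency` are fine for finite differences, but Zygote generally does not support differentiating through array mutation, so an autodiff version would need a non-mutating construction or a custom adjoint.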

Bigger questions include how to actually integrate this stuff into the API of ChemistryFeaturization as it currently exists? There are questions about incorporating the kind of "backend" architecture we were talking about that come into this, but also things that could be really helped along by the abstract interface discussions that have come out of the BoF session. Hopefully we'll have a basic prototype of that by the end of the month...for now, I think doing this fast and somewhat hacky is probably okay just to get a proof of concept...

thazhemadam commented 3 years ago

> 3. chain fcn from (2) into a model, Dhairya does whatever magic is needed (hopefully none?) to differentiate that with respect to atom positions that are the output of the function from (1)

Wouldn't this be required all the same, regardless of (1) and (2), since (2) should ideally produce the same outputs and AtomGraphs as it does now?

> There are questions about incorporating the kind of "backend" architecture we were talking about that come into this, but also things that could be really helped along by the abstract interface discussions that have come out of the BoF session

I think before actually building a hacky proof of concept, it might be a good idea to take a step back and try to ascertain the role ChemistryFeaturization.jl would have with respect to other packages, and in the ecosystem as a whole.

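A few questions that come to mind which I think may be worth thinking about in this context are:

  • What would the "hierarchy" of the MolSim ecosystem roughly look like, and where would ChemistryFeaturization fit in?
  • Is ChemistryFeaturization going to have a more central role, like a one-stop shop of sorts, for every type of chemistry/molecular science-related featurization?
  • If not, what specific domain do we operate in and what comes under our purview?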

Dhairya mentioned ImageIO, ImageMagick and FileIO as an example. In this example, it feels clear that the backends were written with the purpose of them being backends in the first place, whereas I don't think that's the case with ChemistryFeaturization.jl and Xtals.jl. We could try and standardize interfaces between the two packages as necessary so we could have a "seamless" backend, but wouldn't other backends (which I presume would handle different types of files) also require similar interfaces?

I could also possibly be misunderstanding what either of you has in mind when you say backend, and if I am, please correct me. 😅

rkurchin commented 3 years ago

Trying to answer more or less in order...

> 3. chain fcn from (2) into a model, Dhairya does whatever magic is needed (hopefully none?) to differentiate that with respect to atom positions that are the output of the function from (1)

> Wouldn't this be required all the same, regardless of (1) and (2), since (2) should ideally produce the same outputs and AtomGraphs as it does now?

Yes? I'm not sure exactly what the question is, as it feels like you're asking me whether I think I need to do the thing that I said I needed to do. 😛

>   • What would the "hierarchy" of the MolSim ecosystem roughly look like, and where would ChemistryFeaturization fit in?
>   • Is ChemistryFeaturization going to have a more central role, like a one-stop shop of sorts, for every type of chemistry/molecular science-related featurization?
>   • If not, what specific domain do we operate in and what comes under our purview?

These are exactly the right kinds of questions. I would hope it would (eventually) be a one-stop shop as you say, but the "specific domain" would be featurization for ML models. ML is obviously only one part of the broader ecosystem, which will also have a big focus on simulations like DFT and MD, for which the notion of "featurization" is not really needed...or at least not in the same way: there are choices to make regarding basis sets, energy cutoffs and other parameters, (pseudo)potentials, etc. I think these things are well outside the domain of what I'm hoping to do with ChemistryFeaturization and Chemellia, at least right now.

My hope would be that the abstract types we define are fairly universally applicable, and then we either purpose-build or perhaps even use a more generally applicable concrete type for our own structure representations that we featurize for ML.

As for the backend question, I don't necessarily have a clear answer of how it should look, but I'll share a few thoughts:

Yet another somewhat separate question (but related to the abstract structure API stuff) is one of data structure for this stuff going forward. I'm thinking more and more that it probably makes sense to (at least have the option to) attach full structural information to any ML representation (e.g. an AtomGraph). This would allow compatibility with things like this autodiff scheme, but also featurization by rdkit functions, provided we can convert our own structural representation into an rdkit one relatively easily. The other option for these kinds of featurizations seems to be to "start from scratch" (i.e. from the structure file), which for just a SMILES string or something may not be a big deal, but seems rather inelegant as a long-term practice.
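One possible shape for "optionally attach the structure" is a type parameter on the representation. This is only a sketch: `GraphWithStructure`, its fields and `has_structure` are hypothetical names, not the existing AtomGraph type, and a plain coordinate matrix stands in for whatever structure type would actually be used.

```julia
# Hypothetical container; names are illustrative, not the package's types.
# S is the type of the attached structure, or Nothing when there is none.
struct GraphWithStructure{S}
    adjacency::Matrix{Float64}
    elements::Vector{String}
    structure::S
end

# Convenience constructor for the "graph only" case.
GraphWithStructure(adjacency, elements) =
    GraphWithStructure(adjacency, elements, nothing)

has_structure(g::GraphWithStructure) = !isnothing(g.structure)

A = [0.0 1.0; 1.0 0.0]
bare = GraphWithStructure(A, ["H", "H"])

# Positions as a 3 x N matrix (one column per atom) stand in for a full structure.
positions = [0.0 0.74; 0.0 0.0; 0.0 0.0]
full = GraphWithStructure(A, ["H", "H"], positions)
```

Code that needs positions (autodiff for forces, conversion to another toolkit's representation) could then dispatch on or check for the attached structure, while existing graph-only code paths keep working.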

Tagging @cortner in case he has any thoughts on this, as some of this is relevant both to our conversation just now, as well as the ongoing ones about structure representations. No pressure to read all this, Christoph, just in case you're interested!

cortner commented 3 years ago

> Is ChemistryFeaturization going to have a more central role, like a one-stop shop of sorts, for every type of chemistry/molecular science-related featurization?

I would like that. I'm having a discussion about it with my student tomorrow. His first reaction was he'd like to just start building the model but eventually put ChemistryFeaturization on top of it (and independently maybe supply our model as a layer for your ecosystem as well...)

Re: differentiation w.r.t. positions

Once you have forces you can do force matching, which will vastly improve your fit accuracy and in particular generalisation. It is much harder to overfit forces than energies. (Some researchers do only force-matching and forget about the energy altogether... "force domain learning" or something like that...)
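Part of the reason is simply the amount of data: a structure with N atoms contributes one energy but 3N force components to the fit. A minimal sketch of a combined loss for a single structure follows; the function name, the mean-squared form and the `force_weight` hyperparameter are assumptions for illustration, not something specified in this thread.

```julia
# Combined energy + force loss for one structure.
# `force_weight` is a hypothetical hyperparameter balancing the two terms.
function energy_force_loss(E_pred, E_ref, F_pred, F_ref; force_weight=1.0)
    energy_term = (E_pred - E_ref)^2
    force_term = sum(abs2, F_pred .- F_ref) / length(F_ref)
    return energy_term + force_weight * force_term
end

# Made-up numbers: one energy and 3 force components for each of 2 atoms.
E_ref, E_pred = -3.20, -3.10
F_ref  = [0.10 -0.10; 0.00 0.00; 0.05 -0.05]
F_pred = [0.12 -0.08; 0.00 0.01; 0.05 -0.05]

loss = energy_force_loss(E_pred, E_ref, F_pred, F_ref; force_weight=10.0)
```

Setting the energy term aside entirely and fitting only the force term would correspond to the force-only approach mentioned above.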

DhairyaLGandhi commented 3 years ago

Made some progress on the graph building functions with some handy adjoints. The gradients aren't correct yet; I have to do a little bit of math to fix that.

julia> gradient(collect(1:10), collect(1:10), rand(10)) do i, j, dist
         sum(GraphBuilding.weights_cutoff(i, j, dist))
       end
([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [0.21753773508968766, 0.2952598053193285, 0.28951826203415876, 0.7383348139259314, 0.7245524645420596, 0.7866558412828093, 0.6683988349998842, 0.6922819179751303, 0.3047111113015635, 0.19151690048933245])

thazhemadam commented 3 years ago

> 3. chain fcn from (2) into a model, Dhairya does whatever magic is needed (hopefully none?) to differentiate that with respect to atom positions that are the output of the function from (1)

> Wouldn't this be required all the same, regardless of (1) and (2), since (2) should ideally produce the same outputs and AtomGraphs as it does now?

> Yes? I'm not sure exactly what the question is, as it feels like you're asking me whether I think I need to do the thing that I said I needed to do. 😛

What I meant was that (1), (2), (3) need not necessarily be the order in which we do these things: (3) doesn't really need to worry about how (1) and (2) are implemented, because whatever is passed in for (3) remains the same regardless.

rkurchin commented 3 years ago

Ah, I see. I suppose that's true, but in practice I'll do them in that order because currently (1) and (2) are done in one function (though without using Xtals.jl) so that general logic is all there already and just needs to be modularized further.