Closed peastman closed 1 year ago

I've been thinking about a plugin to compute forces with xtb. This would be useful both for the GFN-FF force field and for the GFN2-xTB semiempirical method. It runs entirely on the CPU, so the performance isn't great, but it's fast enough to be useful for small molecules.

Is this something other people would find useful?
There seems to be some degree of GPU acceleration: https://xtb-docs.readthedocs.io/en/latest/development.html#building-with-gpu-support Glancing at the repo, it looks like it is limited to enabling cuSOLVER and cuBLAS, presumably for the diagonalization/matrix-multiplication steps.
Would this be for a whole system using xtb, or for having a subset of atoms use xtb forces with an appropriate QM/MM embedding?
This would be amazing! Along with any sort of QM/MM implementation in OpenMM.
> Would this be for a whole system using xtb, or for having a subset of atoms use xtb forces with an appropriate QM/MM embedding?
You could do either one. The embedding could be implemented with the same method as OpenMM-ML.
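For anyone unfamiliar with the OpenMM-ML approach referred to here, the mixed-system pattern looks roughly like the sketch below. The `MLPotential`/`createMixedSystem` calls come from the existing openmm-ml package; the input file name and atom selection are arbitrary placeholders, and the idea is only that an xtb force could presumably slot into the same kind of workflow.

```python
# Sketch of the OpenMM-ML mixed-system pattern (mechanical embedding) that an
# xtb force could plug into; everything below comes from openmm-ml and OpenMM,
# nothing here is specific to the proposed xtb plugin.
from openmm.app import PDBFile, ForceField
from openmmml import MLPotential

pdb = PDBFile('input.pdb')                    # any example structure
forcefield = ForceField('amber14-all.xml', 'amber14/tip3pfb.xml')
mm_system = forcefield.createSystem(pdb.topology)

# Atoms in the region to be treated with the more accurate method.
qm_atoms = [atom.index for atom in next(pdb.topology.chains()).atoms()]

# createMixedSystem() strips the MM interactions within the selected region
# and replaces them with the ML potential (ANI-2x here), while keeping the
# MM interactions between the two regions.
potential = MLPotential('ani2x')
mixed_system = potential.createMixedSystem(pdb.topology, mm_system, qm_atoms)
```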
This would be a great addition! The xtb-python API has the drawback, however, that it is not possible to reuse GFN-FF topologies between single-point calculations (unless they have changed it recently, which I don't expect). In many ways the xtb Python (and C) API does not expose the latest developments (such as the latest implicit-solvation models). But for evaluating e.g. GFN2-xTB energies and forces it should suffice.
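For what it's worth, a minimal GFN2-xTB energy/force evaluation through xtb-python looks roughly like the sketch below (coordinates are in Bohr; the water geometry is just an illustrative input):

```python
# Minimal GFN2-xTB single point via xtb-python: energy and gradient only,
# which is the part of the API that would matter for an OpenMM force.
import numpy as np
from xtb.interface import Calculator
from xtb.utils import get_method

numbers = np.array([8, 1, 1])                      # O, H, H
positions = np.array([[0.00, 0.00,  0.00],         # coordinates in Bohr
                      [0.00, 0.00,  1.81],
                      [1.75, 0.00, -0.47]])

calc = Calculator(get_method("GFN2-xTB"), numbers, positions)
res = calc.singlepoint()

energy = res.get_energy()        # Hartree
gradient = res.get_gradient()    # Hartree/Bohr; force = -gradient
```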
Regarding GPU support in xtb, the build system is currently broken (https://github.com/grimme-lab/xtb/issues/723).
We would want to use the C API. Going through Python adds unnecessary complexity and overhead. Since xtb is available from conda-forge, I figured we would just link to that version. Whenever they create a build with GPU support, we should get it automatically.
It makes perfect sense to simply use the C API. By the way @peastman, would it be possible (and would it make sense) to implement the interface such that atomic charges from the outer region can be passed to xtb as an external charge field, and the atomic charges computed by xtb are then subsequently used in the NonbondedForce? This would be a sort of "pseudo-polarizable" treatment of the system. But perhaps there are some drawbacks or limitations of such an approach?
That would be very complicated to do, and I don't think we have enough information to do it. To compute forces, you would need to include a chain rule term involving the derivatives of the computed charges with respect to all the atom positions.
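To spell out the missing term: if the xtb-derived charges $q_j$ were fed into the NonbondedForce, the total force would pick up a contribution from the dependence of those charges on the coordinates,

$$
\frac{dE}{d\mathbf{r}_i} = \frac{\partial E}{\partial \mathbf{r}_i} + \sum_j \frac{\partial E}{\partial q_j}\,\frac{\partial q_j}{\partial \mathbf{r}_i},
$$

and the charge derivatives $\partial q_j / \partial \mathbf{r}_i$ are exactly the information described above as unavailable.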
Point taken! Thank you for the explanation.
@peastman what is the status of the xTB plugin?
I haven't started on it yet. I plan to wait until https://github.com/openmm/openmm-plumed/pull/70 is merged, then base the new plugin on that one.
The GFNn-xTB methods are "tight binding" density functional theory: a simplified form of first-principles quantum mechanics with a much lower pre-factor, but still $O(n^3)$ scaling. They are classified as "semi-empirical quantum mechanics".
> Is this something other people would find useful?
@maaku previously indicated that he sees great potential in GFN-FF, which emerged from the GFN series of ab initio iterative solvers. I did some very thorough performance analysis and confirmed that GFN-xTB scales as $O(n^3)$ and GFN-FF as $O(n^2)$, the latter with an extremely large pre-factor. GFN-FF uses a sparse matrix factorizer that seems very difficult, and time-consuming, to port to highly parallel GPU architectures.
Whether this can be used practically for anything except very small systems (a few hundred atoms) is questionable. Here are some documents comparing it to MM4, a conventional molecular mechanics force field with ~1/3 the speed of AMBER (at the time, with the benchmarked implementation). In the time GFN-FF takes to simulate ~500-1000 atoms, an $O(n)$ standard force field can simulate ~70,000 atoms. I extrapolated some CPU benchmarks to GPU by assuming the GPU achieves exactly the same ALU utilization as the CPU and scaling by the ratio of TFLOPS. This is a liberal estimate that overestimates GPU performance of GFN-FF, which I caution may be difficult to parallelize.
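The extrapolation mentioned above is just a ratio-of-throughput estimate; a toy version is sketched below, with hypothetical hardware numbers standing in for the real benchmark data.

```python
# Toy version of the CPU-to-GPU extrapolation described above.
# All numbers here are hypothetical placeholders, not the benchmark data from
# the analysis; the point is only the form of the estimate.

cpu_time_per_step = 1.0e-1   # seconds per GFN-FF timestep on the CPU (placeholder)
cpu_tflops = 1.0             # peak throughput of the CPU (placeholder)
gpu_tflops = 20.0            # peak throughput of the GPU (placeholder)

# Liberal assumption: the GPU reaches the same ALU utilization as the CPU,
# so runtime simply scales with the ratio of peak throughput.
gpu_time_per_step = cpu_time_per_step * (cpu_tflops / gpu_tflops)

print(f"Estimated GPU time per step: {gpu_time_per_step:.3e} s")
```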
Today I was investigating the xTB CPU package as something to incorporate into my own software. It's nowhere near feasible for reasonably sized MD simulations (e.g. a single 10,000-atom nanomechanical part), but it could be used to check the validity of less accurate $O(n)$ force fields such as MM4. Or it could serve as a stand-in until the cheaper force fields are parameterized against MP2 or CCSD(T) for the elements you wish to simulate.
The main use of it would be for small molecules. Most drug molecules are under 100 atoms. It could also be used in QM/MM simulations where you simulate a small piece with QM and the rest with a standard force field.
There might be a use case for me. The tip of an AFM or a 2nd-generation manufacturing device usually contains some unusual elements like Ge, Sn, or transition metals, which rarely exist in common force fields or have poor parameters. However, the majority of atoms would be bulk diamond with doping from Si or N; those atoms could be simulated with MM4. But to get reasonable performance, we'd need a custom GPU implementation without any dependency on the CPU.
I wouldn't mind pitching in to help with some exploratory research, e.g. pinpointing the computational bottlenecks that are/aren't easily parallelizable. The elephant in the room is the solution of a sparse system of linear equations (the EEQ electrostatics).
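As a generic illustration of the kind of kernel at issue, solving a sparse symmetric positive-definite system with an iterative method looks like the sketch below. This is not the actual EEQ solver in GFN-FF, just a stand-in for the operation that would need an efficient GPU port; the matrix here is a made-up stencil.

```python
# Stand-in for the sparse-linear-solve bottleneck: conjugate gradient on a
# sparse SPD matrix. Not the EEQ implementation in xtb, only an illustration.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg

n = 10_000
# Build a sparse, diagonally dominant (hence SPD) tridiagonal test matrix.
diagonals = [4.0 * np.ones(n), -1.0 * np.ones(n - 1), -1.0 * np.ones(n - 1)]
A = sp.diags(diagonals, offsets=[0, -1, 1], format="csr")
b = np.random.default_rng(0).standard_normal(n)

x, info = cg(A, b)
print("converged" if info == 0 else f"cg returned {info}")
```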
I have an initial version of this working: https://github.com/openmm/openmm-xtb. It requires OpenMM 8.1.
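A usage sketch, under the assumption that the plugin exposes a single force class roughly as described in the repo's README: the module name `openmmxtb`, the `XtbForce` class, the method enum, and the exact constructor arguments below are assumptions and should be checked against the current README rather than copied from here.

```python
# Hedged sketch of adding the plugin's force to an OpenMM System.
# The module name, class name, method enum, and constructor arguments are
# assumptions based on the repository README; check the README for the
# authoritative interface before copying this.
import openmm
from openmm.app import Element
from openmmxtb import XtbForce

atomic_numbers = [8, 1, 1]               # toy example: one water molecule
system = openmm.System()
for z in atomic_numbers:
    system.addParticle(Element.getByAtomicNumber(z).mass)

particles = list(range(len(atomic_numbers)))   # indices handled by xtb
charge, multiplicity, periodic = 0, 1, False

# GFN2-xTB here; GFN-FF would be the other obvious choice.
system.addForce(XtbForce(XtbForce.GFN2xTB, charge, multiplicity, periodic,
                         particles, atomic_numbers))
```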
If I understand correctly, it's not yet GPU accelerated. There's still a need for the CPU (xTB) and GPU to synchronize every timestep. But it's something that can fit into workflows using OpenMM.
Correct. Hopefully they'll eventually add GPU acceleration. For now, the cost of synchronization is insignificant. Computing the forces takes 1000x longer than communicating them.
Here's a performance optimization you can try. XTB is multithreaded, but in every case I've tested, that makes it slower rather than faster. You can disable it with
export OMP_NUM_THREADS=1
You'll likely find that makes it several times faster.
This should still make xTB easier to use. You don't have to mess with the command-line interface or file format for storing parameters generated for GFN-FF. The biggest barrier to entry was investing time learning how to use their API. Now I might be able to access it from my OpenMM-centric simulator framework.
Is there a way to autogenerate a C API for this plugin? So I can include it in https://github.com/philipturner/swift-openmm?
Writing it by hand would be much simpler. It would work the same way we autogenerate the SWIG interface files for OpenMM, but with a handwritten interface file for the plugin. It's only a single class.
Since we now have a working first version of the plugin, I'm going to close this issue. All further discussion about it can happen at https://github.com/openmm/openmm-xtb.