Soft LJ potentials in shoebox simulations

ezingrebe commented 3 months ago

According to the README.md, one can define and set up the various "soft LJ potentials" in a file called LJsoft_RSQ.table using the script setup_ljsoft_potentials.py. User should be able to modify the following parameters inside the script to get the desired potentials:

lam=0.2 (scaling factor which relates LJ epsilon with epsilon_0)
n=1 (exponential factor)
alpha=0.5 (LJ scaling factor)

The following issues need to be clarified: -the mathematical formula used to define LJ potentials inside the script, with alpha, lam and n factors and their reference to the corresponding (distance) variables. The classical LJ formula, which is V(r)=4epsilon[(A/r)^12-(B/r)^6] doesn't seem to apply here

-units used to define variables and scaling factors (SI units; and if not provided, then the conversion procedure between chromatin monomer/python time step etc. to classical SI units would be necessary). Additionally, the units of energy and force of LJ soft potential (in the file "LJsoft_RSQ.table") need to be defined

lhilbert commented 1 month ago

I will add information here in several comments as I proceed with my search.

The first point, what is the motivation of soft LJ potential vs. canonical LJ potential? The soft LJ potential is more appropriate as a formulation for chromatin, as it does not have the typical extreme repulsion from a hard atomic core. This also takes care of the practical problem of divergence of repulsion at very close distance, which leads to extreme displacements during the discretized simulation steps. One important part is truncation when the distance falls below a cut-off, and other parts might be also modifications to exponents etc.

lhilbert commented 1 month ago

In the notes provided by the original programmer, the precise potential that is used is names as the Weeks-Chandler-Andersen potential. This fits with the general requirements stated in my above comment.

lhilbert commented 1 month ago

We can also go by the preprint from the original programmer, linked under https://www.biorxiv.org/content/10.1101/2022.02.16.480760v1.full

Here, the relevant references are 37 and 38, which served as templates for the construction of our model, as far as I know.

A Buckle, CA Brackley, S Boyle, D Marenduzzo, N Gilbert, Polymer simulations of heteromorphic chromatin predict the 3D folding of complex genomic loci. Mol. Cell 72, 786–797 (2018)
CA Brackley, S Taylor, A Papantonis, PR Cook, D Marenduzzo, Nonspecific bridging-induced attraction drives clustering of DNA-binding proteins and genome organization. Proc. Natl. Acad. Sci. USA 110, E3605–E3611 (2013).

lhilbert commented 1 month ago

In our case, every chromatin monomer is set to represent 10 kb sequence length. As far as I understand, the translation into physical length is done by setting the sigma of the LJ potential to 60 nm. I think this means that the particle has an effective size of 60 nm?! Then this should give one of the physical parameters, namely the length scale.

Actually, exactly this is what reference 37 states: "The diameter of the chromatin beads is a natural length scale with which to parametrize the system; we denote this sigma, and use this to define all other length scales."

lhilbert commented 1 month ago

The reference 37 also gives a very detailed account of how the model was constructed there. It includes both the WCA potential and also the introduction of FENE bonds. Maybe you can go here and cross-check this against the formulation of the potentials in the code. This is in page e7 of the PDF linked at this URL:

https://www.cell.com/molecular-cell/pdfExtended/S1097-2765(18)30787-1

This reference also contains concrete values for the model parameters, I recommend going here to check for the actual values.

lhilbert commented 1 month ago

Next, looking into setup_ljsoft_potentials.py seems to make completely clear what potentials are used.

The file creates tabulation of the potentials and forces for all potential interaction types, on lines 179-338. All these tabulation commands are implemented using exclusively the function _writesection. This function uses the force and energy definitions stored in the functions _computeforce and _computeenergy. These, in turn, can be directly inspected, giving us the potentials that are actually used.

What I find is the following for the force and the energy

Force:

if r<rc:
        x = a*(1-l)**2 + (r/s)**6
        F = 24*(l**n)*eps*(r**5)/(s**6)*(2/(x**3) - 1/(x**2))
    else:
        F = 0

Energy:

xc = a*(1-l)**2 + (rc/s)**6
    Ec = (l**n)*4*eps*(1/(xc**2) - 1/xc)
    if r<rc:
        x = a*(1-l)**2 + (r/s)**6
        E = (l**n)*4*eps*(1/(x**2) - 1/x)
        E = E - Ec
    else:
        E = 0

lhilbert commented 1 month ago

From some comparison of the expressions above and the expression in reference 37, equation (3), my take-away is that the potential used in the code is a modified version of the following general expression:

E=4k_BT[1/x2-1/x], with x=(r_ij/\sigma)6 and \sigma=d_ij=(d_i+d_j)/2

The adjustments that seem to be in place are the following:

softening, by adding an offset to x in the denominators
cutoff for r>r_c, so that F(r)=0 for r>r_c
offset of E by E_c=E(r_c), so that E(r=r_c)=0
Some scaling back and forth to make things fit together, especially on this point we can probably tidy up a bit in the code, and should write down proper mathematical expressions for good practice

lhilbert / EnhancerShoeBox

Soft LJ potentials in shoebox simulations #1