criosx / molgroups

Composition space modeling for scattering data analysis
MIT License
2 stars 0 forks source link

Lipid species handling improvements #18

Closed hoogerheide closed 2 years ago

hoogerheide commented 3 years ago

Currently arbitrary lipid compositions are not supported; a maximum of 3 species + cholesterol is allowed. Also the user has to manually update the fitting script with all of the acyl / headgroup volumes. Both issues could be handled by allowing the bilayer to be initialized/modified in two ways:

  1. Create a library (dictionary) of headgroups and acyl chains that include molecular volumes and scattering lengths. Allow these as an input to fnInit or similar (perhaps as a list of acyl/headgroup pairs, e.g. [['DO', 'PC'], ['DO', 'PS'])
  2. One would need the option to create a "new" or "arbitrary" lipid species that could be passed in this way. Might require a new class.
  3. In principle an arbitrary number of lipid combinations could be passed in this way. It's a bit tricky then because the number fractions of the various species is then also a list of parameters. I'm not sure if FunctionalLayer can handle this, but it could probably be easily modified to do so. We might need this functionality anyway for the Hermite splines.
hoogerheide commented 3 years ago
  1. In principle an arbitrary number of lipid combinations could be passed in this way. It's a bit tricky then because the number fractions of the various species is then also a list of parameters. I'm not sure if FunctionalLayer can handle this, but it could probably be easily modified to do so. We might need this functionality anyway for the Hermite splines.

Point 3 is addressed by https://github.com/reflectometry/refl1d/pull/132

hoogerheide commented 3 years ago

The periodictable library's "fasta" module has almost all of the functionality that is required for this. https://github.com/pkienzle/periodictable/blob/41cc99ec88a0c664615358d1d3069f0b06da0fd4/periodictable/fasta.py

The "Molecule" object can be instantiated with a formula and a molecular volume ("cell_volume"). Permanent hydrogens are H in the fomulas, permanent deuteriums are D, and labile hydrogens are H[1]. The method Molecule.D2O_sld with the D2O_fraction argument yields the SLD, or Molecule has an Hsld and Dsld property that are precomputed.

This allows a lipid to be specified as a headgroup and a list of tails (allowing anywhere from 1 to N tails, e.g. sphingomyelin to cardiolipin). For the BLM options we can then pass in a list of lipid objects and a list of number fractions.

For cholesterol and cholesterol-like constituents, we can either have a separate list or simply treat them as lipids. In the latter case we could see if the constituent is a Molecule object or a lipid object and deal with them accordingly.

The init and fnAdjustParameters functions would have to be updated to handle lists of arbitrary length, but this should be straightforward.

hoogerheide commented 3 years ago

Here's an example of a Lipid class that associates the hg with combined tails. The formula for oleoyl isn't immediately obvious (in particular: why isn't it 18 carbons as would be naively expected?) so we'll have to think about how to document dealing with the methyl group (currently included in the tail formula and volume and subtracted later) and where the boundary with the headgroup should be (currently the ester carbons and oxygens are included in the headgroup), etc.

from periodictable.fasta import Molecule

pc = Molecule(name='pc', formula='C10 H18 O8 N P', cell_volume=331.00)
oleoyl = Molecule(name='oleoyl', formula='C17 H33', cell_volume=474.90/2.0)

class Lipid(object):
    def __init__(self, name=None, hg=pc, tails=[oleoyl, oleoyl]):
        # hg = Molecule object with headgroup information
        # tails = List of molecule objects containing lipid tail information
        # default is DOPC
        total_volume = sum([t.cell_volume for t in tails])
        if isinstance(tails, list):
            total_formula = ' '.join([str(t.formula) for t in tails])
            self.tails = Molecule(name='tails',formula=total_formula, cell_volume=total_volume)
        elif isinstance(tails, Molecule):
            self.tails = tails
        else:
            TypeError('Lipid.tails must be Molecule or list of Molecules')

        self.hg = hg

dopc = Lipid(hg=pc, tails=[oleoyl, oleoyl])
# calculate headgroup scattering lengths, for example
nSL, nSL2 = dopc.hg.Hsld * dopc.hg.cell_volume * 1e-6, dopc.hg.Dsld * dopc.hg.cell_volume * 1e-6