openworm / owmeta

Unified, simple data access python library for data & facts about C. elegans anatomy
MIT License

ALFRED: an evolutionary approach to modeling embryo development #8

Closed StevePMcGrew closed 7 years ago

StevePMcGrew commented 10 years ago

A possible approach to modeling development of the C. elegans embryo

Background: I propose to model embryological development in C. elegans by beginning with a simpler problem - plant development - to create first an algorithmic structure homologous to the genetic program that drives plant development, and then generalize the algorithmic structure to enable it to cover the more complicated development of multicellular animals.

Around 1990, it occurred to me that the operons found in bacterial DNA constitute a universal computer language. This led to my collaboration and friendship with Dick Gordon. For nearly a decade I had developed and worked with a "genetic algorithm" that serves as a general problem solver, finding optimal solutions to complex problems in engineering. The genetic algorithm, "Generator", has been used for optimizing stock portfolios, designing lens systems, cracking encryption, and a host of other optimization tasks. It seemed to me that a computer language designed specifically to reflect the algorithmic structure of operons could be used to evolve algorithms to optimally model complex systems. Only a toy version of "Operon Language" was ever developed.

The current proposal is to use a somewhat higher-level approach, to model embryological development. A handful of elemental instructions, specifically designed to reflect the parallel development of cell lineages in an embryo as well as the physical and chemical properties of cells and their environments, will provide the basis for evolving an algorithm that models the actual development of the embryo as closely as possible using known experimental data.

An important advantage to having an algorithmic model is that the model can be used to identify potentially fertile areas of experimental research. The very nature of the algorithm and the way it emerges will make it able to adapt to accommodate new experimental data or offer alternative explanations for an observed spatiotemporal pattern in development.

ALFRED: proposed Algorithmic Language For Realistic Embryological Development

The proposed language, ALFRED, will consist of arbitrary sets of the following instructions:

· Replicate
· Die
· Update State
· Sense
· Signal

The instructions will operate on the following objects:

· Cells, which have attributes of:
  o pedigree
  o state buffers representing:
    § xyz location
    § orientation
    § rigidity
    § size
    § shape
    § motility
    § etc.
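As a rough illustration, the cell object could be sketched as a pair of Python dataclasses. Everything beyond the attribute names listed above is an assumption made for the example: the class names `Cell` and `CellState`, the scalar encoding of shape, the three-number orientation, and the convention of recording pedigree as a string that grows by one character per division.

```python
from dataclasses import dataclass, field, replace
from typing import Tuple


@dataclass
class CellState:
    """State buffers for one cell; all start at zero (encodings are assumptions)."""
    position: Tuple[float, float, float] = (0.0, 0.0, 0.0)     # xyz location
    orientation: Tuple[float, float, float] = (0.0, 0.0, 0.0)  # assumed 3-number encoding
    rigidity: float = 0.0
    size: float = 0.0
    shape: float = 0.0       # placeholder: a real model would need a richer descriptor
    motility: float = 0.0


@dataclass
class Cell:
    """A cell with a pedigree and its own set of state buffers."""
    pedigree: str = ""       # hypothetical convention: "" is the precursor
    state: CellState = field(default_factory=CellState)


precursor = Cell()
# dataclasses.replace with no changes returns an independent copy of the buffers.
daughter = Cell(pedigree=precursor.pedigree + "0", state=replace(precursor.state))
```

Copying the state object rather than sharing it keeps each cell's buffers independent, which matters once daughters start to diverge.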

Mechanical and chemical interactions between cells will be mediated by the Sense function.

There should also be an Overview module that, using cell states, provides information that can only be computed from a global perspective such as overall shape, distribution of stresses and strains in the embryo, and external influences. The Overview module will post information that is accessible to individual cells via the Sense function.
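One way to picture the Overview module is as a function over all cell states whose results are posted to a shared buffer that each cell's Sense function can read. The sketch below is only an illustration. The names `overview`, `sense`, and `board` are hypothetical, and the centroid and bounding extent are simple stand-ins for the stress and strain calculations mentioned above.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Cell:
    position: Vec3
    size: float = 1.0


def overview(cells: List[Cell]) -> Dict[str, object]:
    """Compute global quantities that no single cell could derive alone."""
    n = len(cells)
    centroid = tuple(sum(c.position[i] for c in cells) / n for i in range(3))
    extent = tuple(
        max(c.position[i] for c in cells) - min(c.position[i] for c in cells)
        for i in range(3)
    )
    return {"cell_count": n, "centroid": centroid, "extent": extent}


def sense(cell: Cell, board: Dict[str, object]) -> Dict[str, object]:
    """Hypothetical Sense: read the posted values and derive a local view."""
    cx, cy, cz = board["centroid"]
    x, y, z = cell.position
    return {"cell_count": board["cell_count"], "offset": (x - cx, y - cy, z - cz)}


cells = [Cell((0.0, 0.0, 0.0)), Cell((2.0, 0.0, 0.0)), Cell((1.0, 3.0, 0.0))]
board = overview(cells)          # posted once per computational cycle
local_view = sense(cells[0], board)
```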

Adjunct to the Overview module will be a Presentation module through which the user can use visualization tools to watch the growth of a virtual embryo.

A computational cycle amounts to:

  1. using the Sense function to gather relevant internal and external information for each cell,
  2. updating cell states,
  3. Dividing or Dying.

Division can be symmetrical or asymmetrical. In the symmetrical case, Division amounts to creating a new cell that is identical in all respects to the parent cell, including the contents of each state buffer. In the asymmetrical case, the contents of each state buffer are copied with changes dictated by a Boolean function of internal state values.

Dying removes a cell from the population, and is dictated by a Boolean function of internal and external state values.
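The computational cycle, together with the Division and Dying rules just described, might be sketched as follows. This is a minimal sketch under stated assumptions, not a definition of ALFRED: the `Program` container, its field names, the dictionary used for state buffers, and the "0"/"1" pedigree suffixes are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

State = Dict[str, float]


@dataclass
class Cell:
    pedigree: str = ""
    state: State = field(default_factory=dict)


@dataclass
class Program:
    """Hypothetical bundle of the rules one computational cycle needs."""
    sense: Callable[[Cell, List[Cell]], State]
    update: Callable[[State, State], State]
    should_die: Callable[[State, State], bool]   # Boolean of internal and external values
    is_asymmetric: Callable[[State], bool]       # Boolean of internal values
    alter: Callable[[State], State]              # change applied to an asymmetric copy


def cell_cycle(cells: List[Cell], prog: Program) -> List[Cell]:
    # 1. Sense: gather information for every cell before any state changes.
    sensed = [prog.sense(c, cells) for c in cells]
    next_cells: List[Cell] = []
    for cell, ext in zip(cells, sensed):
        # 2. Update the cell's state buffers.
        state = prog.update(cell.state, ext)
        # 3. Die (the cell is dropped from the population) or divide.
        if prog.should_die(state, ext):
            continue
        copy = dict(state)                       # symmetrical: identical buffers
        if prog.is_asymmetric(state):
            copy = prog.alter(copy)              # asymmetrical: copied with changes
        next_cells.append(Cell(cell.pedigree + "0", state))
        next_cells.append(Cell(cell.pedigree + "1", copy))
    return next_cells


# Toy rules, chosen only to exercise each branch.
demo = Program(
    sense=lambda cell, cells: {"neighbours": float(len(cells) - 1)},
    update=lambda s, ext: {**s, "age": s.get("age", 0.0) + 1.0},
    should_die=lambda s, ext: ext["neighbours"] >= 3.0,
    is_asymmetric=lambda s: s["age"] >= 2.0,
    alter=lambda s: {**s, "size": s.get("size", 1.0) / 2.0},
)

embryo = [Cell()]
for _ in range(2):
    embryo = cell_cycle(embryo, demo)
```

Sensing for all cells before updating any of them keeps the cycle independent of the order in which cells are visited.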

Although ALFRED can be used by a human programmer like any other programming language to write code line by line, I propose that a genetic algorithm be employed to evolve code that results in a model of the embryo.

The genetic algorithm will work as follows:

  1. SETUP
     1a. Create a population of Np quasi-randomly constructed instruction sets and Sense functions, the combination being called an "individual".
     1b. Associate one precursor cell with each individual. The state values in the precursor cell all start at "zero".
  2. CELL CYCLE
     2a. Run the individuals in the population through one computational cell cycle for each precursor cell. Unless the cell Dies, it will be replicated to produce a descendant cell. The collection of cells descended from any one precursor cell is called an "embryo".
  3. DEVELOPMENT
     3a. Repeat CELL CYCLE for a user-assigned number Nc of cycles.
  4. SELECTION & VARIATION
     4a. After Nc computational cycles, compare the states of the cells in each embryo to observed cell states at the same stage. The degree of matching is called "fitness". The user can specify a function to calculate fitness.
     4b. Rank the individuals in the population according to their "fitness".
     4c. Create a new population by:
         4ci. keeping the Nh highest-fitness individuals;
         4cii. replacing the Nl lowest-fitness individuals with randomly created individuals;
         4ciii. generating enough new individuals to maintain the population at Np, by:
                4ciii1. selecting a "mother" individual from the current population, with probability proportional to the individual's fitness rank;
                4ciii2. selecting a "father" individual from the current population, also with probability proportional to the individual's fitness rank;
                4ciii3. creating a hybrid "daughter" individual from the "mother" and "father" by a user-specified process, which will usually retain the features common to the mother and father and quasi-randomly copy the features that are not common to both. It helps a lot to choose a process that gives the resulting daughter a high likelihood of reasonably high fitness;
                4ciii4. repeating steps 4ciii1 through 4ciii3 until the population reaches Np.
  5. EVOLUTION
     5a. Repeat CELL CYCLE, DEVELOPMENT, and SELECTION & VARIATION until a user-specified termination criterion is met. The criterion might be a number of evolution cycles, an amount of computational time, a rate of change of fitness value of the highest-fitness individual in the population over the past number Nx of evolution cycles, or attainment of a target level of fitness in any individual in the population.
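The selection and variation loop can be sketched in a few lines of Python. This is a toy under loud assumptions, not the "Generator" algorithm: an individual is reduced to a list of bits instead of an instruction set plus Sense function, the development step (Nc cell cycles) is folded into the user-supplied `fitness` function, rank weights run linearly from Np for the best individual down to 1 for the worst, and only two of the termination criteria are implemented (a cycle limit and a target fitness). All function names are hypothetical.

```python
import random
from typing import Callable, List, Optional

Individual = List[int]   # stand-in for an instruction set plus Sense function
Fitness = Callable[[Individual], float]


def random_individual(rng: random.Random, length: int) -> Individual:
    return [rng.randint(0, 1) for _ in range(length)]


def crossover(mother: Individual, father: Individual, rng: random.Random) -> Individual:
    # Keep features common to both parents; otherwise copy from either at random.
    return [m if m == f else rng.choice((m, f)) for m, f in zip(mother, father)]


def next_generation(pop: List[Individual], fitness: Fitness,
                    n_h: int, n_l: int, rng: random.Random) -> List[Individual]:
    n_p = len(pop)
    if n_h + n_l > n_p:
        raise ValueError("Nh + Nl must not exceed Np")
    ranked = sorted(pop, key=fitness, reverse=True)        # best first
    weights = [n_p - i for i in range(n_p)]                # assumed linear rank weights
    new_pop = [list(ind) for ind in ranked[:n_h]]          # keep the Nh best
    new_pop += [random_individual(rng, len(pop[0])) for _ in range(n_l)]  # replace the Nl worst
    while len(new_pop) < n_p:                              # fill up with daughters
        mother, father = rng.choices(ranked, weights=weights, k=2)
        new_pop.append(crossover(mother, father, rng))
    return new_pop


def evolve(fitness: Fitness, n_p: int = 30, n_h: int = 3, n_l: int = 3,
           length: int = 16, max_generations: int = 200,
           target: Optional[float] = None, seed: int = 0) -> Individual:
    rng = random.Random(seed)
    pop = [random_individual(rng, length) for _ in range(n_p)]
    for _ in range(max_generations):
        if target is not None and fitness(max(pop, key=fitness)) >= target:
            break                                          # target fitness attained
        pop = next_generation(pop, fitness, n_h, n_l, rng)
    return max(pop, key=fitness)


# Toy "observed data": fitness counts the positions that match it.
observed = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1]
best = evolve(lambda ind: sum(a == b for a, b in zip(ind, observed)),
              target=len(observed))
```

Because the Nh best individuals are carried over unchanged, the best fitness in the population can never decrease from one evolution cycle to the next. Note that this sketch allows the same individual to be drawn as both mother and father; the proposal does not say whether that should be excluded.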

The user will specify Nh, Nl, Np, and other control parameters via a Dashboard. The program should save the Nh top-fitness individuals for the user to study.

Discussion: There is, of course, a lot to discuss and a lot of experimenting to do if ALFRED is to be implemented. One part of ALFRED will need concentrated attention: producing daughter individuals from parent individuals. The Overview, Presentation, and Dashboard modules can probably be worked on after the other parts are done.

I propose that ALFRED be used initially to model plant development. It should be relatively easy to evolve algorithms to construct satisfactory models of, for example, ferns, oak trees, pine trees, palm trees, ivy, roses, or carrots. Experience with these simpler examples should provide a basis for modifying ALFRED's structure to tackle C. elegans.

If ALFRED is pursued, the database tools currently under development in DevoWorm will be valuable in several ways:

· informing decisions that will need to be made in constructing ALFRED
· providing a resource ALFRED will use for calculating fitness while evolving models.

DickGordonCan commented 10 years ago

Dear Steve, I think this is an excellent start. What is missing is the geometry, topology, continuum and finite element mechanics, which are subsumed in the “overview” function. The differentiation tree represents a bifurcating alternation of genetic/gene expression and mechanical events. While these could be part of “sense” and “signal”, how do we specify the direction of sensing and signalling? To put that another way, we need explicit features for initiation, propagation and transmission of differentiation waves (which may just be single cell waves in C. elegans), and perhaps some gene expression modelling such as found in:

Gerdtzen, Z.P., J.C. Salgado, A. Osses, J.A. Asenjo, I. Rapaport & B.A. Andrews (2009). Modeling heterocyst pattern formation in cyanobacteria. BMC Bioinformatics 10(Suppl. 6), S16.

I think the DevoWorm project is a candidate for an NIH grant:

Biophysical and Biomechanical Aspects of Embryonic Development (R01) http://grants1.nih.gov/grants/guide/pa-files/PAR-13-207.html

StevePMcGrew commented 10 years ago

Dick, You're right that the Overview module would subsume geometry, topology, etc. For example, the data from which a finite element mechanics model would be built would reside in the state buffers of the cells and in the Overview buffers representing any external influences, and would be interpreted by components of Overview. Anything of a global nature (stresses, chemical gradients, etc.) that an individual cell would need in order to complete a Cell Cycle correctly would be calculated by Overview and then placed in Overview buffers that would be read by the Sense functions of individual cells.

My guess is that we won't need much by way of Overview functions until at least four or five cell cycles have passed.

Re needing explicit features for initiation, propagation and transmission of differentiation waves: There are a lot of mechanisms that can plausibly be responsible for differentiation waves. ALFRED as currently conceived would probably be able to model differentiation waves via Sense, Signal and Update, but would not be able to model sequences of changes that occur within a single cell cycle. However, it should be possible to expand ALFRED fairly easily to allow for any number of sense/signal/update cycles within a cell cycle and thereby accomplish what you need - and to incorporate gene expression models if that proves useful.
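To make the idea of several sense/signal/update passes per cell cycle concrete, here is a toy sketch. It is an illustration only and assumes a great deal: cells sit in a single row, each cell's state is one Boolean ("differentiated" or not), and a signal reaches only immediate neighbours. The function names are hypothetical.

```python
from typing import List


def sub_cycle(differentiated: List[bool]) -> List[bool]:
    """One Signal/Sense/Update pass over a row of cells."""
    n = len(differentiated)
    # Signal: every differentiated cell emits to its immediate neighbours.
    signals = list(differentiated)
    # Sense: each cell checks whether either neighbour is signalling.
    sensed = [
        (i > 0 and signals[i - 1]) or (i < n - 1 and signals[i + 1])
        for i in range(n)
    ]
    # Update: a cell that senses a signal differentiates, and stays that way.
    return [d or s for d, s in zip(differentiated, sensed)]


def cell_cycle(differentiated: List[bool], sub_cycles: int) -> List[bool]:
    """Run the requested number of sub-cycles within one cell cycle."""
    for _ in range(sub_cycles):
        differentiated = sub_cycle(differentiated)
    return differentiated


row = [True] + [False] * 5
one_pass = cell_cycle(row, sub_cycles=1)
three_passes = cell_cycle(row, sub_cycles=3)
```

With one pass per cell cycle the wave front advances by a single cell; with three passes it advances by three cells within the same cell cycle, which is the kind of within-cycle sequence the single-pass design cannot express.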

DickGordonCan commented 10 years ago

Any comment on getting involved in this group?:

http://co.mbine.org/events/COMBINE_2014

StevePMcGrew commented 10 years ago

We should look into it.

mwatts15 commented 7 years ago

Closing this as there's not much overlap of this proposal with the purpose of PyOpenWorm.