openworm / OpenWorm

Repository for the main Dockerfile with the OpenWorm software stack and project-wide issues
http://openworm.org
MIT License

Add cell lineage info to PyOpenWorm #179

Closed slarson closed 10 years ago

slarson commented 10 years ago

Cell lineage refers to the tree of cell divisions that occurs as an organism grows from an embryo. We'd like to have a data structure that captures this for C. elegans. The data are out there; now we'd like to put them into PyOpenWorm.

This is the first step in a greater project to be able to render lineage trees as "differentiation trees" for exploration and potentially simulation down the road.

slarson commented 10 years ago

First proposed data source: the WormBase cell ontology. The 'daughter of' relationship in this ontology is the main relationship we want to pull into our database. The ontology should be loadable into Python's RDFLib for parsing.
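As a rough illustration of what pulling in the 'daughter of' relationship could look like, here is a minimal sketch. It uses plain Python tuples as stand-ins for RDF triples so that it runs without any dependencies; the predicate string `daughter_of`, the function names, and the hard-coded cell list are assumptions made for illustration, not identifiers taken from the WormBase ontology or from PyOpenWorm. The cell names are the first few divisions of the embryonic lineage.

```python
from collections import defaultdict

# Hypothetical (daughter, predicate, parent) triples. In practice these would
# come from parsing the ontology file; the predicate name is an assumption.
TRIPLES = [
    ("AB", "daughter_of", "P0"),
    ("P1", "daughter_of", "P0"),
    ("ABa", "daughter_of", "AB"),
    ("ABp", "daughter_of", "AB"),
    ("EMS", "daughter_of", "P1"),
    ("P2", "daughter_of", "P1"),
]


def build_lineage(triples):
    """Index daughter-of triples in both directions."""
    parent_of = {}
    daughters_of = defaultdict(list)
    for child, predicate, parent in triples:
        if predicate != "daughter_of":
            continue  # ignore any other relationship types
        parent_of[child] = parent
        daughters_of[parent].append(child)
    return parent_of, dict(daughters_of)


def ancestry(cell, parent_of):
    """Return the path from a cell back up to the root of the lineage."""
    path = [cell]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path


parent_of, daughters_of = build_lineage(TRIPLES)
print(ancestry("ABa", parent_of))
print(daughters_of["P1"])
```

Keeping both a child-to-parent and a parent-to-children index makes it cheap to walk the tree in either direction, which is what rendering a lineage tree would need.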

DickGordonCan commented 10 years ago

I keep an enormous personal bibliography in EndNote (>260,000 references), and could produce a subset EndNote bibliography related to our project that would be kept in a Dropbox folder. This would automatically include PDFs. Of course, to read it, each of you would have to own a copy of EndNote. There are ways to put such a collection online, but public access creates copyright problems. At any rate, here are the related references so far, including some you guys sent:

Chavoya, A. (2009). Artificial development. In: Foundations of Computational Intelligence Volume 1: Learning Approximation. Ed.: A.E. Hassanien, A. Abraham, A.V. Vasilakos & W. Pedrycz. 201: 185-215.
de Bono, B., P. Grenon & S.J. Sammut (2012). ApiNATOMY: A novel toolkit for visualizing multiscale anatomy schematics with phenotype-related information. Hum. Mutat. 33(5), 837-848.
Geard, N. & J. Wiles (2005). A gene network model for developing cell lineages. Artificial Life 11(3), 249-267.
Gordon, R. (1999). The Hierarchical Genome and Differentiation Waves: Novel Unification of Development, Genetics and Evolution. Singapore & London, World Scientific & Imperial College Press.
Hamahashi, S. & H. Kitano (1999). Parameter optimization in hierarchical structures. Lecture Notes in Artificial Intelligence 1674, 467-471.
Harel, D. (2003). A grand challenge for computing: Towards full reactive modeling of a multi-cellular animal. Bulletin of the European Association for Theoretical Computer Science (Bull. EATCS) 81, 226-235.
Harel, D. (2004). A grand challenge for computing: Towards full reactive modeling of a multi-cellular animal. Lecture Notes in Computer Science 2937, 323-324.
Hill, D.P., T.Z. Berardini, D.G. Howe & K.M. Van Auken (2010). Representing ontogeny through ontology: a developmental biologist's guide to the Gene Ontology. Molecular Reproduction and Development 77(4), 314-329.
Kam, N., H. Kugler, R. Marelly, L. Appleby, J. Fisher, A. Pnueli, D. Harel, M.J. Stern & E.J.A. Hubbard (2008). A scenario-based approach to modeling development: A prototype model of C. elegans vulval fate specification. Developmental Biology 323(1), 1-5.
Kitano, H. (2000). Perspectives on systems biology. New Gener. Comput. 18(3), 199-216.
Kitano, H., S. Hamahashi & S. Luke (1998). The Perfect C. elegans project: an initial report. Artificial Life 4(2), 141-156.
Lee, R.Y. & P.W. Sternberg (2003). Building a cell and anatomy ontology of Caenorhabditis elegans. Comp Funct Genomics 4(1), 121-126.
Rose, L.S. & K.J. Kemphues (1998). Early patterning of the C. elegans embryo. Annu Rev Genet 32, 521-545.
Van Auken, K., P. Fey, T.Z. Berardini, R. Dodson, L. Cooper, D.H. Li, J. Chan, Y.L. Li, S. Basu, H.M. Muller, R. Chisholm, E. Huala, P.W. Sternberg & WormBase Consortium (2012). Text mining in the biocuration workflow: applications for literature curation at WormBase, dictyBase and TAIR. Database-the Journal of Biological Databases and Curation 2012, doi:10.1093/database/bas1040.

I have PDFs of all of them. Suggestions? Thanks.

Yours,
-Dick Gordon
DickGordonCan@gmail.com

balicea commented 10 years ago

Hello all,

I have finally gotten the chance to start working on this project, now that I have a few other projects under control. I am pleased to report a bit of progress. Listed below are the steps I plan to follow as the project progresses.

I am not sure how long each step will take, since I have several other projects going as well. I'll try and keep up. Step 0 (the project white paper, version 1) is now available at the link below. Dick requested this as a way to communicate with prospective interested parties, as well as a "catchy" title (DevoWorm).

I have also been sorting through the available data (Step 1), which I will likely be meeting with Stephen about in the near future. I will keep you posted on the latest developments. If you have questions, please let me know.

Step 0: White Paper for project DevoWorm "From Differentiation Waves to “Wriggling”: new views on C. elegans development"

View on Google Docs: https://drive.google.com/file/d/0B7RsqJbITXXAZHpWQU1hbHhBRFk/edit?usp=sharing

Step 1: Extraction of Data from WormBase.

Step 2: Import data into RDFLib and create data structure.

Step 3: Map data structure to graph (NetworkX).
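To make Step 3 a little more concrete, here is a minimal sketch of a lineage held as a directed graph and traversed generation by generation. It uses a plain dictionary of parent-to-daughter edges rather than NetworkX so that it runs with the standard library alone; the function names and the hard-coded edges are illustrative assumptions, and a NetworkX directed graph would fill the same role once the data are loaded.

```python
from collections import deque

# Parent -> daughters adjacency for the first few embryonic divisions.
# Hard-coded here for illustration; real data would come from Steps 1 and 2.
LINEAGE = {
    "P0": ["AB", "P1"],
    "AB": ["ABa", "ABp"],
    "P1": ["EMS", "P2"],
}


def generations(graph, root):
    """Breadth-first walk giving each cell's number of divisions from the root."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        cell = queue.popleft()
        for daughter in graph.get(cell, []):
            depth[daughter] = depth[cell] + 1
            queue.append(daughter)
    return depth


def terminal_cells(graph, root):
    """Cells reachable from the root that have no recorded daughters."""
    return sorted(c for c in generations(graph, root) if not graph.get(c))


print(generations(LINEAGE, "P0"))
print(terminal_cells(LINEAGE, "P0"))
```

The depth map is the kind of per-cell attribute that could later drive a lineage-tree rendering, with terminal cells as the leaves.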

MichaelCurrie commented 10 years ago

Does this mean we hope to have cell division be a part of the first running OpenWorm simulation? I would have expected this would be farther down the line due to the complexity.

In any case, I enjoyed reading the White Paper discussing the approach, and the data model.

balicea commented 10 years ago

Michael,

Thanks for your kind words and interest. Right now, we are just in the planning stages. The idea is to use semantic data that gives us clues to how cells divide in development (an aside from the main OpenWorm application). The Sulston reference in the white paper laid out what is expected for such a system (since developmental cell division is deterministic in C. elegans, this is hard but not insurmountable). The advantage of what we want to do is to show how cell division occurs due to theoretical expectation. I'm not sure how that will pan out in the context of the OpenWorm simulation, but we want to at least have a crude model of cell division to model developmental hypotheses with.

Best,

Bradly Alicea

slarson commented 10 years ago

@balicea Great start! Just getting to have a look at this now.

@MichaelCurrie This effort is being driven by some folks who are motivated to work on the development side under the auspices of OpenWorm. It is definitely at the planning stage and for now is disconnected from the mainline simulation. However, several of the same organizational aspects and data sources can be leveraged. It is an exciting new beginning!

slarson commented 10 years ago

Also I wanted to link to the notes from the last discussion:

https://docs.google.com/document/d/17ikM7lq8BmhO_2wkR6mj3ziTSkDXUiAvnoR9mGE_FqU/edit

slarson commented 10 years ago

@mwatts15 you should have a look at this as well

slarson commented 10 years ago

@balicea the paper is a great start! A couple things:

  1. You've forked the main OpenWorm repo, but you'll be better off forking https://github.com/openworm/PyOpenWorm as it already is using RDF and NetworkX
  2. @mwatts15 is a Google Summer of Code student who is helping us build a unified data model for OpenWorm. Check out his proposal for this here: https://groups.google.com/d/msg/openworm-discuss/VBT0CivmAAs/gIDJb-83jgEJ I'm thinking that we should be working in concert on this because they are very closely related and the data types you want to work with fit well with this.
  3. I'm thinking that we should add to the white paper the strategy for how to add these data into PyOpenWorm at the level of what new classes and methods will need to be created, and more details on what data transformations we need to make. Having @mwatts15 along for this would be great.

MichaelCurrie commented 10 years ago

Great! Imagine the simulation as starting on your browser as a picture of a solitary worm egg, and then it goes pop, pop, pop, as the cells divide, and then you are left with the full L1-stage worm on your screen ready to go.

balicea commented 10 years ago

At a superficial level, that's essentially what we are trying to capture. However, we are also trying to incorporate the theoretical idea of differentiation waves. So it's going to be a bit more complicated than capturing the process of developmental cell division, and it will also give global information (how regions of cells get organized in development).

slarson commented 10 years ago

Moved to: https://github.com/openworm/PyOpenWorm/issues/7