Closed mrship closed 11 years ago
Right, to clarify
Strictly speaking, a Gee Fu instance should have one Organism, but in this branch we are experimenting a bit, so that restriction is relaxed. As such, there is no existing Genome belongs_to Organism association; there should be, I guess.
An Organism can (should) have many Genomes. A Genome has many References (each a linear DNA string), and each Reference has one Sequence (the actual DNA text). Strictly speaking, Sequence needn't be a separate model, as the sequence is really an attribute of Reference, but I had a technical problem with the sheer size of the sequences I was trying to use and my RDBMS.
A Reference has many features.
Features belong_to a Reference (originally called an assembly but since renamed, which is why belongs_to assembly is still lurking). The Reference is the DNA sequence they exist on. Features also belong_to an Experiment; experiments in reality are, um, lab experiments that create the gene features. Experiments belong_to a Genome (and thus may span many References and their Sequences).
A Feature has many parents (which are other Features) using the feature_parents table. A Feature has many Predecessors (which are out-of-date Features).
And I think that covers the data model...
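To make the relationships above a bit easier to follow, here is a rough plain-Ruby sketch of them. It is not the actual Gee Fu code: the real models are ActiveRecord classes, whereas this uses Structs, and every field name and sample value (the organism name, "Chr1", "gene1" and so on) is invented purely for illustration.

```ruby
# Illustrative only: plain Structs standing in for the ActiveRecord models.
# Field names and sample data are assumptions, not taken from the Gee Fu schema.
Organism   = Struct.new(:name, :genomes)
Genome     = Struct.new(:organism, :references, :experiments)
Reference  = Struct.new(:genome, :name, :sequence, :features)
Sequence   = Struct.new(:reference, :text)
Experiment = Struct.new(:genome, :name, :features)
Feature    = Struct.new(:reference, :experiment, :name, :parents, :predecessors)

# An Organism has many Genomes.
organism = Organism.new("Arabidopsis thaliana", [])
genome   = Genome.new(organism, [], [])
organism.genomes << genome

# A Genome has many References; each Reference has one Sequence.
reference = Reference.new(genome, "Chr1", nil, [])
reference.sequence = Sequence.new(reference, "ACGT")
genome.references << reference

# An Experiment belongs to a Genome.
experiment = Experiment.new(genome, "gene models", [])
genome.experiments << experiment

# Features belong to both a Reference and an Experiment.
# Parents are other Features; predecessors are out-of-date Features.
old_gene = Feature.new(reference, experiment, "gene1 (v1)", [], [])
gene     = Feature.new(reference, experiment, "gene1", [], [old_gene])
mrna     = Feature.new(reference, experiment, "mRNA1", [gene], [])

[gene, mrna].each do |feature|
  reference.features  << feature
  experiment.features << feature
end
```

With this in place, `mrna.experiment.genome.organism` walks from a Feature back up to its Organism, which is the chain an audit trail would need to follow.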
@danmaclean This PR adds PaperTrail to Organisms, Genomes, Experiments and Features as a start. From what I can tell this will be what you need from an audit perspective.
However, I'm not 100% au fait with the data model at the moment. The diagram in manual.pdf doesn't appear to be accurate (for example an Experiment appears to belong to a Genome directly, not via a Reference). If you can outline how the data model works I can review further to see whether we need to extend PaperTrail to Reference, Sequence, Predecessor, Parent.