GMOD / Chado

the GMOD database schema
http://gmod.org/wiki/Chado
Artistic License 2.0
36 stars 25 forks source link

project_phenotype table? #62

Open ekcannon opened 6 years ago

ekcannon commented 6 years ago

I would like to attach a set of phenotype records to the study (project) that generated them, but don't want to use the nd_experiment table:

project --- nd_experiment_project --- nd_experiment --- nd_experiment_phenotype --- phenotype

This is because it is challenging to maintain data integrity due to both the length of the connecting chain, and the lack of constraints on the nd_experiment table.

CREATE TABLE project_phenotype ( project_phenotype_id BITINT SERIAL NOT NULL, PRIMARY KEY (project_phenotype_id), project_id BIGINT NOT NULL, FOREIGN KEY (project_id) REFERENCES project (project_id) ON DELETE CASCADE INITIALLY DEFERRED, phenotype_id BIGINT NOT NULL FOREIGN KEY (phenotype_id) REFERENCES phenotype (phenotype_id) ON DELETE CASCADE INITIALLY DEFERRED, CONSTRAINT project_phenotype_c1 UNIQUE (project_id, phenotype_id) ); CREATE INDEX project_phenotype_idx1 ON project_phenotype (project_id); CREATE INDEX project_phenotype_idx2 ON project_phenotype (project_id);

scottcain commented 5 years ago

This seems fine with me, except that I think project_phenotype_idx2 is probably supposed to be on phenotype_id.

laceysanderson commented 5 years ago

Alternatively, we could add a project_id to the phenotype table which is nullable. This would greatly improve queries such as "all phenotypes from a given project" or "all traits measured in a given project", be backwards compatible and still very chado-esque (in my opinion).

The only downside I can think of is that it limits us to a single project per phenotype. However, we can always use dbxref as an example with both a phenotype.project_id and a phenotype_project table.

Full Disclosure: I'm invested in a phenotype.project_id since I already made such a modification for my analyzed phenotypes Tripal module due to serious performance issues observed with the phenotype_project table approach.

ekcannon commented 5 years ago

I'm okay with adding a project_id field to the phenotype table. As Lacey suggests, perhaps the project_phenotype table could be added too, in the unlikely event that a phenotype was generated by more than one project.

bradfordcondon commented 5 years ago

as someone trying to cruise through issues: is there a resolution/consensus on this?

laceysanderson commented 5 years ago

Summary:

@scottcain do you support adding a phenotype.project_id or only the approach of adding a phenotype_project linker table?