GMOD / Chado

the GMOD database schema
http://gmod.org/wiki/Chado
Artistic License 2.0
38 stars 25 forks source link

Add a table to store environmental data #107

Open guignonv opened 5 years ago

guignonv commented 5 years ago

For people doing plant characterization or breeding, it may be useful to store environmental data (climatic data and other). Those data are related to a specific geo-location and are taken at a given time. Therefore I would tend to use the nd_geolcation table for geo-locations and add a nd_fact table:

CREATE TABLE nd_fact(
    nd_fact_id BIGSERIAL PRIMARY KEY NOT NULL,
    nd_geolocation_id BIGINT NOT NULL REFERENCES nd_geolocation (nd_geolocation_id) ON DELETE CASCADE INITIALLY DEFERRED,
    type_id BIGINT NOT NULL REFERENCES cvterm (cvterm_id) ON DELETE CASCADE INITIALLY DEFERRED,
    timecaptured TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    value TEXT NULL,
    CONSTRAINT nd_fact_c1 UNIQUE (nd_geolocation_id, type_id, timecaptured)
  );
CREATE INDEX nd_fact_idx1 ON nd_fact (nd_geolocation_id);
CREATE INDEX nd_fact_idx2 ON nd_fact (timecaptured);
COMMENT ON TABLE nd_fact IS 'The fact table contains facts (temparture, weather condition,...) at a given time for a given geolocation.';
COMMENT ON COLUMN nd_fact.value IS 'The value can be NULL if the type_id is self-explicit. For instance, if the type_id term is "sunny day", there is no need for a value.';
guignonv commented 5 years ago

By the way, I don't need an "end time" on capture date but it may be useful for others. Should we add something like "timecapturedend TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,"?

scottcain commented 3 years ago

What is the status on this? I mentioned to @guignonv that it would be nice if the PR https://github.com/GMOD/Chado/pull/108 could be reworked to be in migrations rather than changing the sql file directly, but I can do that since it's been awhile since this has been touched. Does anybody have an opinion about the need for a "timecapturedend" field?

laceysanderson commented 3 years ago

I think it's a good idea to add the timecapturedend column as we often store min, max, median temperatures. These could be stored with the timecaptured being the start of the range, the timecapturedend being the end of the range, the type_id pointing to a temperature minimum cvterm and the value being the minimum temperature recorded.

PS. I approve this table addition 👍 and we would work it into our environmental data storage plans!

scottcain commented 3 years ago

I added a commit that adds the timecapturedend field, making it nullable and with a description: This optional value can be used to mark the end of the time that the catured fact data refers to, that is to provide a time span rather than a time point; can be null Since this is a medium on the edge of large change (that is, I don't think there is a similar "fact" table anywhere else in Chado, is there?), I'd like to see a little more feedback about this change before I make the PR. My plan then would be to close @guignonv 's PR and create a new one based on this commit. Thoughts?

bobular commented 3 years ago

We're moving away from Chado for non-technical reasons, but we often thought we wanted to store "interventions" (e.g. insecticide spray campaigns, bednet distribution etc) in Chado and this would have been great.