pacificclimate / modelmeta

An ORM representation of the model metadata database
GNU General Public License v3.0

datasets are assigned wrong time_set during indexing #122

Open corviday opened 10 months ago

corviday commented 10 months ago

find_timeset checks whether a time_set that matches the data being indexed is already in the database by comparing time resolution, calendar, and start and end dates.

However, two datasets may have the same time resolution, calendar, and start and end dates but still have different timestamps, and therefore different time_sets, if one of them is wrong.

Recently, a dataset with incorrect timestamps was indexed, and a time_set for annual 1971-2000 was created. Later a 1971-2000 annual dataset with correct timestamps was indexed, and assigned to the time_set with incorrect timestamps. Possibly find_timeset could check individual timestamps in addition to start and end date and resolution. Though "don't index bad data" is a good solution too.
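
A minimal sketch of what the deeper check might look like, assuming a time_set can be reduced to its resolution, calendar, start and end dates, and list of timestamps. `TimeSetRecord` and `find_matching_time_set` are hypothetical names for illustration only; they are not the actual modelmeta ORM classes or the existing find_timeset signature.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class TimeSetRecord:
    """Hypothetical stand-in for a stored time_set row (not the real ORM class)."""
    time_resolution: str
    calendar: str
    start_date: datetime
    end_date: datetime
    timestamps: Tuple[datetime, ...]


def find_matching_time_set(
    existing: Iterable[TimeSetRecord],
    time_resolution: str,
    calendar: str,
    start_date: datetime,
    end_date: datetime,
    timestamps: Sequence[datetime],
) -> Optional[TimeSetRecord]:
    """Return a stored time_set only if its metadata AND its timestamps match."""
    wanted = tuple(sorted(timestamps))
    for record in existing:
        # First the comparison the issue describes: resolution, calendar, dates.
        metadata_matches = (
            record.time_resolution == time_resolution
            and record.calendar == calendar
            and record.start_date == start_date
            and record.end_date == end_date
        )
        # Then the proposed extra step: compare every individual timestamp.
        if metadata_matches and tuple(sorted(record.timestamps)) == wanted:
            return record
    # No exact match; the caller would create a new time_set.
    return None
```

With a check like this, the second dataset in the scenario above would not be assigned to the existing time_set, because its timestamps differ even though the other fields agree.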

corviday commented 9 months ago

The recent occurrence of this issue was resolved by creating a new time_set and re-indexing the data, but it might still be worthwhile to have the indexing script warn you if your data contains nonstandard timestamps, or perhaps implement the deeper check described above.
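
A rough sketch of the warning idea, assuming the indexing script can compute the timestamps it expects for a given resolution and date range. What counts as "nonstandard" is not defined in this thread, so the `expected` argument and the function name `check_timestamps` are hypothetical placeholders rather than anything in the current script.

```python
import warnings
from datetime import datetime
from typing import List, Sequence


def check_timestamps(
    actual: Sequence[datetime], expected: Sequence[datetime]
) -> List[datetime]:
    """Warn about timestamps in the data that are not among the expected ones.

    Returns the unexpected timestamps, sorted, so the caller can decide
    whether to abort indexing or continue.
    """
    unexpected = sorted(set(actual) - set(expected))
    if unexpected:
        warnings.warn(
            f"{len(unexpected)} timestamp(s) differ from the expected values; "
            f"first unexpected timestamp: {unexpected[0].isoformat()}"
        )
    return unexpected
```

Returning the list rather than raising keeps this a warning, as suggested above; a stricter script could refuse to index whenever the list is non-empty.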