Open evz opened 10 years ago
@evz That's a good point. I separated DATE and TIMESTAMP in order to handle diverse datasets but we can revisit the integration scheme.
On Thu, Jan 9, 2014 at 10:36 AM, Eric van Zanten notifications@github.comwrote:
@svetlozarn https://github.com/svetlozarn I'm curious why you're storing a NULL in the obs_ts field for the crime data (herehttps://github.com/datamade/wopr-data/blob/master/init/init_chicago-crimes-all.sql#L245). I realize you're storing that Orig_Date info into the obs_date field but since the schema declares that as a DATE column, it's not saving the time portion of that info. Is it safe to just save the Orig_Date into the obs_ts field as well? Or were you thinking of using that in a different way?
— Reply to this email directly or view it on GitHubhttps://github.com/datamade/wopr-data/issues/5 .
I made a new branch here that basically takes the same value that gets inserted for the obs-date
field and inserts it into the obs_ts
field for the crime data (since that's the only one where we have greater than a day resolution)
@svetlozarn I actually figured a way to run a temporal aggregate query in a pretty flexible way using the postgresql date_trunc
function (more on that over here). It looks like it should support all of the temporal resolutions that we're after by doing something like this:
SELECT
date_trunc('month', obs_date) as time_agg,
dataset_name, count(*)
FROM dat_master
WHERE obs_date > '2013-08-01'
GROUP BY time_agg, dataset_name
ORDER BY time_agg;
Replace the month
part in there with anything that the date_trunc
function supports (day
, year
, minute
, second
, etc) that we have instant temporal aggregation. Does that sound sensible? One thing that this kind of approach would rely upon is having the date/time info in a consistent field and not in two different fields.
@svetlozarn I'm curious why you're storing a
NULL
in theobs_ts
field for the crime data (here). I realize you're storing thatOrig_Date
info into theobs_date
field but since the schema declares that as aDATE
column, it's not saving the time portion of that info. Is it safe to just save theOrig_Date
into theobs_ts
field as well? Or were you thinking of using that in a different way?