Create better time chart example

HeikoH commented 9 years ago

The current example doesn't make much sense. You can't connect single observational data points with lines. Doing so suggests you can interpolate between two points, which you can't. A proper timeline chart would need to bin observations per year, something like this:

ajturner commented 9 years ago

@HeikoH thanks. can you point me to a Layer that has an exemplar set of temporal line data?

Or an array of values :)

phlorah commented 9 years ago

@HeikoH is right. Line charts will be binned by date intervals (days, weeks, years...) depending on the scale and range of the data. They are essentially histograms of temporal data in which we summarize (count, sum, etc...) the features in each bin (date range).
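
To make the "histogram of temporal data" idea concrete, here is a minimal sketch that bins observation dates by calendar year and counts them. The function name `count_per_year` and the sample dates are made up for illustration; they are not part of Cedar or any Esri API. Years without observations are filled with 0 so the resulting line does not suggest interpolation across gaps.

```python
from collections import Counter
from datetime import datetime


def count_per_year(timestamps):
    """Count observations per calendar year, filling empty years with 0."""
    counts = Counter(ts.year for ts in timestamps)
    if not counts:
        return []
    first, last = min(counts), max(counts)
    # Emit every year in the range so gaps show up as explicit zeros.
    return [(year, counts.get(year, 0)) for year in range(first, last + 1)]


# Illustrative observation dates (hypothetical data).
observations = [
    datetime(2012, 3, 14),
    datetime(2012, 11, 2),
    datetime(2014, 6, 30),
    datetime(2015, 1, 5),
    datetime(2015, 8, 19),
    datetime(2015, 12, 31),
]

yearly_counts = count_per_year(observations)
print(yearly_counts)
# [(2012, 2), (2013, 0), (2014, 1), (2015, 3)]
```

Swapping the count for a sum of an attribute would follow the same pattern: group by the bin key, then aggregate.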

phlorah commented 9 years ago

We need to provide a default for the date range bins, but the user needs to be able to change binning to a different scale to see more or less detail in the time series pattern.

HeikoH commented 9 years ago

@ajturner see https://github.com/HeikoH/cedar/commit/e12ca3c94e7804efb4b5ee6145c6778df5ea1287, which will result in this chart:

The challenge is to construct the data... But it will be similar to how histogram data is created.

ajturner commented 9 years ago

That's good @HeikoH - do you want to PR it?

Regarding the binning - this is again where Vega could do the work. There are built-in "aggregate" methods the developer could optionally use if you had feature-level data.

HeikoH commented 9 years ago

@ajturner no, I don't really want to create a PR, because I believe a solution should also allow using a feature layer (FL) URL. Otherwise what's the point of the sample, right?

HeikoH commented 9 years ago

@ajturner I was trying to use Vega data transforms to reshape the data, no luck (yet)...

HeikoH commented 9 years ago

@phlorah do you have some specific rules in mind for how to determine the bin size?

phlorah commented 9 years ago

Time is a little trickier because we want to split the bins into meaningful time units. I will send you what the spatial stats team uses to bin time for Create Space Time Cube.

ajturner commented 9 years ago

@sasbab may have insight into determining temporal bin sizes

HeikoH commented 9 years ago

From Flora:

Here is the rounding method the spatial stats team uses for Create Space Time Cube:

def timeExtentRounder(seconds):
    """Rounds given default temporal span in seconds into nearest meaningful
    block."""

    if seconds < 10:
        ARCPY.AddIDMessage("ERROR", 110037)
        raise SystemExit()
    elif seconds < 100:
        #### Less Than 100 Seconds = 1 second ####
        return 1, "1 Second"
    elif seconds < 300:
        #### Less Than 5 Minutes = 10 seconds
        return 10, "10 Seconds"
    elif seconds < 900:
        #### Less Than 15 Minutes = 30 Seconds
        return 30, "30 Seconds"
    elif seconds < 3600:
        #### Less Than 1 Hour = 1 Minute
        return 60, "1 Minute"
    elif seconds < 21600:
        #### Less Than 6 Hours = 5 Minutes
        return 300, "5 Minutes"
    elif seconds < 43200:
        #### Less Than 12 Hours = 30 Minutes
        return 1800, "30 Minutes"
    elif seconds < 86400:
        #### Less Than 1 Day = 1 Hour
        return 3600, "1 Hour"
    elif seconds < 259200:
        #### Less Than 3 Days = 2 Hours
        return 7200, "2 Hours"
    elif seconds < 864000:
        #### Less Than 10 Days = 6 Hours
        return 21600, "6 Hours"
    elif seconds < 7776000:
        #### Less Than 90 Days = 1 Day
        return 86400, "1 Day"
    elif seconds < 31536000:
        #### Less Than 1 Year = 1 Week
        return 604800, "1 Week"
    else:
        #### Round to Year or Months
        return decideMonthlyYearly(seconds)
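
For use outside ArcPy, the same thresholds could be expressed as a lookup table. This is only a sketch: `default_bin`, the `user_bin_seconds` override (reflecting the point above that users should be able to change the default), and the "custom" label are assumptions for illustration. The month/year rule lives in `decideMonthlyYearly`, which is not shown in this thread, so the sketch returns `None` for spans of a year or more instead of guessing.

```python
from bisect import bisect_right

# (exclusive upper bound of the span in seconds, bin size in seconds, label),
# copied from the thresholds in timeExtentRounder above.
_RULES = [
    (100, 1, "1 Second"),
    (300, 10, "10 Seconds"),
    (900, 30, "30 Seconds"),
    (3600, 60, "1 Minute"),
    (21600, 300, "5 Minutes"),
    (43200, 1800, "30 Minutes"),
    (86400, 3600, "1 Hour"),
    (259200, 7200, "2 Hours"),
    (864000, 21600, "6 Hours"),
    (7776000, 86400, "1 Day"),
    (31536000, 604800, "1 Week"),
]
_BOUNDS = [bound for bound, _, _ in _RULES]


def default_bin(span_seconds, user_bin_seconds=None):
    """Return (bin size in seconds, label) for a time span.

    Returns None for spans of one year or more, because the month/year
    rule is not shown in this thread.
    """
    if user_bin_seconds is not None:
        # A user-chosen bin size always wins over the default.
        return user_bin_seconds, "custom"
    if span_seconds < 10:
        raise ValueError("time span is too short to bin")
    # Index of the first threshold strictly greater than the span.
    index = bisect_right(_BOUNDS, span_seconds)
    if index == len(_RULES):
        return None
    _, size, label = _RULES[index]
    return size, label


print(default_bin(30 * 86400))                    # (86400, '1 Day')
print(default_bin(5000, user_bin_seconds=600))    # (600, 'custom')
```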

sasbab commented 9 years ago

@phlorah, @ajturner @HeikoH

I recently connected Mark Janikas with my former development manager, who has more expertise in time series.

Here was his response: "The answer to your question is very complicated. If it is economic data, humans behave based on the hour-of-day, day-of-week, week-of-year, etc. For this data, seasonal dummies tests, seasonal augmented unit root tests, and others are useful. For other data with more complex cycles, it is best to breakdown the data first. For example, I like to use Singular Spectrum Analysis (SSA). See the attached paper."

I can forward the paper to anyone who wants it. Also, you might want to look at the SAS doc for PROC TIMESERIES, the procedure for aggregating and preparing time series data. The doc should be openly available online.

When creating UI experiences at SAS, we always asked the user the frequency they wanted to use because the problem they were trying to solve or question they were trying to answer should dictate the level of aggregation.

Note, if you are going to forecast or create a model, you need to be careful about incomplete bins at the beginning and end of the series (so that your aggregated values aren't deceptively low because the period is incomplete).
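
As a rough sketch of the incomplete-bin caveat, yearly bins that are only partly covered by the observation period could be dropped before modeling. The function name `drop_partial_years`, the half-open coverage convention, and the sample values are assumptions made for this example.

```python
from datetime import datetime


def drop_partial_years(yearly_values, coverage_start, coverage_end):
    """Keep only years fully inside [coverage_start, coverage_end).

    coverage_end is treated as exclusive, so a year counts as complete only
    if the coverage reaches January 1 of the following year.
    """
    first_full = coverage_start.year
    if coverage_start > datetime(first_full, 1, 1):
        # Coverage began partway through the year, so that year is partial.
        first_full += 1
    # With an exclusive end, the last complete year is always the one before
    # the year coverage_end falls in.
    last_full = coverage_end.year - 1
    return [(y, v) for y, v in yearly_values if first_full <= y <= last_full]


# Hypothetical yearly totals; 2012 and 2015 are only partly observed.
yearly = [(2012, 40), (2013, 95), (2014, 102), (2015, 61)]
trimmed = drop_partial_years(
    yearly,
    coverage_start=datetime(2012, 8, 1),
    coverage_end=datetime(2015, 7, 1),
)
print(trimmed)
# [(2013, 95), (2014, 102)]
```

For a chart, it may be preferable to keep the partial bins and mark them visually instead of dropping them.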

Esri / cedar

Create better time chart example #125