cloudfoundry-attic / cf-abacus

CF usage metering and aggregation
Apache License 2.0

Revisit the structure of time windows #310

Open betafood opened 8 years ago

betafood commented 8 years ago

This was touched upon a little in issue #300.

Currently, the time windows are structured as an array of arrays. The outer array has 5 elements, one per time dimension: month, day, hour, minute, second. Within each inner array, each element stands for one window in that dimension: the first element is the current window of time, and every subsequent element is the window that lies `index` windows before the current one.

Suffice it to say, this can be a little confusing and may use more space than necessary. The outer array could instead be an object with one property per dimension. For instance:

{
"M": {},
"D": {},
"h": {},
"m": {},
"s": {}
}

This may make it easier to understand the structure and allow removal of things like:

// Time dimension keys corresponding to their respective window positions
const dimensions = ['s', 'm', 'h', 'D', 'M'];
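
As a rough sketch of the difference (not Abacus code), the conversion from the positional array to the keyed object could look like the following. The helper name `toKeyedWindows` and the sample data are hypothetical, and the sketch assumes the outer array follows the order of the `dimensions` array above.

```javascript
// Time dimension keys corresponding to their respective window positions
const dimensions = ['s', 'm', 'h', 'D', 'M'];

// Hypothetical helper: turn the positional array-of-arrays into an object
// keyed by dimension, so readers no longer need the position lookup.
const toKeyedWindows = (windows) => {
  const keyed = {};
  dimensions.forEach((dimension, position) => {
    keyed[dimension] = windows[position];
  });
  return keyed;
};

// Made-up sample: one inner array per dimension, null meaning "no usage"
const example = [
  [null],
  [null],
  [null],
  [{ quantity: 200 }, null],
  [{ quantity: 500 }, null]
];

console.log(toKeyedWindows(example).D);
```

With the keyed form, a consumer reads `windows.D` directly instead of first working out which position holds the day windows.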

In the above issue, @rajkiranrbala also proposed a way of handling the individual windows within each dimension so that there would no longer be any unnecessary empty windows. Essentially, within a dimension, each property key would be the number of windows behind the current window.

For example

{
  "D" : {
    "0": { "quantity": 200 },
    "4": { "quantity": 300 }
  }
}

This would avoid creating windows for the days 1 to 3 behind the current day when they have no usage in the first place.
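
A minimal sketch of that idea, assuming empty windows are currently stored as `null` (an assumption, as are the helper names `toSparseDimension` and `toDenseDimension`):

```javascript
// Hypothetical helper: keep only the windows that hold usage, keyed by how
// many windows they are behind the current one (offset 0 = current window).
const toSparseDimension = (denseWindows) => {
  const sparse = {};
  denseWindows.forEach((win, offset) => {
    if (win !== null && win !== undefined) sparse[offset] = win;
  });
  return sparse;
};

// Inverse: rebuild the dense array when the number of windows is known
const toDenseDimension = (sparse, length) =>
  Array.from({ length }, (_, offset) =>
    offset in sparse ? sparse[offset] : null);

const days = [{ quantity: 200 }, null, null, null, { quantity: 300 }];
console.log(JSON.stringify(toSparseDimension(days)));
// prints {"0":{"quantity":200},"4":{"quantity":300}}
```

The inverse needs the window count passed in, because the sparse form no longer records how many windows the dimension spans.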

Thoughts?

cf-gitbot commented 8 years ago

We have created an issue in Pivotal Tracker to manage this. You can view the current status of your issue at: https://www.pivotaltracker.com/story/show/118514849.

rajkiranrbala commented 8 years ago

In addition to this, we could also specify the minimum window level from which accumulation/aggregation should happen. If we set the level to 'D', then the accumulation/aggregation happens for day, week and month.
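
A possible sketch of such a setting, reusing the `dimensions` array quoted earlier in this issue. The helper name `dimensionsFrom` is hypothetical, and since that array has no week entry, the sketch only yields day and month for 'D'.

```javascript
// Time dimension keys, ordered from the smallest to the largest window
const dimensions = ['s', 'm', 'h', 'D', 'M'];

// Hypothetical helper: return the dimensions at or above a minimum level,
// i.e. the only ones that would be accumulated/aggregated.
const dimensionsFrom = (minLevel) => {
  const start = dimensions.indexOf(minLevel);
  if (start === -1) throw new Error(`Unknown dimension: ${minLevel}`);
  return dimensions.slice(start);
};

console.log(dimensionsFrom('D')); // D and M only
```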

jsdelfino commented 8 years ago

We've discussed this over the last few days, and I suggested keeping that array as an array (as objects would introduce property names that would eat up a lot of space) but using a CSR or CSC sparse matrix representation [1] for it to reduce the space used by empty array cells.

BTW a sparse matrix representation would also allow us to extend our 'slack' windows (to a month maybe) without having to worry about space anymore, as most cells would remain empty and wouldn't have to be stored.
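
To make the CSR idea concrete, here is a rough sketch (not an actual Abacus implementation). It treats each dimension as a matrix row and each window offset as a column, and assumes empty cells are stored as `null`. The names `toCSR` and `cellAt` and the sample data are made up for illustration.

```javascript
// Hypothetical sketch: encode the dense array-of-arrays in CSR form.
// values      - the non-empty cells, row by row
// columns     - the column (window offset) of each stored value
// rowPointers - where each row starts in values; the last entry is the total
const toCSR = (rows) => {
  const values = [];
  const columns = [];
  const rowPointers = [0];
  rows.forEach((row) => {
    row.forEach((cell, column) => {
      if (cell !== null && cell !== undefined) {
        values.push(cell);
        columns.push(column);
      }
    });
    rowPointers.push(values.length);
  });
  return { values, columns, rowPointers };
};

// Look up one cell without expanding the matrix; empty cells yield null
const cellAt = (csr, row, column) => {
  for (let i = csr.rowPointers[row]; i < csr.rowPointers[row + 1]; i++) {
    if (csr.columns[i] === column) return csr.values[i];
  }
  return null;
};

const windows = [
  [null],
  [null],
  [null],
  [{ quantity: 200 }, null, null, null, { quantity: 300 }],
  [{ quantity: 500 }, null]
];
const csr = toCSR(windows);
console.log(csr.columns, csr.rowPointers);
console.log(cellAt(csr, 3, 4));
```

Only three cells are stored here, however wide the rows become, which is why longer 'slack' windows would cost little extra space.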

We could also turn these arrays into an object representation, with M/D/h/m/s properties indicating column headers, in the reports returned by our reporting service if you guys think that it'd greatly improve the readability of the reports.

Thoughts?

[1] https://en.wikipedia.org/wiki/Sparse_matrix

hsiliev commented 6 years ago

We will remove the seconds, minutes and hours windows, as we saw no proof of real-world use.