jeremyFreeAgent / Bitter

Bitter is a simple but powerful analytics library
http://bitter.free-agent.fr
MIT License
129 stars 18 forks

Question about implementation of bitDateRange #18

Open strongeye opened 11 years ago

strongeye commented 11 years ago

I am looking at the source code of Bitter and had a question about it.

In the function bitDateRange in Bitter.php (line 137+), it looks like you get two different DatePeriods and then do two array_diffs. It looks like this will sometimes result in bitOpOr being performed twice for the $hoursTo array.

What is the purpose of making two different calls to DatePeriod::createForHour?

Here is my analysis of what you are doing (in words):

Is this meaningful? Why do you do anything other than return the actual hour span in all cases? Why call bitOpOr twice?

Code below

// Hours
$hoursFrom = DatePeriod::createForHour($from, $to, DatePeriod::CREATE_FROM);
foreach ($hoursFrom as $date) {
    $this->bitOpOr($destKey, new Hour($key, $date), $destKey);
}
$hoursTo = DatePeriod::createForHour($from, $to, DatePeriod::CREATE_TO);
if (array_diff($hoursTo->toArray(true), $hoursFrom->toArray(true)) !== array_diff($hoursFrom->toArray(true), $hoursTo->toArray(true))) {
    foreach ($hoursTo as $date) {
        $this->bitOpOr($destKey, new Hour($key, $date), $destKey);
    }
}
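
The condition in the snippet above can be tried in isolation. The following is a minimal sketch, assuming that toArray(true) returns a flat array of comparable strings (the actual return format is not shown in this thread); periodsDiffer and the sample hour labels are hypothetical names used only for illustration. It suggests that the second loop is skipped only when both periods contain exactly the same values. Where the two periods merely overlap, the shared hours would be ORed again, which does not change the result because bitwise OR is idempotent.

```php
<?php
/**
 * Mirrors the condition from the snippet: true when the two lists
 * do not contain the same set of values.
 * The arrays stand in for $hoursTo->toArray(true) / $hoursFrom->toArray(true);
 * their real format is an assumption.
 */
function periodsDiffer(array $hoursTo, array $hoursFrom): bool
{
    return array_diff($hoursTo, $hoursFrom) !== array_diff($hoursFrom, $hoursTo);
}

$from = ['2005-06-20 16', '2005-06-20 17'];
$to   = ['2007-03-10 00', '2007-03-10 01'];

var_dump(periodsDiffer($from, $from)); // bool(false): both diffs are empty arrays
var_dump(periodsDiffer($to, $from));   // bool(true): the diffs hold different values
```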

jeremyFreeAgent commented 11 years ago

The aim of doing it "twice" is to handle the following, taking a dateFrom of 2005-06-20 16:53:00 and a dateTo of 2007-03-10 08:32:00 as an example:

We want the data for these hours:

Then the data for these days:

Then the data for these months:

Then the data for this year:

Then the data for these months:

Then the data for these days:

Then the data for these hours:

I hope that helps.

strongeye commented 11 years ago

So the idea seems to be to use a granular time unit only where the next unit up is not 100% covered, i.e. hours that do not make up a full day, days that do not make up a full month, months that do not make up a full year, and so on.
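
The decomposition described above can be sketched as a greedy walk over the range. This is not Bitter's actual implementation: coveringBuckets, the "unit:date" label format, the use of UTC, and the choice to include the hour that contains the end date are all assumptions made for illustration.

```php
<?php
/**
 * Greedy sketch: cover the range with the coarsest buckets that fit
 * entirely inside it. Labels and the inclusive last hour are assumptions.
 *
 * @return string[] e.g. ['hour:2005-06-20 16', ..., 'day:2005-06-21', ...]
 */
function coveringBuckets(DateTimeImmutable $from, DateTimeImmutable $to): array
{
    // Snap both ends to whole hours; the hour containing $to is included.
    $cursor = $from->setTime((int) $from->format('G'), 0, 0);
    $end    = $to->setTime((int) $to->format('G'), 0, 0)->modify('+1 hour');

    $buckets = [];
    while ($cursor < $end) {
        $atDayStart   = $cursor->format('H') === '00';
        $atMonthStart = $atDayStart && $cursor->format('d') === '01';
        $atYearStart  = $atMonthStart && $cursor->format('m') === '01';

        // Try the coarsest unit first; fall back to finer ones.
        if ($atYearStart && $cursor->modify('+1 year') <= $end) {
            $buckets[] = 'year:' . $cursor->format('Y');
            $cursor = $cursor->modify('+1 year');
        } elseif ($atMonthStart && $cursor->modify('+1 month') <= $end) {
            $buckets[] = 'month:' . $cursor->format('Y-m');
            $cursor = $cursor->modify('+1 month');
        } elseif ($atDayStart && $cursor->modify('+1 day') <= $end) {
            $buckets[] = 'day:' . $cursor->format('Y-m-d');
            $cursor = $cursor->modify('+1 day');
        } else {
            $buckets[] = 'hour:' . $cursor->format('Y-m-d H');
            $cursor = $cursor->modify('+1 hour');
        }
    }

    return $buckets;
}

// The example dates from this thread, interpreted as UTC (an assumption).
$utc     = new DateTimeZone('UTC');
$from    = new DateTimeImmutable('2005-06-20 16:53:00', $utc);
$to      = new DateTimeImmutable('2007-03-10 08:32:00', $utc);
$buckets = coveringBuckets($from, $to);

echo implode("\n", $buckets), "\n";
```

With these dates the walk emits hours, then days, then months, then one year, then months, days, and hours again, which is the order listed in the reply above.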

jeremyFreeAgent commented 11 years ago

Yes, that's the point. We could get all the data from the smallest unit (hours), but that is not really a good approach. Remember that a bit answers the question: did something happen on that day? In that hour? In that year? If something happened in this hour, it happened on this day, so in this month, so in this year.
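
That containment property can be illustrated with a small in-memory sketch. It does not use Bitter's API or storage: mark, unionOf, the "unit:date" key names, and the use of plain PHP integers as bitmasks (so ids must stay below 63) are hypothetical stand-ins chosen for illustration.

```php
<?php
// In-memory stand-in for the stored bitmaps; key names are hypothetical.
$bitmaps = [];

/** One event sets the same bit at every granularity. */
function mark(array &$bitmaps, int $id, DateTimeImmutable $when): void
{
    $formats = ['year' => 'Y', 'month' => 'Y-m', 'day' => 'Y-m-d', 'hour' => 'Y-m-d H'];
    foreach ($formats as $unit => $format) {
        $key = $unit . ':' . $when->format($format);
        $bitmaps[$key] = ($bitmaps[$key] ?? 0) | (1 << $id);
    }
}

/** Bitwise OR over a list of keys; missing keys count as empty. */
function unionOf(array $bitmaps, array $keys): int
{
    $result = 0;
    foreach ($keys as $key) {
        $result |= $bitmaps[$key] ?? 0;
    }
    return $result;
}

mark($bitmaps, 3, new DateTimeImmutable('2006-04-02 10:15:00'));
mark($bitmaps, 5, new DateTimeImmutable('2006-11-30 23:59:00'));

// A single year key answers the same question as ORing every hour key of 2006.
$viaYear  = unionOf($bitmaps, ['year:2006']);
$hourKeys = array_filter(array_keys($bitmaps), fn ($k) => strpos($k, 'hour:2006') === 0);
$viaHours = unionOf($bitmaps, $hourKeys);

var_dump($viaYear === $viaHours); // bool(true)
```

Under these assumptions, reading one coarse key gives the same bits as combining all the fine-grained keys it contains, which is why the range query only needs fine-grained keys at its edges.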