Closed brendt closed 5 years ago
$dateA = new DateTimeImmutable('2018-01-01');
$dateB = new DateTimeImmutable('2018-02-01');
$dateC = new DateTimeImmutable('2018-03-01');
$january = Period::make($dateA, $dateB);
$february = Period::make($dateB, $dateC);
We could make this configurable. But it should be in an easy to use way. The following seems rather verbose to write over and over again:
$period = Period::make(/* β¦ */)->excludeEnd();
We'd rather have some kind of user-configurable default, but this implies that a factory for periods should be added. That in itself is also a rather verbose solution. We could rely on static state when constructing a period, but I don't feel like that's a clean solution.
Input is appreciated.
If one sees this library as primary about dates and time is more or less an afterthought then the include boundaries option makes most sense to me.
If you think about time sensitivity the exclusion of end dates makes sense if you want to have sequential periods. Like one period starts at 11:30:00 and the second starts at 12:30:00. If you include boundaries and want not to have a gap between the two you will have to set fractions of seconds (e.g. first period ends at 12:29:59.999999). You are still not safe and have to consider floating point rounding and inaccuracy.
This variable boundaries issue will effect many other methods so I think it's up to the maintainer(s) to decide if they can afford to put this in.
@brendt If I can help you here's the reason why in League\Period I exclude the ending datepoint. Also adding the ending datepoint is a possibility but then it all depend on your public API and how users will be able to easily detect and manage the range inclusivity of your object.
As far as I've seen in the wild... both decisions have consequences for the complexity or not of your library
Thanks for the input both @IwfY and @nyamsprod, really appreciate it! We've also been discussing this at the office in depth.
I agree there's benefit to both []
and [)
intervals. My goal with this package is to reduce a developer's cognitive load when working with periods. I never want them having to stop and think "hang on, how should I do this in order for it to work correctly".
That's the reason I decided to always use DateTimeImmutable
, there's 0% chance your input will be changed.
I'm leaning towards supporting both, and maybe even also supporting (]
and ()
intervals.
@brendt Hi,
I'd like to share my comment.
If we think of time as being defined in a discrete manner, we can apply some set theory rules and say that overlapping two periods, means intersecting all discrete elements in that set (every discrete element). What comes out of this, is that overlapping is always defined in the same way and what changes is how we define our set. Currently it's defined with microsecond precision and with excluding first and last element.
But user of this library should be responsible for granularity on which it operates and how he defines his set. Different applications benefit from different definitions.
When defining vacations, we'd like to define them as both inclusive, but for example, bookings, we could have them defined as inclusive/exclusive because that helps with overlapping in that context.
But no matter how we define them, overlapping itself has the same semantics. So if the library user would also be responsible for defining granularity, there should not be any issue.
That also helps their domain model, for example:
<?php
declare(strict_types=1);
abstract class Period
{
/**
* @var DateTimeImmutable
*/
private $start;
/**
* @var DateTimeImmutable
*/
private $end;
public function __construct(DateTimeImmutable $start, DateTimeImmutable $end)
{
$this->start = $start;
$this->end = $end;
}
public function overlap(Period $period): Period
{
$start = max($this->start, $period->start);
$end = min($this->end, $period->end);
if ($start > $end) {
throw new LogicException('No overlap');
}
return $this->prototype($start, $end);
}
public function length(): DateInterval {
return $this->start->diff($this->end);
}
protected function prototype(DateTimeImmutable $start, DateTimeImmutable $end): self {
return new static($start, $end);
}
}
class VacationPeriod extends Period {
}
class BookingPeriod extends Period {
public function __construct(DateTimeImmutable $start, DateTimeImmutable $end)
{
// check that it's rounded to a day
parent::__construct($start, $end->sub(new DateInterval('P1D'))); // think of it as removing last element from the set
}
protected function prototype(DateTimeImmutable $start, DateTimeImmutable $end): Period
{
return new VacationPeriod($start, $end->add(new DateInterval('P1D'))); // think of it as adding last element to the set
}
}
$vacationA = new VacationPeriod(new DateTimeImmutable('2018-01-02'), new DateTimeImmutable('2018-01-09'));
$vacationB = new VacationPeriod(new DateTimeImmutable('2018-01-05'), new DateTimeImmutable('2018-01-14'));
var_dump($vacationA->overlap($vacationB));
$bookingA = new BookingPeriod(new DateTimeImmutable('2018-01-02'), new DateTimeImmutable('2018-01-09'));
$bookingB = new BookingPeriod(new DateTimeImmutable('2018-01-05'), new DateTimeImmutable('2018-01-14'));
$bookingC = new BookingPeriod(new DateTimeImmutable('2018-01-08'), new DateTimeImmutable('2018-01-14'));
var_dump($bookingA->overlap($bookingB));
var_dump($bookingA->overlap($bookingC));
Now, there are all kinds of variations to this, and what defaults the library offers (for example one for each combination with third argument being interval), but this could perhaps be one way of thinking. At least that's how I think of periods, I also need measuring unit info.
Thanks for all the input everyone!
I wanted to drop this poll I did on Twitter here: https://twitter.com/brendt_gd/status/1068507386149777408
I found especially this answer useful: https://twitter.com/hruske/status/1069511313011953664
My conclusion right now: it depends.
I think the only right thing to do is support all options, make it configurable and easy to use. The next question is how we can make this is as intuitive as possible.
My opinion.
The start has to be inclusive of course as you don't think of a range saying "How much did I make from the 1st of Jan" and actually go by selecting "31st Dec", right?
While the end should be inclusive by default but with that ->excludeEnd();
flag and this is because a filter could have a button for "Last Month" or "March" and that has to go from the first day of that month to its last, both inclusive. And the ->excludeEnd()
would be very much appreciated in many cases, cannot see them yet, but yeah.
While the end should be inclusive by default but with that ->excludeEnd(); flag and this is because a filter could have a button for "Last Month" or "March" and that has to go from the first day of that month to its last, both inclusive. And the ->excludeEnd() would be very much appreciated in many cases, cannot see them yet, but yeah.
I would go the other way around excluding by default and adding a marker to make the period inclusive. As pointed in the twitter thread PostgreSQL by default and most reporting accounting tools by default uses a left close, right open interval (ie exclude the ending datepoint) and also it is easier to create an inclusive interval from and exclusive one than the other way around because of date time precision.
Bottom line whichever road you choose making this easier for the developer to choose will be the crucial part of your library π
my 2 cents.
Would it help if you added convenience methods like:
Period::from($start)->toEndOf($end);
Period::from($start)->toStartOf($end);
Period::from($start)->until($end);
Period::from($start)->upTo($end);
Or something similar?
@shalvah the problem there is that ::from
constructs a period, even though no end date was given yet.
We could let ::from
create another object, not related to period, which in its turn has the toEndOf
and other methods. That's something to consider π€
@shalvah @brendt the issue is datetime precision quoting a tweet from @brendt
https://twitter.com/brendt_gd/status/1069508338273128449
If you and I both stay at the same hotel, I from 2019-01-01 until 2019-01-05, and you from 2019-01-05 until 2019-01-10. There is a chance of us meeting, at 2019-01-05; so why is it better to say these periods don't overlap?
With a left-closed right open period definition the answer is clear .. both client won't meet, end of the story.
With a left and right closed interval, you are required to precise your endpoint does until 2019-01-05
means until 2019-01-05 at midnight, a 12 AM or do you include the full day ?
What if we are now talking about hours span or year span ?
How does your library manage this precision? On the fly or should it be explicitly set ? In both cases, what are the rules (preferably using a RFC type document to back up your decision) ?
That's the main reason most library chooses left-closed right-open interval to avoid these issues altogether.
Once you've figure how to manage in a meaningful way your interval boundaries then you can implement a solution and the API becomes the least of your issue.
@brendt Also you're not the first to tackle this issue you should look at how Jodatime in Java and TimePeriod using .NET solved this issue... if they even try to. This should give you some references on how to deal with all type of intervals.
@nyamsprod I get why you say the following:
With a left-closed right open period definition the answer is clear .. both client won't meet, end of the story.
But we shouldn't forget that that answer is wrong! π
Given the question of the clients: they are able to meet on 2019-01-05
.
I've been reading and talking with people about this, and I've come to the conclusion that there is no correct answer. In some use cases, inclusive boundaries make sense, in others they don't.
It's clear to me that this library should support both approaches and that both should never allow for any confusion. That's the main goal of this package: ease of use, so that developers shouldn't worry about "am I doing the right thing?".
@brendt the answer is neither wrong or right, the answer depends on the context. The endpoint definition represents that context so I disagree. Also your initial question is too ambiguous from a scientific point of view. Depending on your clients cultural backgrounds your question has many answers. For instance:
until
means ? Some language have a specific way to convey that the full endpoint day is included some other don't or are vague about it . You should be more specific to remove say ambiguity.Bottom line:
They may meet but from reading your sentence there is no guarantee at all that they will without more context.
What the clients says and what your program except are two distinct things and your library needs to translate the client inputs into logical et reproducible actions before proceeding.
By saying that my answer is wrong means that you've already made some assumptions without evidence that those assumptions are corrects. On the other hand before giving my answer I've first define the rule used to interpret the given interval before stating my answer. π
@brendt @nyamsprod As I mentioned earlier, I don't think the issue is with definition of boundaries, but rather the definition of the set, unless you do want to think of time in a continuous terms.
If we think of time in discrete terms, it's easy to implement period. Here is a quick example inspired by PostgreSQL with a limited set of functionality that uses day as one discrete unit.
<?php
declare(strict_types=1);
class Period
{
/**
* @var DateTimeImmutable
*/
private $start;
/**
* @var DateTimeImmutable
*/
private $end;
/**
* @var string
*/
private $boundaries;
/**
* @var DateTimeImmutable
*/
private $realStart;
/**
* @var DateTimeImmutable
*/
private $realEnd;
/**
* @var DateInterval
*/
private $discreteUnit;
public function __construct(DateTimeImmutable $start, DateTimeImmutable $end, string $boundaries = '[]')
{
$this->assertValidBoundaries($boundaries);
// assume unit day for now, but can be generalized
$this->start = $start;
$this->end = $end;
$this->boundaries = $boundaries;
// @todo: assert that start/end are rounded on an day
$this->discreteUnit = new DateInterval('P1D');
// internally track closed intervals
$this->realStart = $boundaries[0] === '[' ? $start : $start->add($this->discreteUnit);
$this->realEnd = $boundaries[1] === ']' ? $end : $end->sub($this->discreteUnit);
}
public function overlap(Period $period): Period
{
$realStart = max($this->realStart, $period->realStart);
$realEnd = min($this->realEnd, $period->realEnd);
if ($realStart > $realEnd) {
throw new LogicException('No overlap');
}
// For now, respect this interval definition
$start = $this->boundaries[0] === '[' ? $realStart : $realStart->sub($this->discreteUnit);
$end = $this->boundaries[1] === ']' ? $realEnd : $realEnd->add($this->discreteUnit);
return new Period($start, $end, $this->boundaries);
}
public function contains(DateTimeImmutable $member): bool
{
return $member <= $this->realEnd && $member >= $this->realStart;
}
private function assertValidBoundaries(string $boundaries): void
{
if (!in_array($boundaries, ['[]', '[>', '<]', '<>'])) {
throw new LogicException(sprintf('Boundaries %s is not valid', $boundaries));
}
}
public function __toString(): string
{
return sprintf('%s%s, %s%s', $this->boundaries[0], $this->start->format('Y-m-d'), $this->end->format('Y-m-d'),
$this->boundaries[1]);
}
}
function runOverlap(Period $first, Period $second): void
{
printf("%s β© %s = %s\n", $first, $second, $first->overlap($second));
}
function runContains(DateTimeImmutable $member, Period $period): void
{
printf("%s in %s = %s\n", $member->format('Y-m-d'), $period, $period->contains($member) ? 'Yes' : 'No');
}
$period1 = new Period(
new DateTimeImmutable('2018-01-01'),
new DateTimeImmutable('2018-01-12')
);
$period2 = new Period(
new DateTimeImmutable('2018-01-08'),
new DateTimeImmutable('2018-01-18'),
'<]'
);
$period3 = new Period(
new DateTimeImmutable('2018-01-08'),
new DateTimeImmutable('2018-01-12'),
'<>'
);
runContains(new DateTimeImmutable('2018-01-01'), $period1);
runContains(new DateTimeImmutable('2018-01-05'), $period1);
runContains(new DateTimeImmutable('2018-01-12'), $period1);
runContains(new DateTimeImmutable('2018-01-12'), $period3);
echo "\n";
runOverlap($period1, $period2);
runOverlap($period2, $period1);
runOverlap($period3, $period1);
Produces
2018-01-01 in [2018-01-01, 2018-01-12] = Yes
2018-01-05 in [2018-01-01, 2018-01-12] = Yes
2018-01-12 in [2018-01-01, 2018-01-12] = Yes
2018-01-12 in <2018-01-08, 2018-01-12> = No
[2018-01-01, 2018-01-12] β© <2018-01-08, 2018-01-18] = [2018-01-09, 2018-01-12]
<2018-01-08, 2018-01-18] β© [2018-01-01, 2018-01-12] = <2018-01-08, 2018-01-12]
<2018-01-08, 2018-01-12> β© [2018-01-01, 2018-01-12] = <2018-01-08, 2018-01-12>
I think PostgreSQL work with days and seconds.
@zlikavac32 at first glance your implementation seems to be in accordance with what I try to convey :+1: ie to determine how interval interact with each other you first need to defined the endpoints type (unbounded, closed or open) first then you may determine their behaviour. The only part where a discussion should be made is how and when the $discreteUnit
property is seeded.
I feel safe to say that we all agree, sorry if I didn't make myself clear, guess I was caught in the moment.
@zlikavac32 I also like your example, it's clear that Period
should be configurable to work exactly as the user want it to work.
It's my understanding that this package now lacks 2 things, please correct me if I'm wrong:
$discreteUnit
$boundaries
Implementing these is pretty straight forward, and I will do so. What I'm not certain of is how they should be configured. Could you give your feedback on this?
Here are some of my personal thoughts on the syntax question:
I'm afraid $boundaries = []
is a little cryptic for some users. Remember that not everyone using this package is familiar with this interval notation. I like the cleanliness though, so we might add it as a possibility, just not the only one.
Coming from a Laravel background myself, and having done some research on how code is read, I feel like a more eloquent API should also be provided.
Something like Period::make()->endExcluded()->startExcluded()
or Period::make()->endIncluded()->startIncluded()
.
We could also add other static constructors like Period::makeInclusive()
, Period::makeHalfOpen()
, etc.
I also liked the idea suggested before: Period::from()->until()
and Period::from()->to()
, thought maybe that's a little overkill.
Regarding the discrete unit: I think the default should be 1 day, but it should also be configurable, probably via the constructor and make calls, but also via a fluent API.
I'm afraid $boundaries = [] is a little cryptic for some users.
There are not thousand types ... you could define them using constants and be done with it.
I feel like a more eloquent API should also be provided.
Fluent interfaces are good for Builder
or Factory
objects which is not what Period
is. Maybe you should offload Period creation to such object and make the constructor signature more obvious for people who do not come from a Laravel background. You get the best of both world like that.
<?php
interface RangeInclusivity
{
public const LEFT_CLOSED_RIGHT_OPENED = '[)';
public const LEFT_CLOSED_RIGHT_CLOSED = '[]';
public const LEFT_OPENED_RIGHT_CLOSED = '(]';
public const LEFT_OPENED_RIGHT_OPENED = '()';
}
class Period implements RangeInclusivity
{
public function __construct(
DateTimeImmutable $startDate,
DateTimeImmutable $endDate,
DateInterval $discretUnit,
string $boundaries = self::LEFT_CLOSE_RIGHT_OPEN
) {
....
}
}
$builder = new PeriodBuilder();
$period = $builder->build()->endIncluded()->startIncluded(); // and so on
Thanks for the input @nyamsprod!
@brendt @nyamsprod Honestly, I think that using anything more than '[)' (my mistake, I used <>
in examples above) is to verbose. If things are properly documented and concepts from "general knowledge" are used (and I think we all used in math notation that resembles this), people should note have a hard time using it.
I know that I'm happy when I don't have to write to much code and when things are easy to read.
Using builder to build period seems like to much hassle. There should be written a lot of code, and thinks may get tricky with call order.
If we consider units like '1 hour and 2 minutes' or '5 days' invalid, we could define supported units 'second', 'minute', ..., 'year' and use .
One approach could then be to expose that argument through the constructor and use enum to define possible values.
Other approach could be to have one class like DayPeriod
, etc for each unit. That could have better semantics when used, since, as we all agreed, period is defined with it's unit. With first approach, one could forget to check for unit and just check boundaries, which could mean that period of 7 days is period of one week.
@zlikavac32 using constant is a proven good practice you'll find in other domain object for instance HTTP status code are often turned into constant for ease. That's does not mean that the average developer does not know what 302
or 404
means but using a expressive constant makes the code more readable. Last but not least in my POC, the constants just mirror the established convention I did not intend to add or create something new that does not exist π
Also after some thoughts I still contain that adding a $discretUnit
to the class should not be included because it's at best an implementation detail (should be internal to the class and never exposed to the public API) at worst it has nothing to do with your domain logic.
Going back to @brendt tweet with the booking system to determine if those clients will or will not meet depends not only on the interval (covered by the inclusivity types) but more importantly on the hotel check-in/checkout rules.
If checkout is due at 10 AM and the other client arrives at 3 PM there's a good chance they won't meet regardless of the period boundaries. The $discretUnit
is just another way to round the period boundaries according to business rules and thus should be independent of your Period
definition otherwise your are tightly coupling 2 concepts into one domain.
By removing the $discretUnit
variable, you get ride of the builder class and you covers all scenario as the rounding is now the responsibility of your business logic and not the Period
class anymore.
class Period implements RangeInclusivity
{
public function __construct(
DateTimeImmutable $startDate,
DateTimeImmutable $endDate,
DateInterval $discreteUnit,
string $boundaries = self::LEFT_CLOSE_RIGHT_OPEN
) {
....
}
}
Should a range value object have knowledge of how it's going to be compared? Shouldn't the $boundaries
be determined during the comparison, not before?
$a->overlaps($b, Period::LEFT_CLOSE_RIGHT_OPEN);
@nyamsprod Yes, you are right. At first I thought it would be like 1
, 2
, ..., 4
. With [)
both options are available.
As for other things regarding discrete unit and boundaries. It all depends in which way this library wants to go and what are the defaults :).
@sebastiandedeyne Wouldn't it be easy to misuse this? For example, someone uses wrong type of boundary and introduces subtle bug? Also, do you apply this on both intervals?
I believe @sebastiandedeyne is saying that the open/closed flag is applied to the comparison, instead of one or two periods.
@zlikavac32 If someone uses the wrong boundary during a comparison, what's stopping them from using the wrong boundary when creating the period.
It's my opinion that applying the boundary to the comparison, instead of two periods, will be much more clear for developers.
This approach doesn't allow comparing two periods with different boundary types, but I think that this is the point where we should ask ourselves if supporting those edge-cases is within the scope if this library.
I like the flags to indicate inclusion/exclusion; but I think we can avoid the mathematical notation altogether. What about a set of flags that say exactly what the intention should be? INCLUDE_START_INCLUDE_END
, INCLUDE_START_EXCLUDE_END
etc.
Changed my mind about the boundaries on comparison thing for two reasons:
@brendt Once the period is created, boundaries stay the same. If you used wrong boundary, yes, it's wrong, but you can easily "enforce" that by having domain class which uses correct boundary.
Assumption is also that there are more places where you overlap than there are places where you create period.
Also, what about consecutive overlaps? How is the period defined then? How would you propagate information that newly generated period is closed on one side?
@sebastiandedeyne Yes, boundary is the property of the set. Operations like union/intersect that work on sets don't care about boundary
Okay, good points! Boundaries stay on the period!
I'll try to submit a PR next week for you to review.
PR with WIP implementation: https://github.com/spatie/period/pull/11 let's continue the discussion there.
Boundaries are now supported: https://github.com/spatie/period/releases/tag/0.5.0
Right now, the boundaries of a period are always included in calculations. We should have a discussion about what the desired default behaviour should be, whether we want to make this behaviour configurable, and how.