Open BTollison opened 9 months ago
I can see how a grouping of many days' work is useful to represent even when there's no specific person assigned to it yet, as part of the scheduling process. So far, ODS has been mostly focused on data useful during operation, as opposed to during planning+scheduling. Would this be useful during operation (like if you need to check if the usual driver is working today), or is this intended to expand the scope of ODS into scheduling?
Would it be a specific week? I don't see a date field. There's a service_id field, but service_ids can be a many days or only 1, consecutive or not. What does the sunday_run
field mean if applied to a service_id that only occurs on weekdays? How would you represent how holiday work is grouped with other work?
If it is meant to represent specific dates (as opposed to a generic week), it kind of turns into #28, with 1/7th the rows but 7x the columns. What if you leaned into that, and had a table with 3 columns, operational_date, run_id, roster_id
?
This is meant to take the scheduling data (Hastus, Optibus, etc) and send it to the operational side. Although I can see how on the scheduling side you could also assign drivers to every roster_id. In my previous work in the USA it was common to assign a driver to a roster during a bidding process. So, I am open to including this information somehow in a rosters.txt.
Since my intention here was to only model what comes from scheduling and not what occurs on the operational side (yet) I had not intended this to be tied to a specific week so much as tied to a specific service period (service_id). You raise a good point though, if a roster_id and service_id are overlapping you'll get duplicate roster_id's:
Thanks for the tables, that helps show the idea.
This is what I meant by the 3 columns in #28. There wouldn't be a separate exceptions file like calendar
and calendar_dates
, just a single file with all the dates. It's more duplicated rows cuz you need one for every date on the roster, but more straightforward, easier to have exceptions like holidays, and dodges some of the confusion about whether a service is active on a day of the week. I think it can represent everything that the day-of-the-week system can (as long as you want to say someone works on these 3 specific Mondays, and don't need to say someone typically works on Mondays in general.)
roster.txt | ||
---|---|---|
date | roster_id | run_id |
20240107 | 1 | 2 |
20240108 | 1 | 4 |
20240109 | 1 | 5 |
20240110 | 1 | 6 |
20240113 | 1 | 3 |
20240114 | 1 | 2 |
20240115 | 1 | 4 |
20240116 | 1 | 5 |
20240117 | 1 | 6 |
20240120 | 1 | 3 |
20240121 | 1 | 2 |
20240122 | 1 | 2 (MLK day) |
20240123 | 1 | 5 |
20240124 | 1 | 6 |
20240127 | 1 | 3 |
20240107 | 2 | ... |
Okay, I see what you're getting at now. My only concern with this is how many repeated rows we'd have.
I guess then for this to work, you need a way to distinguish which run_id is valid since there could be multiple run_id's with the same name, right?
Oh, yeah, my agency has issues with non-unique run ids, too, that's described in #12. Potential solutions for here:
I'm gonna go make sure #28 doesn't run into this problem, too, thanks for noticing that.
I would strongly advise against allowing duplicate primary keys. Even if their existence is commonplace, it goes against some fundamental relational database best practices.
I do agree the 3-column option makes it easier to be more explicit about dates of a given roster and requires less calculation/interpretation, though you'd probably want a second table to normalize rosters.
So I'm currently attempting to revise rosters.txt
proposal, but it occurs to me that in the calendar.txt
style, we run into the problem of:
How do we handle a roster with service_id's? I end up in a situation where I need 3 rows to describe 1 week of rosters in a "simple" scenario where we have a weekday, saturday, and sunday service_id. We need a way to model this nicely to feed rosters in systems, the calendar_dates.txt style only works in an operational context I think.
The best idea so far that I have is a file just to group service_id's that are relevant...
grouped_calendar.txt
service_id | group_service_id |
---|---|
weekday | 9999 |
saturday | 9999 |
sunday | 9999 |
rosters.txt
group_service_id | roster_id | week_sequence | monday_run | tuesday_run | wednesday_run | thursday_run | friday_run | saturday_run | sunday_run |
---|---|---|---|---|---|---|---|---|---|
9999 | 1 | 1 | 10 | 20 | 30 | 40 | 50 | ||
9999 | 1 | 2 | 80 | 90 | 100 | 110 | 70 | ||
9999 | 1 | 3 | 120 | 130 | 140 | 60 | 150 |
roster_dates.txt
grouped_service_id | date | roster_id | run_id | exception_type |
---|---|---|---|---|
9999 | 20240830 | 1 | 555 | 1 |
For roster_dates.txt
exception_type
would have the proposed values:
exception_type | Definition |
---|---|
NULL | no rosters.txt, and is therefore the default |
1 | replacement |
2 | removed |
In an effort to help distribute information between platforms, we wish to add rosters.txt. The intended use is that many operations cluster several duties (runs) within a roster to assign to a driver. The duties are usually clustered in a way that represents the amount of work a single driver will work in a week period. Some companies also cluster this work in several weeks for 1 drivers.
ods_roster_id Persistent ID as proposed in issue #44 meant to help keep track of changes to a roster over time. Rosters can have long lifespans while the duties (runs) can change within them over time.
roster_id
service_id
monday_run
tuesday_run
wednesday_run
thursday_run
friday_run
saturday_run
sunday_run
week_sequence