Open hodb opened 1 year ago
Matching operators to their duties on each day is something that I would use, and if it's not in ODS, I'd have to represent the data in a non-standard adjacent data feed. I'm currently using a CSV with 3 columns for this purpose: date, run_id, operator.
But it's also kind of separate from the schedules ODS does represent. For example, assigning operators to runs is done well after the rest of the schedule is finalized.
I think I remember this coming up in discussions a while ago and I think it was just cut for scope. It'd be easy to add as a backwards-compatible extension once the rest of the format is finalized.
Thanks for your answer @skyqrose. The concept I'm aiming for is much the same: a new file, driver_assignment.txt, that lists the assignment per driver:
run_id,driver_id,operational_date
And a vehicle_assignment.txt file that lists the assignment per vehicle:
block_id,vehicle_id,operational_date
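As a rough illustration of how a consumer might read the two proposed files, the sketch below parses inline sample rows with Python's csv module and indexes them by date plus run or block. The column names follow the proposal above; the sample IDs, the YYYYMMDD date format, and the helper name index_assignments are assumptions made only for this example.

```python
import csv
import io

# Hypothetical sample rows; the column names follow the proposal above,
# the IDs and the YYYYMMDD date format are made up for illustration.
DRIVER_ASSIGNMENT_TXT = """run_id,driver_id,operational_date
R100,D7,20240301
R101,D9,20240301
"""

VEHICLE_ASSIGNMENT_TXT = """block_id,vehicle_id,operational_date
B20,V1401,20240301
"""


def index_assignments(text, key_column, value_column):
    """Map (operational_date, key) -> value for one assignment file."""
    index = {}
    for row in csv.DictReader(io.StringIO(text)):
        index[(row["operational_date"], row[key_column])] = row[value_column]
    return index


drivers = index_assignments(DRIVER_ASSIGNMENT_TXT, "run_id", "driver_id")
vehicles = index_assignments(VEHICLE_ASSIGNMENT_TXT, "block_id", "vehicle_id")

print(drivers[("20240301", "R100")])
print(vehicles[("20240301", "B20")])
```

Keying on (date, id) assumes each run or block has at most one assignment per date, which is one of the questions still open in this thread.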
> I think I remember this coming up in discussions a while ago and I think it was just cut for scope. It'd be easy to add as a backwards-compatible extension once the rest of the format is finalized.
Yes, this was discussed and cut for scope and also because there seemed to be less consensus on the topic of representing driver/staff-identifying information, at least at that time. I remain in favor of bringing this information into ODS, however.
> For example, assigning operators to runs is done well after the rest of the schedule is finalized.
Can we be clear about what the intended use case is here? I want to make sure I understand who is producing these assignments and who the consumer will be. If the run assignments are done via the same producer after the schedule is finalized but before it would be sent to the schedule consumer, that wouldn't seem like it would impact our decision to include those assignments in the ODS-represented schedule. But I want to be sure that's an accurate read of what you both have said.
At MBTA, the use case is to export data from HASTUS and distribute it to multiple internal apps using a standard format, instead of the status quo of each app getting a custom export in a custom format. (This data would probably also make it outside of MBTA to Swiftly, but that's not my side of the house so I'm not sure.)
So far, exporting the trip schedules and exporting the operators' schedules have been completely separate processes, and exporting operators' schedules for each season typically happens a couple weeks after the trips. If we didn't have this proposal, we would keep doing the separate operator export alongside ODS. If we did, then I think we would publish an internal ODS feed without operator schedules as soon as the trips are scheduled, and then re-publish an updated ODS feed with operator schedules once those become available.
Does that answer the question?
Depending on how the issues with non-unique run_ids go (see #12), it's possible that the drivers file would need to have a 4th column for the service_id. If runs are unique among all the services effective on a date (but not unique among all services), it wouldn't be necessary, but could still make looking up the run easier.
MBTA hasn't run into this specific uniqueness problem yet with our current use cases, but I expect we would if this file was standardized and we used it in more places.
I'm not sure if our block_ids are unique, but I guess the same problem could happen in principle to the vehicles file.
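To make the uniqueness concern concrete, here is a small sketch that flags dates on which the same run_id appears under more than one active service, which is the case where a 3-column drivers file would be ambiguous. The input dictionaries and the function name ambiguous_runs are hypothetical stand-ins for data a consumer would derive from the calendar and the runs definition.

```python
from collections import defaultdict

# Hypothetical inputs: which services are active on each date, and which
# runs exist under each service. None of these IDs come from a real feed.
active_services = {
    "20240301": {"weekday", "school"},
    "20240302": {"saturday"},
}
runs_by_service = {
    "weekday": {"R100", "R101"},
    "school": {"R100"},      # R100 is reused by a second service
    "saturday": {"R100"},
}


def ambiguous_runs(active_services, runs_by_service):
    """Return {date: {run_id, ...}} for run_ids shared by 2+ active services."""
    result = {}
    for date, services in active_services.items():
        seen = defaultdict(set)
        for service_id in services:
            for run_id in runs_by_service.get(service_id, set()):
                seen[run_id].add(service_id)
        clashes = {run_id for run_id, svc in seen.items() if len(svc) > 1}
        if clashes:
            result[date] = clashes
    return result


print(ambiguous_runs(active_services, runs_by_service))
```

In this made-up data, R100 on the Saturday is not flagged even though the ID is reused across services, because only one of those services is active that day.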
> Can we be clear about what the intended use case is here? I want to make sure I understand who is producing these assignments and who the consumer will be. If the run assignments are done via the same producer after the schedule is finalized but before it would be sent to the schedule consumer, that wouldn't seem like it would impact our decision to include those assignments in the ODS-represented schedule. But I want to be sure that's an accurate read of what you both have said.
Adding on top of @skyqrose's answer: the dispatching software and the AVL vendor might be different. This creates a scenario where we cannot communicate through the same format and have to examine each vendor's unique specification, each with different data elements. From the customer's end, if they have two different systems in place, they have to enter the allocations manually in both the dispatching and the AVL software, which is inefficient. If we don't include the actual assignments in the ODS format, we would be able to convey information about the runs/blocks but not the dispatching portion. Extending ODS to include assignments creates a clear path for communication between two different software providers.
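Question re: proposed driver_assignment.txt file:
- Would it be reasonable to say this same use applies to personnel more broadly (that is to say, not just to drivers/vehicle operators)? Does the job performed need to be identified in the new field? Or will it (always) be self-evident from the operator id what job they are performing?
  Trying to get at whether we need to make our language more job-agnostic and how we can either support other roles or at least bake in some extensibility for that support in future
- As proposed, would we be imposing a limitation that a driver's run assignment would always be for the full run? Is that desirable? Would it make sense to include the piece information in the file as well?
Some questions on proposed vehicle_assignment.txt:
- are there different types of assignments we need to account for?
- how do we account for the same vehicle being assigned to multiple blocks on a given date? or the same block assigned to multiple vehicles?
- is there a case where a block might be assigned no vehicle within the schedule, and instead only assigned on the day of operations?
- do we need a master list of vehicles that vehicle_id can pull from? [we also have #30, which currently doesn't have a vehicle_id field but should.]
One more q for driver_assignment.txt:
- would it make sense for this assignment to be made between an operator and a bid instead of to individual runs? with the bid representing a package of runs/pieces during a specific window of time?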
I think that this file could apply to other jobs, if they have run ids, which seems very possible.
You can't determine job based on operator id, since some employees might do different jobs on different days. But you can determine the job based on run id. And if you want a run_id to job_id mapping, then that'd be better to do in runs.txt than in driver_assignment.txt. So I don't think it should go here, but we should consider it there.
Edit: On second thought maybe run_id isn't enough, because an employee could do multiple jobs within the same run, e.g. operator in the morning, then shifter in evening, so it'd have to go in runs.txt.
Yes, this would limit assignments to a full run. I think by the definition of a run, that's okay. If you plan to assign multiple operators to different pieces of the same run, then those are really two separate runs.
If you don't plan to assign multiple operators, but end up having to because you're filling absences or if an operator has a half-day conflict or something, then that's not necessarily going to align to piece boundaries, and it'd require a way to specify assignments per trip, which is way more detail than ODS should have and is a step away from schedule data towards realtime data so is out of scope anyway.
(no comment on vehicles)
I don't think we should associate operators to bids instead of runs+dates. If we did, we'd need a separate way to map bids to runs to runs+dates, and every lookup would have to go through that level of indirection. It'd also make it impossible to reassign people on individual days, for example if vacation dates are chosen after bidding. The important operational detail we want to capture is which employees are working, and the bidding process is just an administrative detail to get there that doesn't matter if we have the end result.
> Would it be reasonable to say this same use applies to personnel more broadly (that is to say, not just to drivers/vehicle operators)? Does the job performed need to be identified in the new field? Or will it (always) be self-evident from the operator id what job they are performing?
> Trying to get at whether we need to make our language more job-agnostic and how we can either support other roles or at least bake in some extensibility for that support in future
It could be extended, but what is the use case for associating other personnel who are not performing a run?
> As proposed, would we be imposing a limitation that a driver's run assignment would always be for the full run? Is that desirable? Would it make sense to include the piece information in the file as well?
It is fair to say that a duty/run can be "cut" and split between different drivers. In that case, we would need a unique identifier for each piece the duty is split into (for instance 1/3, 2/3, 3/3), along with the date (which we have already), since a duty might repeat every day but on a different date.
> Some questions on proposed vehicle_assignment.txt:
> - are there different types of assignments we need to account for?
> - how do we account for the same vehicle being assigned to multiple blocks on a given date? or the same block assigned to multiple vehicles?
> - is there a case where a block might be assigned no vehicle within the schedule, and instead only assigned on the day of operations?
> - do we need a master list of vehicles that vehicle_id can pull from? [we also have #30 (add Vehicles.txt), which currently doesn't have a vehicle_id field but should.]
>
> one more q for driver_assignment.txt
> - would it make sense for this assignment to be made between an operator and a bid instead of to individual runs? with the bid representing a package of runs/pieces during a specific window of time?
Separating out the different discussion threads here:
[x] Driver vs job-agnostic personnel assignments
Rows in this file could be applicable to any personnel that has a run assignment. This will often, but not always, be an operator.
@skyqrose has provided examples of multi-personnel trips/runs in #54
Proposal in that issue is to designate job performed within runs.txt rather than in driver_assignment.txt
Further discussion on this can go to #54
[ ] During scheduling, will a driver always be assigned to a full run?
Difference of opinion here:
[Sky brings up the case of having to make post-scheduling adjustments that result in more than one operator on a run. I agree that this case is out of scope for ODS and doesn't need further consideration]
This question should perhaps go out to the full working group
[ ] During scheduling, will a vehicle always be assigned to a full block?
Much the same applies here; we seem to have the same options as to whether we allow or disallow the blocks to have multiple vehicle assignments or vice versa
Hod proposes similar approach as for above
[x] Assignment types for vehicles
Sort of fishing but also a genuine question: might a schedule assign a revenue vehicle to something other than revenue service (be it, idk, standby, relief, or what have you)? Could that or should that be able to be represented as a block?
Hod says he doesn't think this is necessary; I'm inclined to agree at this stage unless we receive other feedback
[x] Block with zero vehicles assigned
The important point here is whether we end up in a situation where we have a complete list of blocks in the scheduling system, but only a subset of those blocks is captured in ODS, leaving the consumer app with an incomplete list or description of blocks
Is there a case where this would happen? How would agencies handle this if they were scheduling during an operator shortage, for example? Is it possible that a defined block would not be assigned a vehicle until, perhaps, day-of (and thus not in ODS)?
[x] Personnel link to bids instead of to jobs?
Sky argues against this as it would obfuscate the information that the consumer actually cares about: the specific work being performed by the operator
this seems like a convincing argument to me, unless we receive other feedback
Going back to comment on vehicles:
[...] vehicle_assignment.txt, or would leave Green Line blocks out of the file. Consumers should be able to handle blocks without scheduled vehicles (and also runs without assigned employees, which also concretely happens in our data).
Ok, that makes sense. As it stands in GTFS and ODS, blocks are defined in trips.block_id and/or deadheads.block_id, so even if a block doesn't have an assignment in vehicle_assignments.txt, we still can have a full definition of what that block's contents are. I'll mark "Block with zero vehicles assigned" above as resolved.
For the train use case, wouldn't we then also need multiple vehicles assigned per block (analogously to the multiple employees per run)?
It also could include a new (optional) field order_in_driving_direction so that the order of cars becomes clear if the providing system knows it.
We can also ignore this for now, but I think the specification needs to make some statement on assignments of multiple vehicles to a single block, because technically, users will be able to do that in the proposed format. We should clarify expectations here (explicitly allow or disallow).
As mentioned in another issue, driver_assignments.txt is likely to end up containing more than drivers. staff_assignments.txt would be more generic.
If we want to allow multiple operators on the same run / multiple vehicles on the same block, maybe it's as simple as not enforcing any uniqueness guarantees on these files?
The order_in_driving_direction field couldn't apply in this file, because an operator/vehicle is likely to be in a different order on different trips (like if a train turns around and the first car becomes the last). But maybe it could work in runs.txt?
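To illustrate the idea of not enforcing uniqueness on these files, here is a minimal sketch of how a consumer could group several vehicles under one block on one date. The rows and IDs are hypothetical, and a plain list in file order is only one possible way to hold the result; it says nothing about the order of cars in the direction of travel.

```python
from collections import defaultdict

# Hypothetical rows of (date, block_id, vehicle_id); two cars share block B20.
rows = [
    ("20240301", "B20", "V1401"),
    ("20240301", "B20", "V1402"),
    ("20240301", "B21", "V1500"),
]

# Without a uniqueness guarantee, each (date, block_id) maps to a list.
vehicles_by_block = defaultdict(list)
for date, block_id, vehicle_id in rows:
    vehicles_by_block[(date, block_id)].append(vehicle_id)

print(vehicles_by_block[("20240301", "B20")])
```

The same grouping would apply to multiple employees on one run.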
@safrazier17 @skyqrose This issue is currently marked as "Included" in ODS 2.0 here, but I don't see a final proposal of how to add the new files.
Also, the discussion about driver allocations was split off from this issue above, but there it states that the driver assignments should be in the new run_events.txt file. I cannot spot any reference to drivers or staff there. So it seems to me that we still need to discuss both driver and vehicle assignments as part of this issue?
It's been a while, but this issue has come under discussion again for the rostering working group for TODS 2.1 this quarter. Lots of things have happened in this thread and in the rest of TODS, so here's a recap of where we are, with some new ideas and personal opinions mixed in:
These two files record which vehicle/person is scheduled to do each block/run. They don't have to cover adjustments made after the schedule is made. They don't describe what's included in those blocks/runs (that's already in trips.txt and run_events.txt).
vehicle_assignments.txt:
Column Name | Required? | Description |
---|---|---|
date | Required | |
service_id | Optional | Corresponds to the service_id that the block is on in trips.txt. Recommended if block_ids are repeated between different service_ids. (Edit: This column added after discussion.) |
block_id | Required | Corresponds to trips.txt:block_id |
vehicle | Required | Might refer to a vehicle, a train, or a type of vehicle, depending on what happens with the vehicle-related proposals this quarter. Maybe we'd have separate vehicle_type and vehicle_id fields, where you must fill at least one of the columns? We'll need to sort out the details but how to refer to vehicles isn't the hard part of this proposal. |
Uniqueness:
employee_assignments.txt:
Column Name | Required? | Description |
---|---|---|
date | Required | |
service_id | Optional | In run_events.txt, the unique ID for a run is (service_id, run_id), so this is recommended to make it clear which run the row refers to. It may not be needed, because you might be able to look up a service_id via the calendar, or your run_ids might be unique even between days. But including it could help prevent errors. Probably needs more discussion. |
run_id | Required | Refers to run_events.txt. (Will require some discussion to resolve how this interacts with #76) |
employee_id | Optional | Who's doing this run on this day? (If blank, then nobody's scheduled to do it.) |
Not included: The type of job being performed is described in run_events.txt, not this file. There was also discussion of order_in_driving_direction, but that could change throughout a run, and if we need that, it'd be better to add it to run_events.
Uniqueness:
[...] run_events.txt but not this file, then we just don't know which employee is doing it.
There was some discussion (comment by hodb) about "cutting" a run between different employees. If this was about scheduling an employee to a run on some dates but not every date, then that's already handled because this file would assign employees on one date at a time. If this was about scheduling multiple people to the same run on the same date, then what I've written above wouldn't work, and it'd need to change. @hodb, which case were you describing?
If we do need it, here are some other ideas for how to handle scheduling an employee to only part of a run on a day:
- piece_id column. If it's filled, then the employee is only assigned to the portion of the run on that piece (as described by the piece_id column in run_events.txt).
- event_sequence field. If it's filled, the employee is only assigned to the one trip/event on that run. (An employee assigned to half a run would probably have many rows in this file to assign them to each of the events they do that day.) This would be the most granular but also the most verbose and complex.
Edit: This was all about realtime adjustments to the roster, not the schedule, so is out of scope.
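As a sketch of how a consumer might interpret the drafted employee_assignments.txt, the example below distinguishes three cases described in the draft: a run with an employee, a run listed with a blank employee_id (nobody scheduled), and a run that exists in run_events.txt but has no row in this file (unknown). The sample rows, the list of run keys, and the status helper are hypothetical.

```python
import csv
import io

# Hypothetical employee_assignments.txt content using the draft columns.
EMPLOYEE_ASSIGNMENTS_TXT = """date,service_id,run_id,employee_id
20240301,weekday,R100,E42
20240301,weekday,R101,
"""

# Hypothetical (service_id, run_id) keys that exist in run_events.txt.
runs_in_run_events = [("weekday", "R100"), ("weekday", "R101"), ("weekday", "R102")]

assigned = {}
for row in csv.DictReader(io.StringIO(EMPLOYEE_ASSIGNMENTS_TXT)):
    key = (row["date"], row["service_id"], row["run_id"])
    # A blank employee_id means the run is listed but nobody is scheduled.
    assigned[key] = row["employee_id"] or None


def status(date, service_id, run_id):
    key = (date, service_id, run_id)
    if key not in assigned:
        return "unknown"          # run is not listed in the file at all
    if assigned[key] is None:
        return "unassigned"       # listed with a blank employee_id
    return assigned[key]


for service_id, run_id in runs_in_run_events:
    print(run_id, status("20240301", service_id, run_id))
```

The labels "unknown" and "unassigned" are only names chosen for this example, not terms from the draft.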
First, instead of employee ids, #45 gives roster ids. The roster groups together runs that one person would do across multiple days, but doesn't say which person it's assigned to.
In the working group, we'll need to decide whether we want to include specific employees (as in this issue), the bid/roster (as in #45), or both. I lean towards employees being the right level of abstraction to include (I justify that here, in the 3rd section), but it depends on what uses people have for TODS.
Second, instead of working date-by-date, #45 works week-by-week, giving a Monday schedule, a Tuesday schedule, etc. This is similar to the difference between calendar.txt and calendar_dates.txt. We could work with either approach alone, or both together.
This file would require a lot of rows (one per run per day), and doing it week by week would be more compact. But this file also makes it easier to deal with vacations and irregular schedules, which I think is valuable given that employees' schedules are less likely to be regular than the service schedules in public GTFS.
We don't have to do only #28 or only #45. We could do both, or recombine the two in new ways, according to what information people want represented in TODS.
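To show the trade-off between the week-by-week and date-by-date approaches, here is a rough sketch that expands a weekly pattern into one row per date and applies a per-date exception such as a vacation day. The weekly_pattern structure is a stand-in invented for this example and is not the format proposed in #45; the IDs and dates are also made up.

```python
from datetime import date, timedelta

# Hypothetical weekly pattern: weekday index (Monday=0) -> run_id.
# This is not the format proposed in #45, just a stand-in for a weekly roster.
weekly_pattern = {0: "R100", 1: "R100", 2: "R101", 3: "R100", 4: "R100"}

# Hypothetical per-date exceptions, e.g. a vacation day (None = not working).
exceptions = {date(2024, 3, 6): None}


def expand(employee_id, start, end):
    """Yield (date, run_id, employee_id) rows for start..end inclusive."""
    day = start
    while day <= end:
        # An exception for the date wins over the weekly pattern.
        run_id = exceptions.get(day, weekly_pattern.get(day.weekday()))
        if run_id is not None:
            yield (day.strftime("%Y%m%d"), run_id, employee_id)
        day += timedelta(days=1)


rows = list(expand("E42", date(2024, 3, 4), date(2024, 3, 10)))
for row in rows:
    print(row)
```

The compact weekly form needs a separate exceptions mechanism, while the expanded per-date rows carry the vacation day simply by omitting it.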
In summary, this issue is moving again, but there are still some major decisions to be made about which information and use cases are important. We'll discuss some of it in the working group meetings, but a lot of things are a lot easier to bring up in writing, so leave comments!
@skyqrose Thanks for wrapping up the current state so thoroughly. The draft in this ticket looks good to me already; I just wanted to leave feedback on selected points:
- [...] vehicle_type and actual vehicle in vehicle_assignments.txt. It sounds most useful to allow both columns and require at least one of them to be filled.
- [...] order_in_driving_direction for now, but I don't think we can use the runs to model it. Imagine a train consisting of three coupled cars that never change their order during the day (the train stays coupled). If this is published as three vehicles being assigned to a single block, operated by a single run, then the only place to document the order of the cars would be in vehicle_assignments.txt.
- [...] (service_id, run_id) as the run key in employee_assignments.txt
Thanks for the detailed analysis @skyqrose! The use case I was referring to is when a driver doesn't complete his scheduled run and another driver needs to take over. Dispatchers may deal with unexpected situations during the day that disrupt a trip or the driver’s day: a malfunctioning vehicle, a driver getting sick during the work day, a bus breaking down in the middle of a trip, a driver's workload being removed, and so on. We're not talking about multi-assignment but about the ability to modify runs and blocks and keep some reference. It is fair to say that a duty/run can be "cut" and split between different drivers. In that case, we are creating a subset of the original duty performed by different drivers. For instance, the original duty 123 was planned for driver A, but ended up being split between A and B. Having the cut means we may need a reference to the run_events.txt file to indicate this duty is done by different drivers.
Ah, that makes sense, and it'd be nice to have a spec for that, but TODS so far has just been scheduled data, and I think the scope for now should just be to represent the runs as they're assigned in the schedule. In that case, then it seems like runs would never be scheduled to be split between multiple people.
Realtime changes to the schedule would then have to be in a future TODS-Realtime specification, similar to how GTFS-RT is a separate spec for realtime changes to the schedule in GTFS.
Revising my comment above: Perhaps we should omit service_id from the assignments.
If we would have it, it would make more sense to have it in both files (vehicles and employee assignments) to keep them consistent. If run IDs are not unique on a given date, then block IDs would perhaps also not be unique.
But it also does not make much sense to have multiple runs or blocks on the same date with the same ID. I think we can assume that real-world usage will have unique block and run IDs on every single date.
Two reasons to include service_id, given that runs are not unique on different services, even if they're unique within all services on the same date:
- [...] run_events.txt. If the service isn't listed here, you have to look up in the calendar which services are active, and then find an entry in run_events.txt that matches the run_id and also happens on any one of those services.
- [...] (service_id, run_id) pair prevents those bugs.
And I think all this applies to blocks and vehicle_assignments as well.
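The indirect lookup described above, for a row that omits service_id, could be sketched as follows. The data structures and the resolve_service function are hypothetical; a real consumer would build the active-services mapping from calendar.txt and calendar_dates.txt and the run keys from run_events.txt.

```python
# Hypothetical data: services active per date and the run keys in run_events.txt.
active_services = {"20240301": ["weekday", "school"]}
run_event_keys = {("weekday", "R100"), ("school", "R200")}


def resolve_service(date, run_id):
    """Find the service for a run when the assignment row omits service_id."""
    matches = [
        service_id
        for service_id in active_services.get(date, [])
        if (service_id, run_id) in run_event_keys
    ]
    # Zero matches or several matches both mean the row cannot be resolved.
    if len(matches) != 1:
        raise LookupError(f"{run_id} on {date} matched {len(matches)} services")
    return matches[0]


print(resolve_service("20240301", "R100"))
```

With service_id present in the row, this whole step reduces to checking that the (service_id, run_id) pair exists.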
Makes sense, but then let's include the service_id in both assignment files?
In reviewing the ODS spec, it appears that there is no guidance on the allocation of blocks to operational vehicles, or on the assignment of duties to drivers with their matching operated dates.
As this crucial information is expected to be sourced from the dispatching software, can you provide clarification or additional details on how the ODS spec addresses the extraction of data related to block allocation, driver duties, and associated dates from the dispatching software for communication to the AVL provider?
Thanks