cal-itp / operational-data-standard

The Transit Operational Data Standard is an open standard for representing the transit schedules used by drivers, dispatchers, and planners to carry out transit operations.
https://ods.calitp.org
Apache License 2.0

Allocating drivers and vehicles in ODS #28

Open hodb opened 1 year ago

hodb commented 1 year ago

In reviewing the ODS spec, it appears that there is no guidance on allocating blocks to operational vehicles, or on assigning duties to drivers along with their matching operated dates.

As this crucial information is expected to be sourced from the dispatching software, can you provide clarification or additional details on how the ODS spec addresses the extraction of data related to block allocation, driver duties, and associated dates from the dispatching software for communication to the AVL provider?

Thanks

skyqrose commented 1 year ago

Matching operators to their duties on each day is something that I would use, and if it's not in ODS, I'd have to represent the data in a non-standard adjacent data feed. I'm currently using a CSV with 3 columns for this purpose: date, run_id, operator.

But it's also kind of separate from the schedules ODS does represent. For example, assigning operators to runs is done well after the rest of the schedule is finalized.

I think I remember this coming up in discussions a while ago and I think it was just cut for scope. It'd be easy to add as a backwards-compatible extension once the rest of the format is finalized.

hodb commented 1 year ago

Thanks for your answer @skyqrose. The concept I aim for is pretty much the same - having a new file driver_assignment.txt that includes the assignment per driver - run_id,driver_id,operational_date

And a vehicle_assignment.txt file that includes the assignment per vehicle - block_id,vehicle_id,operational_date
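A minimal sketch of how a consumer might read the two proposed files, assuming they are plain CSV with the column names given above. The sample rows, the ids, and the GTFS-style YYYYMMDD date format are assumptions for illustration and are not part of any agreed specification.

```python
import csv
import io

# Hypothetical sample contents; only the column names come from the proposal.
DRIVER_ASSIGNMENT = """run_id,driver_id,operational_date
100,D-17,20240304
101,D-22,20240304
"""

VEHICLE_ASSIGNMENT = """block_id,vehicle_id,operational_date
B-5,V-301,20240304
"""


def load_assignments(text, key_column, value_column):
    """Map (operational_date, key) to the list of ids assigned to it."""
    assignments = {}
    for row in csv.DictReader(io.StringIO(text)):
        key = (row["operational_date"], row[key_column])
        # A list is used so that repeated keys are kept rather than overwritten.
        assignments.setdefault(key, []).append(row[value_column])
    return assignments


drivers = load_assignments(DRIVER_ASSIGNMENT, "run_id", "driver_id")
vehicles = load_assignments(VEHICLE_ASSIGNMENT, "block_id", "vehicle_id")
```

An AVL consumer could then ask `drivers[("20240304", "100")]` to find who is scheduled on run 100 that day.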

safrazier17 commented 11 months ago

I think I remember this coming up in discussions a while ago and I think it was just cut for scope. It'd be easy to add as a backwards-compatible extension once the rest of the format is finalized.

Yes, this was discussed and cut for scope and also because there seemed to be less consensus on the topic of representing driver/staff-identifying information, at least at that time. I remain in favor of bringing this information into ODS, however.

For example, assigning operators to runs is done well after the rest of the schedule is finalized.

Can we be clear about what the intended use case is here? I want to make sure I understand who is producing these assignments and who the consumer will be. If the run assignments are done via the same producer after the schedule is finalized but before it would be sent to the schedule consumer, that wouldn't seem like it would impact our decision to include those assignments in the ODS-represented schedule. But I want to be sure that's an accurate read of what you both have said.

skyqrose commented 11 months ago

At MBTA, the use case is to export data from HASTUS and distribute it to multiple internal apps using a standard format, instead of the status quo of each app getting a custom export in a custom format. (This data would probably also make it outside of MBTA to Swiftly, but that's not my side of the house so I'm not sure.)

So far, exporting the trip schedules and exporting the operators' schedules have been completely separate processes, and exporting operators' schedules for each season typically happens a couple weeks after the trips. If we didn't have this proposal, we would keep doing the separate operator export alongside ODS. If we did, then I think we would publish an internal ODS feed without operator schedules as soon as the trips are scheduled, and then re-publish an updated ODS feed with operator schedules once those become available.

Does that answer the question?

skyqrose commented 11 months ago

Depending on how the issues with non-unique run_ids go (see #12), it's possible that the drivers file would need to have a 4th column for the service_id. If runs are unique among all the services effective on a date (but not unique among all services), it wouldn't be necessary, but could still make looking up the run easier.

MBTA hasn't run into this specific uniqueness problem yet with our current use cases, but I expect we would if this file was standardized and we used it in more places.

I'm not sure if our block_ids are unique, but I guess the same problem could happen in principle to the vehicles file.
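To make the uniqueness question concrete, here is a small sketch that checks whether any run_id is defined by more than one service active on the same date. If it finds nothing, a (date, run_id) pair is enough to identify a run; otherwise a service_id column is needed. The service names, run ids, and data structures are hypothetical.

```python
from collections import defaultdict

# Hypothetical data: run_ids defined by each service, and services active per date.
runs_by_service = {
    "weekday": {"100", "101"},
    "holiday": {"100"},
    "trackwork": {"101", "200"},
}
active_services = {
    "20240304": ["weekday"],
    "20240305": ["weekday", "trackwork"],
}


def ambiguous_runs(runs_by_service, active_services):
    """Return {date: {run_id: [service_ids]}} for run_ids that are defined
    by more than one service active on that date."""
    result = {}
    for date, services in active_services.items():
        owners = defaultdict(list)
        for service_id in services:
            for run_id in runs_by_service[service_id]:
                owners[run_id].append(service_id)
        clashes = {run: svcs for run, svcs in owners.items() if len(svcs) > 1}
        if clashes:
            result[date] = clashes
    return result


clashes = ambiguous_runs(runs_by_service, active_services)
```

In this sample, run 100 exists in both the weekday and holiday services, but they are never active together, so it is not ambiguous. Run 101 is ambiguous on 20240305 because both weekday and trackwork are active that day.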

hodb commented 11 months ago

Can we be clear about what the intended use case is here? I want to make sure I understand who is producing these assignments and who the consumer will be. If the run assignments are done via the same producer after the schedule is finalized but before it would be sent to the schedule consumer, that wouldn't seem like it would impact our decision to include those assignments in the ODS-represented schedule. But I want to be sure that's an accurate read of what you both have said.

Adding on top of @skyqrose's answer: the dispatching software and the AVL vendor might be different. In that scenario we cannot communicate through the same format and have to examine each vendor's own specification, each of which includes different data elements. From the customer's end, if they have two different systems in place, they are required to do the allocations manually in both the dispatching and the AVL software, which can be inefficient. If we don't include the actual assignments in the ODS format, we would be able to convey information about the runs/blocks but not the dispatching portion. Extending ODS to include assignments creates a clear path for communication between two different software providers.

safrazier17 commented 11 months ago

Question re: proposed driver_assignment.txt file

Would it be reasonable to say this same use applies to personnel more broadly (that is to say, not just to drivers/vehicle operators)? Does the job performed need to be identified in the new field? Or will it (always) be self-evident from the operator id what job they are performing?

Trying to get at whether we need to make our language more job-agnostic and how we can either support other roles or at least bake in some extensibility for that support in future

safrazier17 commented 11 months ago

As proposed, would we be imposing a limitation that a driver's run assignment would always be for the full run? Is that desirable? Would it make sense to include the piece information in the file as well?

safrazier17 commented 11 months ago

Some questions on proposed vehicle_assignment.txt:

one more q for driver_assignment.txt

skyqrose commented 11 months ago

I think that this file could apply to other jobs, if they have run ids, which seems very possible.

You can't determine job based on operator id, since some employees might do different jobs on different days. But you can determine the job based on run id. And if you want a run_id to job_id mapping, then that'd be better to do in runs.txt than in driver_assignment.txt. So I don't think it should go here, but we should consider it there.

Edit: On second thought maybe run_id isn't enough, because an employee could do multiple jobs within the same run, e.g. operator in the morning, then shifter in evening, so it'd have to go in runs.txt.


Yes, this would limit assignments to a full run. I think by the definition of a run, that's okay. If you plan to assign multiple operators to different pieces of the same run, then those are really two separate runs.

If you don't plan to assign multiple operators, but end up having to because you're filling absences or if an operator has a half-day conflict or something, then that's not necessarily going to align to piece boundaries, and it'd require a way to specify assignments per trip, which is way more detail than ODS should have and is a step away from schedule data towards realtime data so is out of scope anyway.


(no comment on vehicles)

I don't think we should associate operators to bids instead of runs+dates. If we did, we'd need a separate way to map bids to runs to runs+dates, and every lookup would have to go through that level of indirection. It'd also make it impossible to reassign people on individual days, for example if vacation dates are chosen after bidding. The important operational detail we want to capture is which employees are working, and the bidding process is just an administrative detail to get there that doesn't matter if we have the end result.

hodb commented 11 months ago

Would it be reasonable to say this same use applies to personnel more broadly (that is to say, not just to drivers/vehicle operators)? Does the job performed need to be identified in the new field? Or will it (always) be self-evident from the operator id what job they are performing?

Trying to get at whether we need to make our language more job-agnostic and how we can either support other roles or at least bake in some extensibility for that support in future

It could be extended, but what is the use case for associating other personnel who are not performing a run?

As proposed, would we be imposing a limitation that a driver's run assignment would always be for the full run. Is that desirable? Would it make sense to include the piece information in the file as well?

It is fair to say that a duty/run can be "cut" and split between different drivers. In that case, we would need some unique identifier for how many pieces the duty is split into between drivers, and on what date (which we have already), since a duty might repeat every day but on a different date (for instance: 1/3, 2/3, 3/3).

Some questions on proposed vehicle_assignment.txt:

  • are there different types of assignments we need to account for?
  • how do we account for the same vehicle being assigned to multiple blocks on a given date? or the same block assigned to multiple vehicles?
  • is there a case where a block might be assigned no vehicle within the schedule, and instead only assigned on the day of operations?
  • do we need a master list of vehicles that vehicle_id can pull from? [we also have "add Vehicles.txt" (#30), which currently doesn't have a vehicle_id field but should.]

one more q for driver_assignment.txt

  • would it make sense for this assignment to be made between an operator and a bid instead of individual runs? with the bid representing a package of runs/pieces during a specific window of time?

safrazier17 commented 11 months ago

Separating out the different discussion threads here:

skyqrose commented 11 months ago

Going back to comment on vehicles:

safrazier17 commented 11 months ago

Ok, that makes sense. As it stands in GTFS and ODS, blocks are defined in trips.block_id and/or deadheads.block_id, so even if a block doesn't have an assignment in vehicle_assignments.txt, we can still have a full definition of what that block's contents are. I'll mark "Block with zero vehicles assigned" above as resolved.

timon-k commented 11 months ago

For the train use case, wouldn't we then also need multiple vehicles assigned per block (analogously to the multiple employees per run)?

It could also include a new (optional) field order_in_driving_direction so that the order of cars becomes clear if the providing system knows it.

We can also ignore this for now, but I think the specification needs to make some statement on assignments of multiple vehicles to a single block, because technically, users will be able to do that in the proposed format. We should clarify expectations here (explicitly allow or disallow).

skyqrose commented 11 months ago

As mentioned in another issue, driver_assignments.txt is likely to end up containing more than drivers. staff_assignments.txt would be more generic.

skyqrose commented 11 months ago

If we want to allow multiple operators on the same run / multiple vehicles on the same block, maybe it's as simple as not enforcing any uniqueness guarantees on these files?

The order_in_driving_direction couldn't apply in this file, because an operator/vehicle is likely to be in a different order on different trips (like if a train turns around and the first car becomes the last). But maybe it could work in runs.txt?
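If the files carry no uniqueness guarantee, a consumer that cares about multiple assignments can detect them itself. The sketch below finds (date, block_id) pairs that have more than one vehicle row; the same function works for runs and employees by changing the key fields. The rows and field names are hypothetical sample data.

```python
from collections import Counter

# Hypothetical rows, e.g. a two-car train on block B-5 and a single bus on B-6.
rows = [
    {"date": "20240304", "block_id": "B-5", "vehicle_id": "V-301"},
    {"date": "20240304", "block_id": "B-5", "vehicle_id": "V-302"},
    {"date": "20240304", "block_id": "B-6", "vehicle_id": "V-400"},
]


def multi_assigned(rows, key_fields=("date", "block_id")):
    """Return the sorted keys that appear in more than one row."""
    counts = Counter(tuple(row[field] for field in key_fields) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


shared_blocks = multi_assigned(rows)
```

A specification that explicitly disallows multiple vehicles per block could treat a non-empty result as a validation error; one that allows it could treat the result as the list of multi-vehicle consists.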

timon-k commented 6 months ago

@safrazier17 @skyqrose This issue is currently marked as "Included" in ODS 2.0 here, but I don't see a final proposal of how to add the new files.

Also, the discussion about driver allocations was split off from this issue above, but there it states that the driver assignments should be in the new run_events.txt file - I cannot spot any reference to drivers or staff there. So it seems to me, that we still need to discuss both driver and vehicle assignments as part of this issue?

skyqrose commented 3 months ago

It's been a while, but this issue has come under discussion again for the rostering working group for TODS 2.1 this quarter. Lots of things have happened in this thread and in the rest of TODS, so here's a recap of where we are, with some new ideas and personal opinions mixed in:

Here's what I think this proposal is:

These two files record which vehicle/person is scheduled to do each block/run. They don't have to cover adjustments made after the schedule is made. They don't describe what's included in those blocks/runs (that's already in trips.txt and run_events.txt).

vehicle_assignments.txt:

| Column Name | Required? | Description |
| --- | --- | --- |
| date | Required | |
| service_id | Optional | Corresponds to the service_id that the block is on in trips.txt. Recommended if block_ids are repeated between different service_ids. (Edit: This column added after discussion.) |
| block_id | Required | Corresponds to trips.txt:block_id |
| vehicle | Required | Might refer to a vehicle, a train, or a type of vehicle, depending on what happens with the vehicle-related proposals this quarter? Maybe we'd have separate vehicle_type and vehicle_id fields, where you must fill at least one of the columns? We'll need to sort out the details but how to refer to vehicles isn't the hard part of this proposal. |

Uniqueness:

employee_assignments.txt

| Column Name | Required? | Description |
| --- | --- | --- |
| date | Required | |
| service_id | Optional | In run_events.txt, the unique ID for a run is (service_id, run_id), so this is recommended to make it clear which run the row refers to. It may not be needed, because you might be able to look up a service_id via the calendar, or your run_ids might be unique even between days. But including it could help prevent errors. Probably needs more discussion. |
| run_id | Required | Refers to run_events.txt. (Will require some discussion to resolve how this interacts with #76.) |
| employee_id | Optional | Who's doing this run on this day? (If blank, then nobody's scheduled to do it.) |
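As an illustration of the columns listed above, here is a sketch of what an employee_assignments.txt file could look like and how a consumer might list runs with nobody scheduled. The sample rows, ids, and the YYYYMMDD date format are assumptions; the draft does not fix them.

```python
import csv
import io

# Hypothetical file contents; a blank employee_id means nobody is scheduled.
EMPLOYEE_ASSIGNMENTS = """date,service_id,run_id,employee_id
20240304,weekday,100,E-17
20240304,weekday,101,
20240305,weekday,100,E-17
"""


def unassigned_runs(text):
    """Return (date, service_id, run_id) for every row with a blank employee_id."""
    return [
        (row["date"], row["service_id"], row["run_id"])
        for row in csv.DictReader(io.StringIO(text))
        if not row["employee_id"]
    ]


open_runs = unassigned_runs(EMPLOYEE_ASSIGNMENTS)
```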

Not included: The type of job being performed is described in run_events.txt, not this file. There was also discussion of order_in_driving_direction, but that could change throughout a run, and if we need that, it'd be better to add it to run_events.

Uniqueness:

There was some discussion (comment by hodb) about "cutting" a run between different employees. If this was about scheduling an employee to a run on some dates but not every date, then that's already handled because this file would assign employees on one date at a time. If this was about scheduling multiple people to the same run on the same date, then what I've written above wouldn't work, and it'd need to change. @hodb, which case were you describing?

If we do need it, here are some other ideas for how to handle scheduling an employee to only part of a run on a day:

Edit: This was all about realtime adjustments to the roster, not the schedule, so is out of scope.

Comparison to #45

#45 is another proposal about assigning people to runs. There are two main differences:

First, instead of employee ids, it gives roster ids. The roster groups together runs that one person would do across multiple days, but doesn't say which person it's assigned to.

In the working group, we'll need to decide whether we want to include specific employees (as in this issue), the bid/roster (as in #45), or both. I lean towards employees being the right level of abstraction to include (I justify that here, in the 3rd section), but it depends on what uses people have for TODS.

Second, instead of working date-by-date, #45 works week-by-week, giving a Monday schedule, a Tuesday schedule, etc. This is similar to the difference between calendar.txt and calendar_dates.txt. We could work with either approach alone, or both together.

This file would require a lot of rows (one per run per day), and doing it week by week would be more compact. But this file also makes it easier to deal with vacations and irregular schedules, which I think is valuable given that employees' schedules are less likely to be regular than the service schedules in public GTFS.
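The trade-off can be sketched by expanding a compact weekly pattern into date-by-date rows, with per-date exceptions layered on top, much as calendar_dates.txt overrides calendar.txt. The pattern structure, ids, and dates below are hypothetical and are not taken from #45.

```python
from datetime import date, timedelta

# Hypothetical weekly pattern: weekday index (0 = Monday) -> run_id.
# Days missing from the pattern are days off.
weekly_pattern = {0: "100", 1: "100", 2: "100", 3: "205", 4: "205"}

# Hypothetical per-date overrides; None means not working (e.g. vacation).
exceptions = {date(2024, 3, 6): None}


def expand(employee_id, weekly_pattern, start, end, exceptions=None):
    """Expand a weekly pattern into (date, run_id, employee_id) rows."""
    exceptions = exceptions or {}
    rows = []
    day = start
    while day <= end:
        if day in exceptions:
            run_id = exceptions[day]
        else:
            run_id = weekly_pattern.get(day.weekday())
        if run_id is not None:
            rows.append((day.strftime("%Y%m%d"), run_id, employee_id))
        day += timedelta(days=1)
    return rows


# One week, Monday 2024-03-04 through Sunday 2024-03-10.
rows = expand("E-17", weekly_pattern, date(2024, 3, 4), date(2024, 3, 10), exceptions)
```

The weekly form needs five entries plus one exception; the date-by-date form needs one row per worked day but represents the vacation day simply by having no row for it.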

We don't have to do only #28 or only #45. We could do both, or recombine the two in new ways, according to what information people want represented in TODS.

Summary

In summary, this issue is moving again, but there are still some major decisions to be made about which information and use cases are important. We'll discuss some of it in the working group meetings, but a lot of things are a lot easier to bring up in writing, so leave comments!

timon-k commented 3 months ago

@skyqrose Thanks for wrapping up the current state so thoroughly. The draft in this ticket looks good to me already, I just wanted to leave feedback on selected points:

hodb commented 2 months ago

Thanks for the detailed analysis @skyqrose! The use case I was referring to is when a driver doesn't complete their scheduled run and another driver needs to take it over. Dispatchers may deal with unexpected situations during the day that disrupt a trip or a driver's day: a malfunctioning vehicle, a driver getting sick during the work day, a bus breaking down in the middle of a trip, a driver's workload being removed, and so on. We're not talking about multi-assignment, but about the ability to record modifications to the runs and blocks with some reference. It is fair to say that a duty/run can be "cut" and split between different drivers. In that case, we are creating subsets of the original duty, performed by different drivers. For instance, the original duty 123 was planned for driver A but ended up being split between A and B. Having the cut means we may need a reference to the run_events.txt file to indicate that this duty is done by different drivers.

skyqrose commented 2 months ago

Ah, that makes sense, and it'd be nice to have a spec for that, but TODS so far has just been scheduled data, and I think the scope for now should just be to represent the runs as they're assigned in the schedule. In that case, then it seems like runs would never be scheduled to be split between multiple people.

Realtime changes to the schedule would then have to be in a future TODS-Realtime specification, similar to how GTFS-RT is a separate spec for realtime changes to the schedule in GTFS.

timon-k commented 2 months ago

Revising my comment above: Perhaps we should omit service_id from the assignments.

If we do have it, it would make more sense to have it in both files (vehicle and employee assignments) to keep them consistent. If run IDs are not unique on a given date, then block IDs would perhaps also not be unique.

But it also does not make much sense to have multiple runs or blocks on the same date with the same ID. I think we can assume that real-world usage will have unique block and run IDs on every single date.

skyqrose commented 2 months ago

Two reasons to include service_id, given that runs are not unique on different services, even if they're unique within all services on the same date:

  1. It makes it easier to look up the run in run_events.txt. If the service isn't listed here, you have to look up in the calendar which services are active, and then find an entry in run_events.txt that matches the run_id and also happens on any one of those services.
  2. It could help prevent errors from mixing up runs on different days with the same id. I've had bugs in the past due to calendar exceptions not being applied correctly (holidays, track work, storm schedules), which led to looking at the wrong schedule (Weekday run 100 instead of Holiday run 100), and nonsense data. Always referring to a run by its full (service_id, run_id) pair prevents those bugs.

And I think all this applies to blocks and vehicle_assignments as well.
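The two lookup paths can be sketched as follows. With service_id present, the run is a direct key lookup; without it, the consumer has to go through the calendar and hope exactly one active service defines that run_id. The dictionaries below are a hypothetical, heavily reduced stand-in for run_events.txt and the calendar files.

```python
# Hypothetical reduced view of run_events.txt:
# (service_id, run_id) -> time of the run's first event.
RUN_START = {
    ("weekday", "100"): "05:10:00",
    ("holiday", "100"): "07:30:00",
}

# Hypothetical result of resolving calendar.txt / calendar_dates.txt.
ACTIVE_SERVICES = {
    "20240527": ["holiday"],
    "20240528": ["weekday"],
}


def find_run_start(date, run_id, service_id=None):
    """Look up a run directly by (service_id, run_id), or fall back to
    searching the services active on the given date."""
    if service_id is not None:
        return RUN_START[(service_id, run_id)]
    candidates = [s for s in ACTIVE_SERVICES[date] if (s, run_id) in RUN_START]
    if len(candidates) != 1:
        raise LookupError(f"run {run_id} on {date} matches services {candidates}")
    return RUN_START[(candidates[0], run_id)]


direct = find_run_start("20240527", "100", service_id="holiday")
via_calendar = find_run_start("20240527", "100")
```

Both calls agree here only because the calendar was resolved correctly. If a calendar exception were missed and 20240527 resolved to the weekday service, the fallback path would silently return the Weekday run 100 start time, which is the class of bug described above.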

timon-k commented 2 months ago

Makes sense, but then let's include the service_id in both assignment files?