MobilityData / gbfs

Documentation for the General Bikeshare Feed Specification, a standardized data feed for shared mobility system availability. Maintained by MobilityData
https://gbfs.org

Future availability of all vehicles in the system #616

Open edwinvandenbelt opened 3 months ago

edwinvandenbelt commented 3 months ago

Who am I?

Edwin van den Belt, Software architect in the Netherlands, representing the TOMP-API.

What is the issue and why is it an issue?

We started creating the TOMP-API in a RESTful way, using the GBFS definitions. We extended some of these files with additional functionality: searching for availability in the future. This also required publishing all vehicles, without status information (GBFS shows only the vehicles available in the 'here and now'). This is mainly needed for reservation purposes, and GBFS cannot handle it right now.

Please describe some potential solutions you have considered (even if they aren’t related to GBFS).

For more information, look at: https://github.com/TOMP-WG/TOMP-API/blob/master/documents/presentations/TOMP-API%20-%20GBFS%20-%20availability.pptx

Is your potential solution a breaking change?

matt-wirtz commented 2 months ago

This issue covers a very important topic for us as a provider of journey planning systems. Our goal is to only provide journey itineraries where all legs are available. For public transport legs that means, for example, that the trip is not canceled. For legs with shared vehicles that means that a vehicle is actually available. For sharing offers that use time-slot booking, it is known when a vehicle will be available, even in the future. So journey planning systems can include travel options with shared vehicles only if they are available at the desired time, even if journeys are planned for next week. Right now GBFS only offers a single "available_until" parameter, and only for vehicles which are currently available. So a vehicle that is available next week but not today would not be considered as an option, even if the user requests to travel next week.

testower commented 1 month ago

This seems to overlap with #612 , maybe you should join forces?

I'm definitely not in favor of extending the existing files with this information.

richfab commented 1 month ago

What does the community think about adding a new (optional) endpoint listing every vehicle_id in the system and their future availability?

futuretap commented 1 month ago

I do think there's potential to model (fixed) reservations in the future. One example for this is the CommonsBooking WordPress plugin. While it has rudimentary GBFS support (partly broken due to a timezone issue), the GBFS feed can only reflect the current status even though the system manages reservations in the future.

In general, reservations (days or weeks ahead into the future) are probably most common in the field of cars or cargo bikes, but it's still worth exploring, imo.

testower commented 1 month ago

It's a good point that there's a difference between forecasting availability based on statistical models, and having knowledge of future reservations, but they belong to the same broader category of information: future availability of vehicles. So they should at least be considered together on some level.

matt-wirtz commented 4 weeks ago

Hi everyone. I have created a first, straightforward proposal. Please feel free to give feedback.

vehicle_availability

This endpoint would provide the current and future availability of all vehicles that operate in a time-slot-reservation mode, meaning that these vehicles can be booked in advance. Usually this mode is applied to station-based vehicles like shared cars.

| Field Name | REQUIRED | Type | Defines |
|---|---|---|---|
| vehicles | Yes | Array | Array of all vehicles. |
| vehicles[].id | Yes | ID | Identifier of the vehicle. |
| vehicles[].vehicle_type_id | Yes | ID | The vehicle_type_id of this vehicle. |
| vehicles[].station_id | Yes | ID | Identifier referencing the station_id field in station_information.json. |
| vehicles[].vehicle_equipment | Optional | Array | List of vehicle equipment provided by the operator in addition to the accessories already provided in the vehicle (field vehicle_accessories of vehicle_types.json) but subject to more frequent updates. |
| vehicles[].pricing_plan_id | Optional | ID | The plan_id of the pricing plan this vehicle is eligible for as described in system_pricing_plans.json. |
| vehicles[].availability[] | Yes | Array | Array of all time periods where the vehicle is available for bookings. |
| vehicles[].availability[].from | Yes | Datetime | Start time of availability time frame. |
| vehicles[].availability[].until | Yes | Datetime | End time of availability time frame. |

example

{
  "last_updated": "2024-12-24T13:34:13+02:00",
  "ttl": 0,
  "version": "3.x-RC",
  "data": {
    "vehicles": [
      {
        "vehicle_id": "45bd3fb7-a2d5-4def-9de1-c645844ba962",
        "vehicle_type_id": "abc123",
        "station_id": "market_place_A12",
        "vehicle_equipment": [
            "snow_chains"
        ],
        "pricing_plan_id": "fun_22",
        "availability": [
          { "from": "2024-12-24T18:20Z", "until": "2024-12-25T08:20Z" },
          { "from": "2024-12-25T10:20Z", "until": "2025-02-24T12:20Z" }
        ]
      }
    ]
  }
}
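
A minimal sketch of how a consumer might use such a feed, assuming the field names and the minute-precision timestamp layout of the example above. The function name `is_bookable` is hypothetical and not part of any proposal. The check requires one single availability window to cover the whole requested period.

```python
from datetime import datetime

# Timestamp layout used in the example above (minute precision, "Z" suffix).
# This layout is an assumption taken from the example, not a GBFS rule.
FMT = "%Y-%m-%dT%H:%M%z"

def parse(ts: str) -> datetime:
    return datetime.strptime(ts, FMT)

def is_bookable(vehicle: dict, start: str, end: str) -> bool:
    """True if one availability window covers the whole requested period."""
    s, e = parse(start), parse(end)
    return any(
        parse(w["from"]) <= s and e <= parse(w["until"])
        for w in vehicle["availability"]
    )

vehicle = {
    "vehicle_id": "45bd3fb7-a2d5-4def-9de1-c645844ba962",
    "availability": [
        {"from": "2024-12-24T18:20Z", "until": "2024-12-25T08:20Z"},
        {"from": "2024-12-25T10:20Z", "until": "2025-02-24T12:20Z"},
    ],
}

# Fits inside the first window.
print(is_bookable(vehicle, "2024-12-24T19:00Z", "2024-12-24T22:00Z"))  # True
# Spans the gap between the two windows.
print(is_bookable(vehicle, "2024-12-25T07:00Z", "2024-12-25T11:00Z"))  # False
```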

some remarks

I opted to include the availability time frames, which seem easier to work with from the consumer side. It might look different for the producer, for whom the not_available times (the booked times) might be easier to provide. Rental URIs are not included, since the desired time of the booking is of course not known by the sharing operator. Filter options and paging, as requested in the issue, are missing. They would need to be added when the endpoint becomes part of, e.g., a RESTful API.
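
For the producer side, a rough sketch of how booked times could be turned into availability time frames. It assumes the producer knows the bookings as (start, end) pairs and publishes availability only up to some planning horizon. The names `free_windows` and `at` and the horizon itself are hypothetical.

```python
from datetime import datetime, timezone

def free_windows(booked, horizon_start, horizon_end):
    """Return the gaps between booked (start, end) pairs inside the horizon."""
    windows = []
    cursor = horizon_start
    for start, end in sorted(booked):
        if start > cursor:
            # Free time between the last booking end and this booking start.
            windows.append((cursor, min(start, horizon_end)))
        cursor = max(cursor, end)
        if cursor >= horizon_end:
            break
    if cursor < horizon_end:
        windows.append((cursor, horizon_end))
    return windows

def at(hour, minute=0):
    # Hypothetical helper: a time on 2024-12-24 in UTC.
    return datetime(2024, 12, 24, hour, minute, tzinfo=timezone.utc)

booked = [(at(8), at(8, 45)), (at(10), at(10, 45))]
for start, end in free_windows(booked, at(8), at(12)):
    print({"from": start.isoformat(), "until": end.isoformat()})
```

With these two bookings the sketch yields two free windows, 08:45 to 10:00 and 10:45 to 12:00.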

futuretap commented 4 weeks ago

Thanks for the proposal!

I question the duplication of fields from the vehicle_status feed such as vehicle_type_id, station_id, vehicle_equipment, and pricing_plan_id. I'd suggest a pure availability feed instead. The station id might be a special case since there's the possibility of vehicles moving stations which is planned in advance (e.g. CommonsBooking allows this). So station_id should be an optional field in an availability slot object.

Ideally, the availability feed should optionally be filterable either by vehicle_id or by timeframe. I assume a common use case will be to fetch availability info for a specific vehicle when looking at its details, or to see availabilities for all vehicles at a specific future point in time.

matt-wirtz commented 4 weeks ago

Yes, initially I didn't include vehicle_type_id, station_id, vehicle_equipment, and pricing_plan_id. But how could you get those parameters if the vehicle is not currently available for rental? In that case it would not show up in vehicle_status, because only available vehicles are listed there.

That's why I included these parameters here too.

Regarding the filtering. I agree with you but would add that station_id and vehicle_type_id might be useful too. In car sharing use cases it's very common to only look for a specific type of vehicle like "van". If you are looking for a van you are typically not interested in small sized cars.
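
As an illustration only, filtering by station and vehicle type could look like this on the consumer side. The function name `filter_vehicles` and the IDs in the sample data are made up, and a real API would more likely offer such filters as query parameters.

```python
def filter_vehicles(vehicles, vehicle_type_id=None, station_id=None):
    """Keep vehicles matching every criterion that is not None."""
    def matches(v):
        return (
            (vehicle_type_id is None or v.get("vehicle_type_id") == vehicle_type_id)
            and (station_id is None or v.get("station_id") == station_id)
        )
    return [v for v in vehicles if matches(v)]

# Hypothetical sample data using the field names of the proposal.
vehicles = [
    {"vehicle_id": "a", "vehicle_type_id": "van", "station_id": "market_place_A12"},
    {"vehicle_id": "b", "vehicle_type_id": "small_car", "station_id": "market_place_A12"},
    {"vehicle_id": "c", "vehicle_type_id": "van", "station_id": "main_station"},
]

vans_here = filter_vehicles(vehicles, vehicle_type_id="van", station_id="market_place_A12")
print([v["vehicle_id"] for v in vans_here])  # ['a']
```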

futuretap commented 4 weeks ago

You have a point. However, we then need all fields including lat/lon, rental_uris etc. Then this feed would be an extended version of the vehicle_status feed and consumers would no longer fetch the vehicle_status feed when they're interested in future availability; they'd fetch vehicle_availability instead.

richfab commented 2 weeks ago

From speaking with operators, I anticipate that many of them will not wish to publish the total number of vehicles per type they have in their fleet in open data due to competition.

To avoid this, a solution could be to extend vehicle_types.json and indicate when at least one vehicle of that type is available at a given location. This would also avoid duplication of fields. We would lose vehicle_equipment but maybe it's a compromise we can accept.

Example (vehicle_types.json):

{
  "last_updated": "2023-07-17T13:34:13+02:00",
  "ttl": 0,
  "version": "3.x-RC",
  "data": {
    "vehicle_types": [
      {
        "vehicle_type_id": "abc123",
        "availability": [
          { 
            "station_id": "market_place_A12", //dock based systems
            "availability_windows": [ //intervals can overlap
              {"from": "2024-12-24T08:15Z", "until": "2024-12-24T09:15Z" },
              {"from": "2024-12-24T08:45Z", "until": "2024-12-24T10:00Z" }
            ]
          },
          { 
            "lat": 12.345678, //dockless systems
            "lon": 56.789012,
            "availability_windows": [ { "from": "2024-12-25T10:20Z", "until": "2025-02-24T12:20Z" } ]
          }
        ],
        "default_pricing_plan_id": "bike_plan_1",
        "pricing_plan_ids": [
          "bike_plan_1",
          "bike_plan_2",
          "bike_plan_3"
        ]
      }
      // other vehicle types...
    ]
  }
}

Looking forward to hearing your feedback!

testower commented 2 weeks ago

I think @richfab is onto something I had in mind but couldn't quite articulate. I think availability at the vehicle_type level is probably the best approach, but I would still have it in a separate dedicated file, such as vehicle_availability.json. After all, are users interested in the availability of a specific vehicle, or the availability of a vehicle with certain features? I suspect the latter.

matt-wirtz commented 2 weeks ago

@richfab Thanks for the input. From your example I cannot clearly see whether the availability_windows are distinct time intervals or overlapping ones. In the former case I don't think this would work.

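As an example, the existing booked (X) and available (0) times of two vehicles at one station in 15-minute time windows:

| vehicle | 08:00 | 08:15 | 08:30 | 08:45 | 09:00 | 09:15 | 09:30 | 09:45 | 10:00 | 10:15 | 10:30 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| v1 | X | X | X | 0 | 0 | 0 | 0 | 0 | X | X | X |
| v2 | X | 0 | 0 | 0 | 0 | X | X | X | X | X | 0 |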

The availability_windows probably are then:

But it's not possible to book a vehicle from 8:15am to 9:45am.
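
The problem with distinct (merged) windows can be reproduced with a short sketch. It assumes the booking grid above, encoded as one string per vehicle, and that "distinct" means merging every slot in which at least one vehicle is free. The helper names are hypothetical.

```python
# Slot start times; the last entry only closes the final 10:30 slot.
SLOTS = ["08:00", "08:15", "08:30", "08:45", "09:00", "09:15",
         "09:30", "09:45", "10:00", "10:15", "10:30", "10:45"]

# Booking grid from the table above: X = booked, 0 = available.
GRID = {
    "v1": "XXX00000XXX",
    "v2": "X0000XXXXX0",
}

def runs(flags):
    """Turn a per-slot list of booleans into (from, until) windows."""
    windows, start = [], None
    for i, free in enumerate(list(flags) + [False]):
        if free and start is None:
            start = i
        elif not free and start is not None:
            windows.append((SLOTS[start], SLOTS[i]))
            start = None
    return windows

def covers(windows, start, end):
    """True if a single window spans the whole request (HH:MM strings)."""
    return any(w_from <= start and end <= w_until for w_from, w_until in windows)

per_vehicle = {v: runs([c == "0" for c in row]) for v, row in GRID.items()}
merged = runs([any(row[i] == "0" for row in GRID.values())
               for i in range(len(SLOTS) - 1)])

print(merged)  # [('08:15', '10:00'), ('10:30', '10:45')]
# The merged window suggests that 08:15 to 09:45 is bookable...
print(covers(merged, "08:15", "09:45"))  # True
# ...but no single vehicle is free for that whole period.
print(any(covers(w, "08:15", "09:45") for w in per_vehicle.values()))  # False
```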

In the latter case with overlapping availability_windows the windows would look like this:

From this one could probably infer back to the number of vehicles the operator has in service at this station. At least get an approximation for it. So the intent to hide that information is at least to some extent undermined.
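
How much can be inferred is easy to sketch. Assuming the overlapping windows are the three listed later in this thread (08:15 to 09:15, 08:45 to 10:00, 10:30 to 10:45), and that two overlapping windows must belong to different vehicles, the largest number of windows open at the same time is a lower bound on the fleet size at the station. The function name `min_vehicles` is hypothetical.

```python
def min_vehicles(windows):
    """Largest number of simultaneously open windows (a lower bound only)."""
    events = []
    for start, end in windows:
        events.append((start, 1))
        events.append((end, -1))
    # Sorting puts (t, -1) before (t, 1), so a window ending at t
    # does not count as overlapping one that starts at t.
    events.sort()
    current = best = 0
    for _, delta in events:
        current += delta
        best = max(best, current)
    return best

windows = [("08:15", "09:15"), ("08:45", "10:00"), ("10:30", "10:45")]
print(min_vehicles(windows))  # 2
```

The result says nothing about vehicles that are fully booked or whose free time lies inside another vehicle's window, so it bounds the fleet size from below only.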

For now I'm leaving aside the question of whether the overlapping time windows are usable for trip planning systems or end user apps. But I would like to question if the future availability information needs to be open data in the sense of openly accessible, exploitable, editable and shared by anyone for any purpose. I don't think that this should be a requirement.

If a producer doesn't wish to publish it as open data - which I think is perfectly understandable - it can be shared only with trusted partners, for example with a multimodal booking app. If booking vehicles via this app is possible, then a contractual agreement will probably be necessary anyhow. And on this basis the sharing of the future availability information could be regulated too.

mplsmitch commented 2 weeks ago

> But I would like to question if the future availability information needs to be open data in the sense of openly accessible, exploitable, editable and shared by anyone for any purpose. I don't think that this should be a requirement.

Public data has always been a guiding principle of the specification. If access to the data requires a contractual agreement, then by definition it's not public. Were this not a foundational part of the project, we would have included bookings/reservations/payments/persistent IDs etc. My fear is that if we introduce a new file that's not public (requires auth), then the providers will simply require auth for all the files in the feed. If there's a way that we can serve this use case out in the open, I'm all for it. If it requires restricting access to the data then I'm not in favor.

Doesn't TOMP already do this?

edwinvandenbelt commented 2 weeks ago

Hello Mitch,

Indeed, TOMP has something for this, but we're investigating how to reduce the overlap with GBFS, to make them work together. If we can hand this part over to GBFS, 'separation of concerns' is applied: GBFS for data, e.g. published for planning, and TOMP can start when planning/booking is in scope.

How about standardising the future availability within GBFS, without specifying that this is a non-public file? To me they look quite similar: the current and the future availability. Why is the current availability public, and why should the future availability be non-public? If an implementing party doesn't want to publish it for the general public, so be it. That's also the current situation with the available vehicles dataset. Some parties don't want to expose it publicly.

Regards, Edwin


richfab commented 5 days ago

Hello everyone,

@testower Regarding representing this information in the existing file vehicle_types.json or in a separate file (e.g. vehicle_type_availability.json), I don't have a strong preference.

@matt-wirtz Excellent point about overlapping time intervals (a given vehicle must be available for the entire duration of the reservation). I modified my example to show overlapping intervals.

> one could probably infer back to the number of vehicles the operator has in service at this station. At least get an approximation for it

I think that the approximation of the total number of vehicles per type in the fleet from time interval overlaps is not precise enough to be usable by the competitors.

Indeed, in the example with the 3 time intervals, one can only deduce that the station has a minimum of 2 vehicles, but it can have more.

With the same time intervals as in your example...

  • "from": "T08:15Z", "until": "T09:15Z"
  • "from": "T08:45Z", "until": "T10:00Z"
  • "from": "T10:30Z", "until": "T10:45Z"

... the station could have 2+n vehicles:

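| vehicle | 08:00 | 08:15 | 08:30 | 08:45 | 09:00 | 09:15 | 09:30 | 09:45 | 10:00 | 10:15 | 10:30 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| v1 | X | X | X | 0 | 0 | 0 | 0 | 0 | X | X | X |
| v2 | X | 0 | 0 | 0 | 0 | X | X | X | X | X | 0 |
| v3 | X | X | 0 | 0 | X | X | X | X | X | X | X |
| v4 | X | X | X | X | X | X | X | X | X | X | X |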
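
A small sketch can check this claim under one assumed aggregation rule: the published windows are the per-vehicle free windows, minus any window that lies entirely inside another one. That rule and the function name `published_windows` are assumptions made for illustration and are not taken from the thread. The per-vehicle windows below are read off the table above.

```python
def published_windows(fleet):
    """Per-vehicle windows, dropping any window contained in another one."""
    all_windows = {w for windows in fleet.values() for w in windows}

    def contained(w):
        return any(o != w and o[0] <= w[0] and w[1] <= o[1] for o in all_windows)

    return sorted(w for w in all_windows if not contained(w))

# Free windows per vehicle, read off the table above (HH:MM strings).
fleet_of_two = {
    "v1": [("08:45", "10:00")],
    "v2": [("08:15", "09:15"), ("10:30", "10:45")],
}
fleet_of_four = {
    **fleet_of_two,
    "v3": [("08:30", "09:00")],  # lies inside v2's first window
    "v4": [],                    # fully booked
}

print(published_windows(fleet_of_two))
print(published_windows(fleet_of_two) == published_windows(fleet_of_four))  # True
```

Under this rule both fleets publish the same three windows, so a consumer could not tell a station with two vehicles from one with four.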

Thoughts?

matt-wirtz commented 4 days ago

@richfab To think about it more thoroughly I think it's necessary to understand how the availability_windows are calculated. Let me try it:

Maybe there is a simpler description of how to do it but let's first check if this exactly describes your idea.

richfab commented 4 days ago

Thanks for clarifying @matt-wirtz.

If I'm not mistaken, the table of stations with 4 vehicles matches your description of availability_windows:

My intention is to show that the approximation of the total number of vehicles per type at a given station is not precise enough to be usable by the competitors (from the same set of time intervals, the station could have 2, 3, 4, n vehicles). So this aggregated data can be opened without causing problems for operators, in my opinion.

Please let me know if I misunderstood anything. Thanks!