Future availability of all vehicles in the system

edwinvandenbelt commented 7 months ago

Who am I?

Edwin van den Belt, Software architect in the Netherlands, representing the TOMP-API.

What is the issue and why is it an issue?

We started creating the TOMP-API in a RESTful way, using the GBFS definitions. But we extended some of these files with additional functionality: searching for availability in the future. This also required publishing all vehicles, without status information (GBFS shows only the vehicles available in the 'here-and-now'). This is mainly required for reservation purposes, but GBFS cannot handle it right now.

Please describe some potential solutions you have considered (even if they aren’t related to GBFS).

- Extend the GBFS file vehicle_status with an array of available time slots (but in this case, you have to show all bikes).
- Use a RESTful API (described using an OpenAPI spec) as an 'add-on' for the vehicles, exposing all vehicles while allowing multiple filters and paging to cope with large data sets (see #617).

For more information, look at: https://github.com/TOMP-WG/TOMP-API/blob/master/documents/presentations/TOMP-API%20-%20GBFS%20-%20availability.pptx

Is your potential solution a breaking change?

[ ] Yes
[ ] No
[X] Unsure, it is an add-on

matt-wirtz commented 6 months ago

This issue covers a very important topic for us as a provider of journey planning systems. Our goal is to only provide journey itineraries where all legs are available. For public transport legs that means that the trip e.g. is not canceled. For legs with shared vehicles that means that a vehicle is actually available. For sharing offers that utilize a time slot booking it is known when a vehicle is available even in the future. So for journey planning systems it's possible to only include travel options with shared vehicles if they are available at the desired time. Even if journeys are planned for next week. Right now GBFS only offers a single "available_until" parameter. And only for vehicles which are currently available. So if a vehicle would be available next week but not today it would not be considered as an option even if the user requests to travel next week.

testower commented 5 months ago

This seems to overlap with #612 , maybe you should join forces?

I'm definitely not in favor of extending the existing files with this information.

richfab commented 5 months ago

What does the community think about adding a new (optional) endpoint listing every vehicle_id in the system and their future availability?

futuretap commented 5 months ago

I do think there's potential to model (fixed) reservations in the future. One example for this is the CommonsBooking WordPress plugin. While it has rudimentary GBFS support (partly broken due to a timezone issue), the GBFS feed can only reflect the current status even though the system manages reservations in the future.

In general, reservations (days or weeks ahead into the future) are probably most common in the field of cars or cargo bikes, but it's still worth exploring, imo.

testower commented 5 months ago

It's a good point that there's a difference between forecasting availability based on statistical models, and having knowledge of future reservations, but they belong to the same broader category of information: future availability of vehicles. So they should at least be considered together on some level.

matt-wirtz commented 5 months ago

Hi everyone. I have created a first, straightforward proposal. Please feel free to give feedback.

vehicle_availability

This endpoint would provide the current and future availability of all vehicles that operate in a time-slot-reservation mode, meaning that these vehicles can be booked in advance. This mode usually applies to station-based vehicles such as shared cars.

Field Name	REQUIRED	Type	Defines
vehicles	Yes	Array	Array of all vehicles.
vehicles[].vehicle_id	Yes	ID	Identifier of the vehicle
vehicles[].vehicle_type_id	Yes	ID	The vehicle_type_id of this vehicle.
vehicles[].station_id	Yes	ID	Identifier referencing the station_id field in station_information.json.
vehicles[].vehicle_equipment	Optional	Array	List of vehicle equipment provided by the operator in addition to the accessories already provided in the vehicle (field vehicle_accessories of vehicle_types.json) but subject to more frequent updates.
vehicles[].pricing_plan_id	Optional	ID	The plan_id of the pricing plan this vehicle is eligible for as described in system_pricing_plans.json.
vehicles[].availability[]	Yes	Array	Array of all time periods where the vehicle is available for bookings.
vehicles[].availability[].from	Yes	Datetime	Start time of availability time frame.
vehicles[].availability[].until	Yes	Datetime	End time of availability time frame.

example

{
  "last_updated": "2024-12-24T13:34:13+02:00",
  "ttl": 0,
  "version": "3.x-RC",
  "data": {
    "vehicles": [
      {
        "vehicle_id": "45bd3fb7-a2d5-4def-9de1-c645844ba962",
        "vehicle_type_id": "abc123",
        "station_id": "market_place_A12",
        "vehicle_equipment": [
            "snow_chains"
        ],
        "pricing_plan_id": "fun_22",
        "availability": [
          { "from": "2024-12-24T18:20Z", "until": "2024-12-25T08:20Z" },
          { "from": "2024-12-25T10:20Z", "until": "2025-02-24T12:20Z" }
        ]
      }
    ]
  }
}

some remarks

I opted to include the availability time frames, which seem easier to work with from the consumer side. It might look different for the producer, for whom the not_available times (the booked times) might be easier to provide. Rental URIs are not included, since the sharing operator of course does not know the desired booking time. Filter options and paging, as requested in the issue, are still missing. They would need to be added when the endpoint becomes part of e.g. a RESTful API.
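
To illustrate the consumer side, here is a minimal sketch of how a trip planner might check the proposed feed for a requested booking period. The function names `parse_utc` and `bookable_vehicle_ids` are hypothetical, the feed is a trimmed copy of the example above, and the sketch assumes a booking must fit entirely inside one availability time frame.

```python
from datetime import datetime


def parse_utc(ts: str) -> datetime:
    # The example timestamps end in "Z"; rewrite that as an explicit UTC
    # offset so datetime.fromisoformat also accepts it on older Pythons.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def bookable_vehicle_ids(feed: dict, start: str, end: str) -> list:
    """Return ids of vehicles with one availability slot covering [start, end]."""
    wanted_from, wanted_until = parse_utc(start), parse_utc(end)
    result = []
    for vehicle in feed["data"]["vehicles"]:
        for slot in vehicle["availability"]:
            slot_from = parse_utc(slot["from"])
            slot_until = parse_utc(slot["until"])
            if slot_from <= wanted_from and wanted_until <= slot_until:
                result.append(vehicle["vehicle_id"])
                break  # one covering slot is enough for this vehicle
    return result


# Trimmed copy of the example feed from the proposal.
feed = {
    "data": {
        "vehicles": [
            {
                "vehicle_id": "45bd3fb7-a2d5-4def-9de1-c645844ba962",
                "availability": [
                    {"from": "2024-12-24T18:20Z", "until": "2024-12-25T08:20Z"},
                    {"from": "2024-12-25T10:20Z", "until": "2025-02-24T12:20Z"},
                ],
            }
        ]
    }
}

evening = bookable_vehicle_ids(feed, "2024-12-24T19:00Z", "2024-12-24T22:00Z")
across_gap = bookable_vehicle_ids(feed, "2024-12-25T07:00Z", "2024-12-25T11:00Z")
print(evening, across_gap)
```

The second request spans the gap between the two time frames, so the vehicle is not offered for it.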

futuretap commented 5 months ago

Thanks for the proposal!

I question the duplication of fields from the vehicle_status feed such as vehicle_type_id, station_id, vehicle_equipment, and pricing_plan_id. I'd suggest a pure availability feed instead. The station id might be a special case since there's the possibility of vehicles moving stations which is planned in advance (e.g. CommonsBooking allows this). So station_id should be an optional field in an availability slot object.

Ideally, the availability feed should optionally be filterable either by vehicle_id or by timeframe. I assume a common use case will be to fetch availability info for a specific vehicle when looking at its details, or to see availabilities for all vehicles at a specific future point in time.

matt-wirtz commented 5 months ago

Yes, initially I didn't include vehicle_type_id, station_id, vehicle_equipment, and pricing_plan_id. But how could you get those parameters in case the vehicle is right now not available for rental? In that case it would not show up in vehicle_status because only available vehicles are listed in here.

That's why I included these parameters here too.

Regarding the filtering. I agree with you but would add that station_id and vehicle_type_id might be useful too. In car sharing use cases it's very common to only look for a specific type of vehicle like "van". If you are looking for a van you are typically not interested in small sized cars.

futuretap commented 5 months ago

You have a point. However, we then need all fields including lat/lon, rental_uris etc. Then this feed would be an extended version of the vehicle_status feed and consumers would no longer fetch the vehicle_status feed when they're interested in future availability; they'd fetch vehicle_availability instead.

richfab commented 4 months ago

From speaking with operators, I anticipate that many of them will not wish to publish the total number of vehicles per type they have in their fleet in open data due to competition.

To avoid this, a solution could be to extend vehicle_types.json and indicate when at least one vehicle of that type is available at a given location. This would also avoid duplication of fields. We would lose vehicle_equipment but maybe it's a compromise we can accept.

Example (vehicle_types.json):

{
  "last_updated": "2023-07-17T13:34:13+02:00",
  "ttl": 0,
  "version": "3.x-RC",
  "data": {
    "vehicle_types": [
      {
        "vehicle_type_id": "abc123",
        "availability": [
          { 
            "station_id": "market_place_A12", //dock based systems
            "availability_windows": [ //intervals can overlap
              {"from": "2024-12-24T08:15Z", "until": "2024-12-24T09:15Z" },
              {"from": "2024-12-24T08:45Z", "until": "2024-12-24T10:00Z" }
            ]
          },
          { 
            "lat": 12.345678, //dockless systems
            "lon": 56.789012,
            "availability_windows": [ { "from": "2024-12-25T10:20Z", "until": "2025-02-24T12:20Z" } ]
          }
        ],
        "default_pricing_plan_id": "bike_plan_1",
        "pricing_plan_ids": [
          "bike_plan_1",
          "bike_plan_2",
          "bike_plan_3"
        ]
      }
      // other vehicle types...
    ]
  }
}

Looking forward to hearing your feedback!

testower commented 4 months ago

I think @richfab is onto something I had in mind but couldn't quite articulate. I think availability at the vehicle_type level is probably the best approach, but I would still have it in a separate dedicated file, such as vehicle_availability.json. After all, are users interested in the availability of a specific vehicle, or in the availability of a vehicle with certain features? I suspect the latter.

matt-wirtz commented 4 months ago

@richfab Thanks for the input. From your example I can not clearly see if the availability_windows are distinct time intervals or overlapping ones. In the former case I don't think that this is working.

As an example, the existing booked (X) and available (0) times of two vehicles at one station in 15-minute time windows:

vehicle	08:00	08:15	08:30	08:45	09:00	09:15	09:30	09:45	10:00	10:15	10:30
v1	X	X	X	0	0	0	0	0	X	X	X
v2	X	0	0	0	0	X	X	X	X	X	0

The availability_windows probably are then:

"from": "T08:15Z", "until": "10:00Z"
"from": "T10:30Z", "until": "10:45Z"

But it's not possible to book a vehicle from 8:15am to 9:45am.
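
A small sketch of this problem, using the two vehicles from the table above. The helper names `merge` and `covers` are hypothetical, and times are kept as zero-padded HH:MM strings, which compare correctly as plain strings.

```python
def merge(windows):
    """Union of windows: overlapping or touching windows become one."""
    merged = []
    for start, end in sorted(windows):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covers(windows, start, end):
    """True if a single window contains the whole requested period."""
    return any(s <= start and end <= e for s, e in windows)


# Per-vehicle availability read off the table (v1 and v2).
per_vehicle = {
    "v1": [("08:45", "10:00")],
    "v2": [("08:15", "09:15"), ("10:30", "10:45")],
}

all_windows = [w for windows in per_vehicle.values() for w in windows]
merged = merge(all_windows)

# The merged windows suggest the booking is possible ...
looks_bookable = covers(merged, "08:15", "09:45")
# ... but no single vehicle is free for the whole period.
really_bookable = any(covers(w, "08:15", "09:45") for w in per_vehicle.values())
print(merged, looks_bookable, really_bookable)
```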

In the latter case with overlapping availability_windows the windows would look like this:

"from": "T08:15Z", "until": "09:15Z"
"from": "T08:45Z", "until": "10:00Z"
"from": "T10:30Z", "until": "10:45Z"

From this one could probably infer the number of vehicles the operator has in service at this station, or at least get an approximation of it. So the intent to hide that information is undermined, at least to some extent.

For right now I'm leaving out the question if the overlapping time windows are usable for trip planning systems or end user apps. But I would like to question if the future availability information needs to be open data in the sense of openly accessible, exploitable, editable and shared by anyone for any purpose. I don't think that this should be a requirement.

If a producer doesn't wish to publish it as open data - which I think is perfectly understandable - it can be shared only with trusted partners, for example with a multimodal booking app. If booking vehicles via this app is possible, then a contractual agreement will probably be necessary anyhow. And on this basis the sharing of the future availability information could be regulated too.

mplsmitch commented 4 months ago

But I would like to question if the future availability information needs to be open data in the sense of openly accessible, exploitable, editable and shared by anyone for any purpose. I don't think that this should be a requirement.

Public data has always been a guiding principle of the specification. If access to the data requires a contractual agreement, then by definition it's not public. Were this not a foundational part of the project, we would have included bookings/reservations/payments/persistent IDs etc. My fear is that if we introduce a new file that's not public (requires auth), then the providers will simply require auth for all the files in the feed. If there's a way that we can serve this use case out in the open, I'm all for it. If it requires restricting access to the data then I'm not in favor.

Doesn't TOMP already do this?

edwinvandenbelt commented 4 months ago

Hello Mitch,

Indeed, TOMP has something for this, but we're investigating how to reduce the overlap with GBFS so that the two work together. If we can hand this part over to GBFS, 'separation of concerns' is applied: GBFS for data, e.g. published for planning, and TOMP takes over once planning/booking is in scope.

How about standardising the future availability within GBFS, and not specifying that this is a non-public file? To me they look quite similar: the current and the future availability. Why is the current availability public and why should the future availability be non-public? If an implementing party doesn't want to publish it for the general public, so be it. That's also the current situation with the available vehicles dataset. Some parties don't want to expose it publicly.

Regards, Edwin

From: Mitch Vars @.> Sent: Thursday, 20 June 2024 15:05 To: MobilityData/gbfs @.> CC: Edwin van den Belt @.>; Author @.> Subject: Re: [MobilityData/gbfs] Future availability of all vehicles in the system (Issue #616)

richfab commented 4 months ago

Hello everyone,

@testower Regarding representing this information in the existing file vehicle_types.json or in a separate file (eg: vehicle_types_availability.json), I don't have a strong preference.

@matt-wirtz Excellent point about overlapping time intervals (a given vehicle must be available for the entire duration of the reservation). I modified my example to show overlapping intervals.

one could probably infer back to the number of vehicles the operator has in service at this station. At least get an approximation for it

I think that the approximation of the total number of vehicles per type in the fleet from time interval overlaps is not precise enough to be usable by the competitors.

Indeed, in the example with the 3 time intervals, one can only deduce that the station has a minimum of 2 vehicles, but it can have more.

With the same time intervals as in your example...

"from": "T08:15Z", "until": "09:15Z"

"from": "T08:45Z", "until": "10:00Z"

"from": "T10:30Z", "until": "10:45Z"

... the station could have 2+n vehicles:

vehicle	08:00	08:15	08:30	08:45	09:00	09:15	09:30	09:45	10:00	10:15	10:30
v1	X	X	X	0	0	0	0	0	X	X	X
v2	X	0	0	0	0	X	X	X	X	X	0
v3	X	X	0	0	X	X	X	X	X	X	X
v4	X	X	X	X	X	X	X	X	X	X	X
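
A sketch of what a third party could deduce from the published intervals alone: the largest number of windows that overlap at the same moment is a lower bound on the number of vehicles, and nothing more. The function name `min_vehicles` is hypothetical, and times are zero-padded HH:MM strings.

```python
def min_vehicles(windows):
    """Lower bound on fleet size: the maximum number of windows open at once."""
    events = []
    for start, end in windows:
        events.append((start, 1))   # a window opens
        events.append((end, -1))    # a window closes
    # Tuples sort by time first; at equal times the close (-1) comes before
    # the open (+1), so windows that merely touch are not counted as overlapping.
    events.sort()
    depth = best = 0
    for _, delta in events:
        depth += delta
        best = max(best, depth)
    return best


published = [("08:15", "09:15"), ("08:45", "10:00"), ("10:30", "10:45")]
lower_bound = min_vehicles(published)
print(lower_bound)
```

For the three intervals above the bound is 2, which matches both the two-vehicle and the four-vehicle table.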

Thoughts?

matt-wirtz commented 4 months ago

@richfab To think about it more thoroughly I think it's necessary to understand how the availability_windows are calculated. Let me try it:

for each vehicle type and station:
- for each vehicle calculate all time windows in which this vehicle is available
- remove all time windows which are completely covered by at least one other time window (completely covered means: start time is greater or equal and end time is smaller or equal compared to the encompassing time window)
- the remaining time windows are the availability_windows
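
The steps above can be sketched as follows for one vehicle type at one station. This is only an illustration: the function name `availability_windows` and the input layout are hypothetical, the per-vehicle windows (the first step) are taken as given, and identical windows from different vehicles are collapsed first, an added assumption so that two equal windows do not remove each other.

```python
def availability_windows(per_vehicle):
    """per_vehicle maps a vehicle id to its list of (start, end) windows."""
    # Collapse identical windows, then sort for a stable output order.
    candidates = sorted({w for windows in per_vehicle.values() for w in windows})
    kept = []
    for w in candidates:
        # Completely covered: another window starts no later and ends no earlier.
        covered = any(
            other != w and other[0] <= w[0] and w[1] <= other[1]
            for other in candidates
        )
        if not covered:
            kept.append(w)
    return kept


# The two-vehicle and four-vehicle tables from earlier in the thread,
# with times as zero-padded HH:MM strings.
fleet_of_two = {
    "v1": [("08:45", "10:00")],
    "v2": [("08:15", "09:15"), ("10:30", "10:45")],
}
fleet_of_four = dict(fleet_of_two, v3=[("08:30", "09:00")], v4=[])

windows_two = availability_windows(fleet_of_two)
windows_four = availability_windows(fleet_of_four)
print(windows_two == windows_four, windows_two)
```

Both fleets produce the same three windows, because v3 is covered by v2 and v4 contributes nothing.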

Maybe there is a simpler description of how to do it but let's first check if this exactly describes your idea.

richfab commented 4 months ago

Thanks for clarifying @matt-wirtz.

If I'm not mistaken, the table of stations with 4 vehicles matches your description of availability_windows:

Availability of v3 is fully covered by availability of v2
v4 is never available so it is invisible in the set of time intervals

My intention is to show that the approximation of the total number of vehicles per type at a given station is not precise enough to be usable by the competitors (from the same set of time intervals, the station could have 2, 3, 4, n vehicles). So this aggregated data can be opened without causing problems for operators, in my opinion.

Please let me know if I misunderstood anything. Thanks!

tobsesHub commented 4 months ago

One question. Why is there no endpoint like all_vehicles.json, which would include general information like equipment and pricing plan id, of course no coordinates?

So it could be something like the station_information file, which only contains static information and includes the vehicles that are available.

Such a file could also prevent duplicates.

Maybe I'm missing something. I want to understand it.

richfab commented 4 months ago

Hi @tobsesHub,

Great question!

The information such as equipment and pricing plan about available vehicles, can be found in the vehicle_status.json file (formerly free_bike_status.json). See reference.

However, the information about unavailable vehicles is not shown because it is of little interest to travelers.

Feel free to share use cases where information about unavailable vehicles would be useful to travelers.

Thank you!

edwinvandenbelt commented 4 months ago

Sorry to intervene, Fabien.

This is exactly the reason why we raised this issue. We want to publish information about vehicles that might be rented out right now but will be available in the (near) future. We don't need their exact state, but if I, as a traveller, want to hire a cargo bike this afternoon, or a bike to visit my grandma, I currently just have to trust that something will be available. This is especially relevant for sparsely available vehicles (like cargo bikes, or bikes located at a bungalow park). So just the 'static info' about the bike should be sufficient, including a reference we can use in a booking process.

Regards, Edwin

From: Fabien Richard-Allouard @.> Sent: Wednesday, 10 July 2024 11:24 To: MobilityData/gbfs @.> CC: Edwin van den Belt @.>; Author @.> Subject: Re: [MobilityData/gbfs] Future availability of all vehicles in the system (Issue #616)

richfab commented 4 months ago

Hi all,

@edwinvandenbelt Good news, the proposal under discussion addresses the need to know if at least one vehicle of a certain type is available in the future.

Please find below an iteration on @matt-wirtz's proposal which should address the initial need, the availability windows constraints and open data concerns.

Example (new optional file: vehicle_type_availability.json):

{
  "last_updated": "2023-07-17T13:34:13+02:00",
  "ttl": 0,
  "version": "3.x-RC",
  "data": {
    "vehicle_types_availability": [
      {
        "vehicle_type_id": "abc123",
        "availability": [
          { 
            "station_id": "market_place_A12", //dock based systems
            "availability_windows": [ //intervals can overlap
              {"from": "2024-12-24T08:15Z", "until": "2024-12-24T09:15Z" },
              {"from": "2024-12-24T08:45Z", "until": "2024-12-24T10:00Z" }
            ]
          }
        ]
      },
      {
        "vehicle_type_id": "def456",
        "availability": [
          { 
            "area": { //dockless systems
              "type": "MultiPolygon",
              "coordinates": [
                [
                  [
                    [
                      2.34266,
                      48.85184
                    ],
                    [
                      2.34266,
                      48.84635
                    ],
                    [
                      2.34991,
                      48.84635
                    ],
                    [
                      2.34991,
                      48.85184
                    ],
                    [
                      2.34266,
                      48.85184
                    ]
                  ]
                ]
              ]
            },
            "availability_windows": [ //intervals can overlap
              {"from": "2024-12-24T08:15Z", "until": "2024-12-24T09:15Z" },
              {"from": "2024-12-24T08:45Z", "until": "2024-12-24T10:00Z" }
            ]
          }
        ]
      }
    ]
  }
}

We could consider using the same file to represent availability forecasts (https://github.com/MobilityData/gbfs/issues/612).

Looking forward to hearing your feedback!

tobsesHub commented 4 months ago

@richfab I like that solution. Yes, we could probably add an optional probability field to unify this problem and the https://github.com/MobilityData/gbfs/issues/612 issue.

matt-wirtz commented 3 months ago

Looking at the availability_windows approach from the perspective of a trip planning system the data structure could be used for most scenarios. For use cases e.g. where the user is interested in renting more than one vehicle, e.g. when looking for two bikes, it's not working.
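
A sketch of the two-vehicle case, reusing the four-vehicle table from earlier in the thread. The helper name `count_covering` is hypothetical, and times are zero-padded HH:MM strings. It compares what the aggregated windows reveal with what the per-vehicle data would show for a request from 08:30 to 09:00.

```python
def count_covering(windows, start, end):
    """Number of windows that contain the whole requested period."""
    return sum(1 for s, e in windows if s <= start and end <= e)


# Aggregated availability_windows as they would be published.
published = [("08:15", "09:15"), ("08:45", "10:00"), ("10:30", "10:45")]

# Per-vehicle availability behind them (four-vehicle table).
fleet = {
    "v1": [("08:45", "10:00")],
    "v2": [("08:15", "09:15"), ("10:30", "10:45")],
    "v3": [("08:30", "09:00")],
    "v4": [],
}

# From the published windows a consumer can confirm only one vehicle ...
visible = count_covering(published, "08:30", "09:00")
# ... although two vehicles (v2 and v3) are actually free for that period.
actual = sum(
    1 for windows in fleet.values()
    if count_covering(windows, "08:30", "09:00") > 0
)
print(visible, actual)
```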

Considering the computational effort of recalculating the availability_windows every couple of minutes, one would rather avoid this. The same goes for the push-based model suggested in https://github.com/MobilityData/gbfs/issues/630: if a booking is modified, the modification can't just be propagated. First it has to be determined whether the availability_windows are affected by this modification at all. That is of course solvable, but to what benefit?

So I would rather opt for an optional dedicated vehicle_availability endpoint.

matt-wirtz commented 3 months ago

Hi @richfab. You mentioned earlier in this issue that "you anticipate that many of them [sharing operators] will not wish to publish the total number of vehicles per type they have in their fleet in open data due to competition".

Have you already had the chance to get feedback from operators on whether the availability_windows approach would dispel the concerns you anticipate with the original approach?

richfab commented 2 months ago

Hi @matt-wirtz, Thank you for following up on this. I have not received any feedback from operators on whether the aggregated data approach would be better than the original approach or not. One way to find out would be to ask carshare operators, for example via a survey. Please note that MobilityData does not have the capacity to conduct such a survey before Q4 2024.

mobilitydataio commented 4 weeks ago

This discussion has been automatically marked as stale because it has not had recent activity. It will be closed in 30 days if no further activity occurs. Thank you for your contributions.

Lefois commented 1 week ago

Keep open

MobilityData / gbfs