opentripplanner / OpenTripPlanner

An open source multi-modal trip planner
http://www.opentripplanner.org
Other
2.23k stars 1.04k forks source link

GBFS: java.lang.NullPointerException while running polling updater - "org.entur.gbfs.v2_3.system_information.GBFSData.getTimezone()" is null #6181

Open Oxalin opened 1 month ago

Oxalin commented 1 month ago

Observed behavior

The updater is unable to retrieve the proper timezone. getTimezone() returns null and an error is encountered when polling the updater:

opentripplanner | 22:02:52.321 ERROR [updater-3] (PollingGraphUpdater.java:68) Error while running polling updater VehicleRentalUpdater{source: GbfsVehicleRentalDataSource{url: 'https://gbfs.velobixi.com/gbfs/gbfs.json'}} opentripplanner | java.lang.NullPointerException: Cannot invoke "org.entur.gbfs.v2_3.system_information.GBFSData$Timezone.value()" because the return value of "org.entur.gbfs.v2_3.system_information.GBFSData.getTimezone()" is null opentripplanner | at org.opentripplanner.updater.vehicle_rental.datasources.GbfsSystemInformationMapper.mapSystemInformation(GbfsSystemInformationMapper.java:45) opentripplanner | at org.opentripplanner.updater.vehicle_rental.datasources.GbfsVehicleRentalDataSource.getUpdates(GbfsVehicleRentalDataSource.java:63) opentripplanner | at org.opentripplanner.updater.vehicle_rental.VehicleRentalUpdater.runPolling(VehicleRentalUpdater.java:117) opentripplanner | at org.opentripplanner.updater.spi.PollingGraphUpdater.run(PollingGraphUpdater.java:52) opentripplanner | at org.opentripplanner.updater.GraphUpdaterManager.lambda$startUpdaters$0(GraphUpdaterManager.java:105) opentripplanner | at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) opentripplanner | at java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source) opentripplanner | at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) opentripplanner | at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) opentripplanner | at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) opentripplanner | at java.base/java.lang.Thread.run(Unknown Source)

This issue is related to some discussions that took place in issue 6066.

Expected behavior

The updater should properly detect the timezone and return the proper value for "America/Montreal".

Version of OTP used (exact commit hash or JAR name)

2.4.0

Data sets in use (links to GTFS and OSM PBF files)

The following feeds are being used and they all generate the same error: Bixi Montréal: https://gbfs.velobixi.com/gbfs/gbfs.json (validated, from which the error above is taken) àVélo Québec : https://quebec.publicbikesystem.net/customer/gbfs/v2/gbfs.json (errors detected unrelated to timezone)

The two of them have "America/Montreal" as their timezone. "America/Montreal" is linked to "America/Toronto" according to this Wiki page

The Bixi Montréal feed is properly validated using the MobilityData's GBFS validator with no complains about the timezone, while the two other have errors but that are unrelated to the timezone. Thus, OTP should properly resolve the timezone.

Router config and graph build config JSON

{ "routingDefaults" : { "maxDirectStreetDuration" : "PT12H", "maxDirectStreetDurationForMode": { "CAR": "PT24H" }, "numItineraries" : 5 }, "updaters" : [ { "type" : "vehicle-rental", "sourceType" : "gbfs", "frequency" : "1m", "url" : "https://gbfs.velobixi.com/gbfs/gbfs.json" }, { "type" : "vehicle-rental", "sourceType" : "gbfs", "frequency" : "1m", "url" : "https://quebec.publicbikesystem.net/customer/gbfs/v2/gbfs.json" }, { "type" : "vehicle-rental", "sourceType" : "gbfs", "frequency" : "1m", "url" : "https://saguenay.publicbikesystem.net/customer/gbfs/v2/gbfs.json" } ] }

Steps to reproduce the problem

Add the GBFS feeds in the updater and launch OTP.

Oxalin commented 1 month ago

According to this link (https://docs.oracle.com/cd/E72987_01/wcs/tag-ref/MISC/TimeZones.html), "America/Montreal" is a supported timezone in Java.

Also, I'd like to point out that the IANA.org propose a backward file for link cases where it is stated: # Although this file is optional and tzdb will work if you omit it by # building with 'make BACKWARD=', in practice downstream users # typically use this file for backward compatibility.

Thus, while backward timezone is optional, it is usually supported and one could be expecting OTP to proprely handle this situation.

testower commented 1 month ago

GBFS in OTP is parsed into https://github.com/MobilityData/gbfs-json-schema/tree/master/models/java, which is generated by the JSON schema in the same repository. Only canonical timezones are listed in the schema, not backward/links.

testower commented 1 month ago

I see I failed to adress the topic of the validator: The validator doesn't complain about timezones in https://gbfs.velobixi.com/gbfs/gbfs.json because the validator assume they are v1.0 (because no version field is present in the feeds). The v1.0 json schema simply validates the timezone field as a string.

leonardehrenfried commented 1 month ago

I should add that the main reason that links/aliases are not allowed by the validator and OTP is that the Java runtime doesn't support them either. We would have to write our own library for that.

Btw, your third example uses America/Toronto: https://saguenay.publicbikesystem.net/customer/gbfs/v2/en/system_information

@testower While the validator is very clear, the gbfs spec doesn't say anything about links/aliases. Should we change this? This is the second issue we had a out this in one month.

testower commented 1 month ago

@leonardehrenfried As far as I can tell the links/aliases are not in the IANA database files, so it's already in the spec (albeit implicitly). I can't even find any mention of the concept of a link or an alias there, although several sources seem to suggest they should contain it, e.g. https://www.w3.org/TR/timezone/#tzids

Oxalin commented 1 month ago

I think that we can't expect every GBFS providers to know or properly set only canonical timezones. For historic reasons that may not be so far in time, some canonical timezones have been replaced but may still be commonly used. America/Montreal is in such a case. In the same way, other changes could happen in the future leading timezones from being canonical to link/aliases.

@testower : I'm not sure I'm following you when you say IANA database files don't provide linked timezones. They are listed under tzdb/backward and are suggested to be included when building the tzdb. That being said, I understand that the validator doesn't look at the actual value for V1.0. It does flag it as wrong with àVélo Québec as expected.

Oxalin commented 1 month ago

@leonardehrenfried when you say that the Java runtime doesn't support them (aliases/links) either, where can I consult the list? I don't know Java very well, but my understanding is that the DB changes from release to release (https://www.oracle.com/java/technologies/tzdata-versions.html).

If the validator (not in Java) and the Java runtime use the same TZ DB, some timezones seem to be links/aliases (CST6CDT for example, found on the validator's side).

leonardehrenfried commented 1 month ago

We have the validator so that you can find out if your feed is valid or not, including time zones. However, the feeds you list don't say which version they are. If you tell the validator that they are version 2 (which is assume they are) then you will see that the time zone, along with many other properties are invalid.

leonardehrenfried commented 1 month ago

I don't have the complete list that Java knows at hand, but I believe it's all those which are marked as canonical from https://en.m.wikipedia.org/wiki/List_of_tz_database_time_zones

testower commented 1 month ago

Just to avoid any confusion, reiterating some concepts here:

The GBFS specification states the following:

Timezone - TZ timezone from the https://www.iana.org/time-zones. Timezone names never contain the space character but MAY contain an underscore. Refer to https://en.wikipedia.org/wiki/List_of_tz_zones for a list of valid values. Example: Asia/Tokyo, America/Los_Angeles or Africa/Cairo.

I think this text is not unambiguous as to whether only canonical names are included.

The official GBFS JSON schema defines timezone in v1.0 and v1.1 simply as a string, whereas starting from v2.0 it additionally restricts the value to the list of canonical timezone names from the IANA timezone database.

The GBFS validator uses the JSON schema to validate GBFS feeds. It autodetects the version by looking at the version field. If the version field is not present, the validator validates using v1.0, since the this version did not have a version field.

OTP deserialises GBFS using a Java model generated from the v2.3 JSON schema (https://github.com/MobilityData/gbfs-json-schema/tree/master/v2.3), regardless of the actual version of the GBFS feed being deserialised (v3.0 not yet supported).

The result is that when attempting to use GBFS feeds using non-canonical timezone names in OTP, it won't be parsed even if the validator claims it to be valid v1.0.

Note that the Java runtime doesn't really have anything to do with this yet, since it is still just deserialised into a (enum) string, not to any Java Timezone class.

IMHO, the proper way forward here is to get a clarification on the GBFS spec, whether links are allowed. Raise an issue here https://github.com/MobilityData/gbfs/issues. If it turns out to be the case, then the JSON schemas should be updated. The Java model will then in turn reflect this update and OTP will start accepting it. Whether OTP will break if non-canonical names are accepted, I have not investigated, but @leonardehrenfried might have?

Oxalin commented 1 month ago

As suggested https://github.com/MobilityData/gbfs/issues/697

Oxalin commented 1 month ago

Btw, your third example uses America/Toronto: https://saguenay.publicbikesystem.net/customer/gbfs/v2/en/system_information

I updated the description to prevent any more confusion, the timezone value of Accès Vélo Saguenay was indeed correct.

Oxalin commented 1 month ago

@testower About Bixi Montréal's feed, it uses version 1.0. Is v1.0 and v1.1 versions supported by OTP? If so, shouldn't this feed be supported no matter its timezone value since version 1.x was not enforcing the canonical timezone values?

testower commented 1 month ago

@Oxalin v1.0 and v1.1 are now deprecated. The OTP updater makes not claim to support these, but it works incidentally because they are "forward compatible" with version 2 in a certain sense. OTP uses the v2.3 generated model to deserialize. This means though that restrictions in v2 are applied - which is what you see here. IMHO we should prioritize v3 support, not backport fixes for v1.