noi-techpark / sta-nap-export

NeTEx and SIRI export for STA (MaaS4Italy)

EPIC: As MaaS4Italy I would like to receive from the Open Data Hub a standard interface for planned and real-time mobility data (NeTEx and SIRI) #1

Open rcavaliere opened 4 months ago

rcavaliere commented 4 months ago

In the scope of the new MaaS4Italy project, the Open Data Hub is requested to provide a standard interface for its mobility data, so that it can be integrated into a new MaaS architecture to be developed at national level. The following specification details the mapping between the available Open Data Hub data and the requested fields.

240701_PianificazioneSviluppi.docx

The services to be considered are:

Below are the national specifications to be considered:

230711_Linee-guida-compilazione-NeTEx-IT-v.3.0.pdf

240502_SpecificaSIRI_v.1.0.3.pdf

DSRM-Architettura Target_Dati dinamici e sharing v1.2_signed.pdf (relevant: paragraph 2.4)

NeTEx exports can be validated against an XSD before being published. The XSD is available here: https://github.com/5Tsrl/netex-italian-profile

Update 14.05.2024: STA has also requested to include bike parking services in the exports, since they are bookable and interesting for MaaS providers. I have amended the internal specification document, in track-changes mode, with indications on how to include this additional data. Please also note that there is a new version of the SIRI national standard; some fields for parking have been removed.

clezag commented 3 months ago

After discussing this in person, some more details

ohnewein commented 3 months ago

The deadline for the first PoC is set to end of March 2024

clezag commented 3 months ago

@rcavaliere First PoC is up for static parking data:

https://sta-netex.opendatahub.testingmachine.eu/netex/parking

I've changed the ID a bit and added IT:OpenDataHub (it has to be globally unique). Aside from that, it should be faithful to the document.

Booleans currently default to false if the Open Data Hub has no value. Maybe we should implement ternary logic (e.g. leave the element empty in NeTEx where we don't have data).
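
A minimal sketch of the ternary idea, assuming the exporter builds XML with Python's standard library. The helper name and the element names are illustrative only and have not been checked against the Italian NeTEx profile; the point is just that an unknown value produces no element at all instead of "false".

```python
import xml.etree.ElementTree as ET
from typing import Optional


def add_optional_bool(parent: ET.Element, tag: str, value: Optional[bool]) -> None:
    """Append <tag>true|false</tag> only when the source actually has a value."""
    if value is None:
        # Unknown in the Open Data Hub: emit nothing rather than a misleading "false".
        return
    child = ET.SubElement(parent, tag)
    child.text = "true" if value else "false"


# Element names below are placeholders for the example.
parking = ET.Element("Parking")
add_optional_bool(parking, "RechargingAvailable", True)    # known: yes
add_optional_bool(parking, "Secure", False)                # known: no
add_optional_bool(parking, "FreeParkingOutOfHours", None)  # unknown: omitted
xml_text = ET.tostring(parking, encoding="unicode")
print(xml_text)
```

Whether an omitted element is acceptable for a given field still depends on the XSD, since some elements may be mandatory.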

XSD validation is also not implemented yet.

rcavaliere commented 3 months ago

@clezag that looks good, great work. The ID is not OK, though, since we also need to put the ISTAT code of the region as a prefix. We could use OpenDataHub (in my view not the proper solution, since the IDs should be related to the data, not to the platform), so something like it:ITH10:xxxx. I am sending you a technical specification for the IDs in public transport, which is under implementation by STA: 230116_NeTExProfileSouthTyrol_GlobalID.pdf

clezag commented 3 months ago

I've updated the ID to the format

IT:ITH10:Parking:<sanitized scode>

where the sanitized scode is the scode with every character that is not a letter, digit, hyphen or underscore replaced with an underscore.

e.g. the station with scode=me: Karl-Wolf02 becomes IT:ITH10:Parking:me__Karl-Wolf02
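
The rule above can be sketched as follows. This is only an illustration: the function name is made up, and it assumes that "letters" means ASCII letters, which the thread does not state.

```python
import re

ID_PREFIX = "IT:ITH10:Parking:"
# Anything that is not an ASCII letter, digit, hyphen or underscore (assumption).
_NOT_ALLOWED = re.compile(r"[^A-Za-z0-9_-]")


def parking_id(scode: str) -> str:
    """Build the global parking ID from an Open Data Hub scode."""
    return ID_PREFIX + _NOT_ALLOWED.sub("_", scode)


print(parking_id("me: Karl-Wolf02"))  # IT:ITH10:Parking:me__Karl-Wolf02
```

Note that the replacement is not reversible: two scodes that differ only in replaced characters (for example "a b" and "a:b") would map to the same ID.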

This is how I understand the NeTEx specification (4.2.1), and it also matches their examples.

The document from STA is not consistent with the NeTEx spec:

Origin is not necessary, since our scode is unique per station type. Each parking station is guaranteed to have a unique scode, and this way we avoid chaos when we deprecate/change an origin in the future.

rcavaliere commented 3 months ago

@clezag this is fine for me! Thanks for the input for the STA IDs

clezag commented 3 months ago

A PoC for sharing services (currently bike sharing BZ only) is now available: https://sta-netex.opendatahub.testingmachine.eu/netex/sharing

A PoC for SIRI-FM is available here: https://sta-netex.opendatahub.testingmachine.eu/siri/fm/115 The last part of the URL is the Open Data Hub scode; currently this works for parking stations only.

I think now would be a good time to talk both internally and with STA to define the next steps.

On our side, we mainly need to figure out how to integrate all the missing data (e.g. bike/car models, Operator address, zone polygons etc.).

With STA I'd like to finalize the workflow, format and interfaces of the API, and possibly the validation, as soon as possible.

rcavaliere commented 3 months ago

@clezag wonderful work! I will analyze your work in more depth starting next week. I agree with you, we can start sharing this work with STA and define the next steps with them. I will contact them and put you in CC!

rcavaliere commented 2 months ago

Tasks to be completed:

rcavaliere commented 1 month ago

@clezag I have started to insert the metadata. Parking areas and operators are completely filled.

rcavaliere commented 1 month ago

@clezag a remark: I would suggest having three different CompositeFrames in the end:

rcavaliere commented 1 month ago

@clezag an update on the SIRI interfaces. I have received informal information that we will need to provide these interfaces using the "Lite" approach foreseen in the standard, which is much simpler. In the end it simply means providing end-points that a 3rd party can access with plain HTTP requests (see for example https://developer.entur.org/pages-real-time-api ). So this is perfectly in line with what you implemented. Let's wait for the formal specification to complete this part. They will probably ask us to implement some filtering options in the API call.

rcavaliere commented 1 month ago

@clezag I have modified the user story description. The request is to add bike parking data (static -> NeTEx and dynamic -> SIRI) to the interfaces. Please note that we also have a new SIRI profile; some fields have been removed. In case of questions let me know!

240514_PianificazioneSviluppi.docx 240502_SpecificaSIRI_v.1.0.3.pdf

rcavaliere commented 1 month ago

@clezag I have completed the insertion of metadata, including for bike parking. Let me know if you have any doubts. I would suggest including this data as metadata for the reference station types, and then including it in the mapping for the NeTEx export. Let me know when you are done, so that I can check whether all fields are available in the NeTEx exports with the proper values.

rcavaliere commented 1 month ago

@clezag I have also had a look at the current exports. In general, they look very good! There is something I would change.

clezag commented 1 month ago

Just writing down what we discussed in person in addition to the points above:

Metadata

Bike Parking

rcavaliere commented 1 month ago

> @clezag I have modified the user story description. The request is to add bike parking data (static -> NeTEx and dynamic -> SIRI) to the interfaces. Please note that we also have a new SIRI profile; some fields have been removed. In case of questions let me know!
>
> 240514_PianificazioneSviluppi.docx 240502_SpecificaSIRI_v.1.0.3.pdf

@clezag I have seen that the integration modalities for car sharing data in the SIRI interface were missing from the document. I have updated the document.

240521_PianificazioneSviluppi.docx

rcavaliere commented 1 month ago

@clezag an important update in relation to the implementation of the SIRI interfaces. We have received the attached documentation, which is also available in the description of this epic. The relevant part is at the beginning of paragraph 2.4: we have to provide two end-points, one for parking and one for sharing services. You can see the names of the end-points we have to implement and what kind of input parameters we have to support. Let me know in case of issues!

DSRM-Architettura Target_Dati dinamici e sharing v1.2_signed.pdf

clezag commented 3 weeks ago

@rcavaliere Just some progress updates:

static parking is now fully implemented:

In general I think it's best to limit the dataset to what we know works, and only add new providers/origins explicitly. To that end I have added a static config file in the repo where you have to explicitly add new origins in the future. I think this will be better than having the export fail every few months due to new data and someone having to go in there and fix it.
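
A small sketch of the allowlist approach described above. The structure of the config and the origin names are hypothetical (the actual config file in the repo is not shown in this thread); the idea is only that data from an origin not listed explicitly is skipped instead of breaking the export.

```python
from typing import Dict, List, Set

# Hypothetical allowlist. In the exporter this would be loaded from the static
# config file in the repo; the origin names here are placeholders.
ALLOWED_ORIGINS: Dict[str, Set[str]] = {
    "parking": {"origin-a", "origin-b"},
    "sharing": {"origin-c"},
}


def select_stations(stations: List[dict], service: str) -> List[dict]:
    """Keep only stations whose origin was explicitly enabled for this service."""
    allowed = ALLOWED_ORIGINS.get(service, set())
    return [s for s in stations if s.get("origin") in allowed]


stations = [
    {"scode": "1", "origin": "origin-a"},
    {"scode": "2", "origin": "brand-new-provider"},  # skipped, not a failure
]
exported = select_stations(stations, "parking")
print([s["scode"] for s in exported])
```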

rcavaliere commented 3 weeks ago

@clezag agree. I have just checked, I inserted all metadata here: https://noibz-my.sharepoint.com/:x:/r/personal/c_zagler_noi_bz_it/_layouts/15/Doc.aspx?sourcedoc=%7B0E6FBB54-B477-4D50-B9A4-1E133893F3B5%7D&file=NeTEx%20missing%20data.xlsx&fromShare=true&action=default&mobileredirect=true

You are checking this file, aren't you? Let me know if there is something missing.

Do you have some new NeTEx exports that I can check?

Thanks for your great work here

clezag commented 3 weeks ago

@rcavaliere Yes I've used this file.

On the static parking export, I've implemented all the missing points, so from my side I consider it complete. Only skidata is missing, which IMO is a "wontfix" until the data collector works again or, better, is in production. You can take a look here: https://sta-netex.opendatahub.testingmachine.eu/netex/parking

Sharing is here: https://sta-netex.opendatahub.testingmachine.eu/netex/sharing Bike sharing Bolzano, which is the only one implemented, should now be complete. I've implemented the missing metadata (bicycles, bike models, operators, service constraint polygon).

Still missing are the other bike sharing providers, the car sharing, and then the new SIRI developments.

rcavaliere commented 3 weeks ago

@clezag let's try to close this first implementation iteration.

I have looked at the NeTEx parking export; in my opinion it's nearly done. I would change just one thing: as names for the operators I would use not their origin, but a "standard name". I have made a proposal in the shared Excel file, see here in red:

Screenshot from 2024-06-10 15-04-33

Can you match the corresponding fields with these values?

clezag commented 3 weeks ago

@rcavaliere I've added the mapping, but I'm not completely sure I understood: should I also use the operator name as ID/key? E.g. should "skidata" and "bicincitta" refer to the same single operator with ID = STA? Or should I maintain two operators that happen to have the same name and address?

rcavaliere commented 3 weeks ago

@clezag thanks! I would also change the IDs accordingly, so that we have consistency in the data. Yes, skidata and bicincittà have to refer to the same operator, STA.
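
The agreed mapping could look roughly like this. It is a sketch under assumptions: the `Operator` class and the operator ID format are invented for the example, and only the two origins named in this thread are mapped.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Operator:
    id: str
    name: str


# The ID format is a placeholder; the real one follows the exporter's conventions.
STA = Operator(id="IT:ITH10:Operator:STA", name="STA")

# Several origins may point at the same operator object.
OPERATOR_BY_ORIGIN: Dict[str, Operator] = {
    "skidata": STA,
    "bicincitta": STA,
}


def operators_for(origins: Iterable[str]) -> List[Operator]:
    """Return the distinct operators for the given origins, in first-seen order."""
    seen: Dict[str, Operator] = {}
    for origin in origins:
        op = OPERATOR_BY_ORIGIN[origin]  # KeyError on an unmapped origin is intentional
        seen.setdefault(op.id, op)
    return list(seen.values())


print(operators_for(["skidata", "bicincitta"]))
```

With this shape the export contains a single operator entry for STA, and every station from either origin references the same ID.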

rcavaliere commented 3 weeks ago

@clezag I have also tested the NeTEx sharing export, bike sharing Bolzano looks fine to me. Please do the same for the remaining services! Can you manage these in the next few days, so that we can finish this first NeTEx implementation? As you have probably seen, I have also added the e-charging providers (modeled as parking); I would add them too!

clezag commented 3 weeks ago

@rcavaliere I've changed the IDs (with some special stuff because they have differing URLs per service)

Bike sharing Merano and Papin are now also online and complete.

AFAIK for static export now only car sharing and the bike parking stations are missing, right? In that case, I'm confident in wrapping it up this week.

clezag commented 3 weeks ago

@rcavaliere Carsharing is now up

clezag commented 3 weeks ago

@rcavaliere I misspoke, bikeparking is already implemented. I meant the e-charging providers are the only ones missing now. I did not find anything in the specs document, do you have any documentation?

rcavaliere commented 3 weeks ago

@clezag please check the updated version of the specifications, paragraph 2.1.4

240603_PianificazioneSviluppi.docx

rcavaliere commented 2 weeks ago

> @rcavaliere Carsharing is now up

@clezag very good! However, I have seen that you have put everything in the same CompositeFrame. I would suggest having a separate CompositeFrame for car sharing. So, one XML with two CompositeFrames, one for bike sharing services and one for the car sharing service. Please rename the IDs of the CompositeFrame and MobilityServiceFrame accordingly. I will then check the specific fields foreseen, but if they are similar to those for bike sharing I don't expect particular issues here.
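
As an illustration of the requested layout, here is a simplified sketch that builds one document with two CompositeFrames, each wrapping its own MobilityServiceFrame. Namespaces, versions and all frame content are left out, and the ID patterns are placeholders, not the IDs used by the exporter.

```python
import xml.etree.ElementTree as ET


def add_composite_frame(parent: ET.Element, service: str) -> ET.Element:
    """Add a CompositeFrame holding one MobilityServiceFrame for the given service."""
    composite = ET.SubElement(
        parent, "CompositeFrame",
        {"id": f"IT:ITH10:CompositeFrame:{service}", "version": "1"},  # placeholder ID
    )
    frames = ET.SubElement(composite, "frames")
    ET.SubElement(
        frames, "MobilityServiceFrame",
        {"id": f"IT:ITH10:MobilityServiceFrame:{service}", "version": "1"},  # placeholder ID
    )
    return composite


delivery = ET.Element("PublicationDelivery")
data_objects = ET.SubElement(delivery, "dataObjects")
for service in ("BikeSharing", "CarSharing"):
    add_composite_frame(data_objects, service)

ET.indent(delivery)  # pretty-printing, requires Python 3.9+
print(ET.tostring(delivery, encoding="unicode"))
```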

clezag commented 2 weeks ago

@rcavaliere I've implemented the echarging stations as parkings, but I'm currently using a dummy operator. Do you have the operator information for the echarging providers?

origin = 
    - "1ucCQzAVGmvyRpeq-lIPffALQaWcG4LfPakc2mjt79fY" (This is the static echarging spreadsheet)
    - ALPERIA
    - DRIWE
    - IIT
    - route220

I've also added a rectangular geographic bounding box around the province of BZ when getting the echarging stations. Should I extend this to also include Trentino?

I would also (as I have done with the other services) specifically list the origins to consider, so as to avoid issues when new data gets added to the Open Data Hub.

rcavaliere commented 2 weeks ago

@clezag I have added the operators' information in the spreadsheet. I would suggest including only Alperia, DRIWE and route220 here (they have already said YES to the sharing of their data in this format). I would not put any geographical boundary; let's see if we get any particular observations on this...

clezag commented 2 weeks ago

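@rcavaliere

* Implemented separate composite frames (might have to select "show page source" in your browser, as it's not valid XML anymore without the larger Netex context)

* modified filters and operators for the echarging parking providers

I think that's all the open points for static Netex, looking forward to your feedback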

clezag commented 2 weeks ago

> @clezag an important update in relation to the implementation of the SIRI interfaces. We have received the attached documentation, which is also available in the description of this epic. The relevant part is at the beginning of paragraph 2.4: we have to provide two end-points, one for parking and one for sharing services. You can see the names of the end-points we have to implement and what kind of input parameters we have to support. Let me know in case of issues!
>
> DSRM-Architettura Target_Dati dinamici e sharing v1.2_signed.pdf

@rcavaliere I've started researching the SIRI implementation and have some doubts:

Is there some type of (even unofficial) documentation for the SIRI-LITE JSON format?
I've tried to look it up (also in the sources you linked), but the best I could find are some examples, and it's not clear which parts are required by the spec and which are additional features of the implementation.

E.g. one service offers the possibility to pass a "last known timestamp" and only get the changes from that moment on. Supporting this would require substantial implementation work.

The architecture document just says "Adhering to the SIRI-LITE standard" but doesn't link it anywhere. Don't know if the official SIRI spec by CEN has anything, because it's paywalled. But to me it looks like the "lite" standard is not part of the SIRI spec.

I can do a "best effort" implementation mapping the XML to JSON, but since there is no validation and no documentation, I have little way of knowing if it's correct.

rcavaliere commented 2 weeks ago

@clezag yes, I understand. What we can simply do is take the given fields and provide the responses as JSON instead of XML. Try to do it like this, and I will then make a first check. We will then provide the data as such and find out from the national authorities whether we need to adapt something.
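
A rough sketch of such a "best effort" XML-to-JSON mapping, using only the standard library. This is not based on any SIRI Lite documentation (none was available in this thread): it simply mirrors element names as JSON keys, ignores attributes and namespaces, and keeps all values as strings. The sample fragment is made up and heavily simplified.

```python
import json
import xml.etree.ElementTree as ET


def element_to_obj(elem: ET.Element):
    """Map an element to a string (leaf) or a dict of its children."""
    children = list(elem)
    if not children:
        return (elem.text or "").strip()
    obj = {}
    for child in children:
        value = element_to_obj(child)
        if child.tag in obj:
            # Repeated sibling tags are collected into a list.
            if not isinstance(obj[child.tag], list):
                obj[child.tag] = [obj[child.tag]]
            obj[child.tag].append(value)
        else:
            obj[child.tag] = value
    return obj


# Invented sample fragment, for illustration only.
sample = """<FacilityCondition>
  <FacilityRef>IT:ITH10:Parking:115</FacilityRef>
  <MonitoredCounting>
    <CountedFeatureUnit>bays</CountedFeatureUnit>
    <Count>42</Count>
  </MonitoredCounting>
</FacilityCondition>"""

root = ET.fromstring(sample)
doc = {root.tag: element_to_obj(root)}
print(json.dumps(doc, indent=2))
```

One known weakness of this naive mapping is that an element occurring once becomes an object while the same element occurring twice becomes a list, so consumers see a shape that depends on the data.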

rcavaliere commented 2 weeks ago

DSRM-Architettura Target_Dati dinamici e sharing v1.2_signed.pdf

clezag commented 2 weeks ago

We will do a baseline implementation and then iterate on it once we get feedback:

rcavaliere commented 2 weeks ago

> @rcavaliere
>
> * Implemented separate composite frames (might have to select "show page source" in your browser, as it's not valid XML anymore without the larger Netex context)
>
> * modified filters and operators for the echarging parking providers
>
> I think that's all the open points for static Netex, looking forward to your feedback

The link with the "sharing" export seems to be broken, can you please check?

rcavaliere commented 2 weeks ago

@clezag wonderful work on the NeTEx export. Everything looks fine, all services are included, really excellent! Just ensure that Davide is able to integrate your composite frames into the general NeTEx export, and we can close this first development sprint.

rcavaliere commented 2 weeks ago

@clezag also the first SIRI feed available at https://sta-netex.opendatahub.testingmachine.eu/siri/fm/parking is OK. Please just add here all the services, and activate a similar end-point for sharing services, which could be e.g. https://sta-netex.opendatahub.testingmachine.eu/siri/fm/sharing

sseppi commented 2 weeks ago

@rcavaliere @clezag

During the meeting, the need emerged to expose this endpoint to the public and to avoid having hidden custom endpoints developed and maintained for only one customer.

rcavaliere commented 2 weeks ago

@sseppi yes, of course we plan to do something like this! But please let's first complete this implementation, we are still in a first development phase.

sseppi commented 2 weeks ago

@rcavaliere sure

If I got it right during the conversation, the main point brought up by @ohnewein was about the name of the domain, which should be more generic. In general our endpoints shouldn't be developed specifically for the usage of one customer; for this reason the name of the customer shouldn't be in the domain.

Since the plan is to extend this to NeTEx, my proposal was to use netex.opendatahub.com directly.

@ohnewein: please comment if I got something wrong.

clezag commented 2 weeks ago

@rcavaliere I've added the missing services to SIRI parking.

On SIRI sharing however, I think we are missing a piece: For station based sharing services, it looks like the standard needs the BikesharingStations and CarsharingStations themselves registered as parking facilities, so that we can then reference them in SIRI as FacilityRef.

Should I go ahead and add them to the Netex parking export?
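
If the stations are added to the NeTEx parking export, a simple consistency check between the two feeds could look like the sketch below. It assumes the IDs have already been extracted from both exports; the function name and the IDs are made up for the example.

```python
from typing import Iterable, List


def dangling_facility_refs(netex_ids: Iterable[str], facility_refs: Iterable[str]) -> List[str]:
    """Return FacilityRef values that have no matching ID in the NeTEx export."""
    known = set(netex_ids)
    return sorted({ref for ref in facility_refs if ref not in known})


# Placeholder IDs, invented for the example.
netex_ids = ["IT:ITH10:Parking:station-1", "IT:ITH10:Parking:station-2"]
siri_refs = ["IT:ITH10:Parking:station-1", "IT:ITH10:Parking:station-9"]
missing = dangling_facility_refs(netex_ids, siri_refs)
print(missing)
```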

rcavaliere commented 2 weeks ago

> @rcavaliere sure
>
> If I got it right during the conversation, the main point brought up by @ohnewein was about the name of the domain, which should be more generic. In general our endpoints shouldn't be developed specifically for the usage of one customer; for this reason the name of the customer shouldn't be in the domain.
>
> Since the plan is to extend this to NeTEx, my proposal was to use netex.opendatahub.com directly.
>
> @ohnewein: please comment if I got something wrong.

Sure, the end-points that were set are not going to be the final ones. Please don't consider those things until we are in a consolidated phase; we will define the end-points all together later.

rcavaliere commented 2 weeks ago

> @rcavaliere I've added the missing services to SIRI parking.
>
> On SIRI sharing however, I think we are missing a piece: For station based sharing services, it looks like the standard needs the BikesharingStations and CarsharingStations themselves registered as parking facilities, so that we can then reference them in SIRI as FacilityRef.
>
> Should I go ahead and add them to the Netex parking export?

@clezag I had a deeper look at this. The specification defines how to provide real-time information with SIRI FM for both station-based and free-floating services. In the specification document I drafted, you should have all the details in paragraphs 3.2.4, 3.2.5 and 3.2.6, so in this first MVP let's try to implement this end-point according to these inputs. If you have specific questions, let's analyze them together.

ohnewein commented 2 weeks ago

> Sure, the end-points that were set are not going to be the final ones. Please don't consider those things until we are in a consolidated phase; we will define the end-points all together later.

In today's discussion it emerged that the end-point will be used by other developers to implement their part. That means a dependency will be created, which will then make name changes more and more difficult.

We should never publish something for a single use case without generalizing it for as many users/consumers as possible. Therefore, even if this specific end-point is meant mainly for these specific developers, we also have to use a generic name that describes the data provided rather than containing the name of the potential data consumer, and we have to include the end-point in our official documentation. It is an open platform with open end-points, which have to be as transparent as possible.

Considering all these discussions, I strongly suggest always starting with a public end-point and general consumption from the beginning, which reduces friction, legacy dependencies, etc. later on.

rcavaliere commented 2 weeks ago

@clezag for real-time charging data, I have just checked: in this case the standard value for MonitoredCounting/CountedFeatureUnit has to be "devices" instead of "bays". Can you please fix this? I have corrected the reference specification document, which is in the main description of this user story.
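
In code, the requested change amounts to choosing the unit by station type. A tiny sketch, where the station type name is an assumption made for the example:

```python
# Station type names are assumptions for this sketch.
CHARGING_TYPES = {"EChargingStation"}


def counted_feature_unit(station_type: str) -> str:
    """Pick the CountedFeatureUnit value: "devices" for charging, "bays" otherwise."""
    return "devices" if station_type in CHARGING_TYPES else "bays"


print(counted_feature_unit("EChargingStation"), counted_feature_unit("ParkingStation"))
```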

clezag commented 2 weeks ago

@rcavaliere All open points should now be implemented.

Also, we're now always exporting a complete valid Netex file (instead of fragments), to accommodate potential third party users of the API. I've also added an endpoint that combines both parking and sharing into one export: /netex

relevant to your request @ohnewein @sseppi

rcavaliere commented 1 week ago

@clezag I have checked the latest version of the SIRI end-points. Everything should be OK! Just two questions on the data exposed:

Image