noi-techpark / odh-mentor-otp


A3 Automatic mechanisms for non-real-time data import #7

Closed: rcavaliere closed this issue 2 years ago

rcavaliere commented 3 years ago

@bertolla will implement this mechanism once the shared repository with STA (where the GTFS exports are periodically uploaded) is in place

rcavaliere commented 3 years ago

@bertolla @zabuTNT @stefanocudini finally we have the new GTFS export: https://cloud.opendatahub.bz.it/index.php/s/j3qetDsFde6H2pR

@bertolla can you rebuild the OTP graph with this new export? We should then see all data with shapes. Real-time data is available for all trains and for the city bus services in Bolzano and Merano (SASA). @zabuTNT and @stefanocudini, can you please check whether the coupling of planned vs. real-time data is OK, or whether something has to be fixed?

Thanks to you all for the work!

bertolla commented 3 years ago

Should be working correctly now. Cheers, Patrick

stefanocudini commented 3 years ago

hi @bertolla

in your infrastructure config here: https://github.com/noi-techpark/odh-mentor-otp/blob/development/infrastructure/docker-compose.run.calculate.yml this environment variable is missing: GTFS_RT_URL=https://efa.sta.bz.it/gtfs-r/

Our PR refers to the operation of the docker-compose.yml file; I am not modifying the rest of the configuration inside infrastructure because I have no way to test it. I think this is good practice.

zabuTNT commented 3 years ago

@bertolla not only GTFS_RT_URL, but GTFS_FEED_ID is missing too.
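A minimal sketch (not taken from the repository) of how the two variables might be added under the service's `environment` section in docker-compose.run.calculate.yml; the service name and the GTFS_FEED_ID value are placeholders:

```yaml
services:
  otp:                     # placeholder service name
    environment:
      - GTFS_RT_URL=https://efa.sta.bz.it/gtfs-r/
      - GTFS_FEED_ID=sta   # placeholder value, not the project's actual feed id
```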

stefanocudini commented 3 years ago

hi @rcavaliere @bertolla I have another suggestion for the generation of shapes.txt within the new GTFS. This is a sample row (CSV): `"0-108-21a-1.1.R","46.1982403962238","11.1253506643881","26","1199.96"`. The shapes.txt file is 191 MB, and I believe a good part of this size is due to an excessive number of decimal digits used to describe the coordinates. We have evaluated that a good compromise between precision and the size/generation time of the graph is 6 digits after the decimal point.

I also believe that the excessive precision degrades the loading of the tracks on the client/browser side.
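A minimal sketch of the kind of post-processing suggested here, assuming the column order shown in the sample row above (shape id, latitude, longitude, sequence, distance) and rounding the coordinates to 6 decimal places; file names are placeholders:

```python
# Sketch: rewrite shapes.txt with coordinates rounded to 6 decimal places.
# Assumes the column order of the sample row above; a header line (if present)
# is copied through unchanged.
import csv

def round_shapes(src_path: str, dst_path: str, digits: int = 6) -> None:
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst, quoting=csv.QUOTE_ALL)
        for row in reader:
            try:
                shape_id, lat, lon, seq, dist = row
                row = [
                    shape_id,
                    f"{float(lat):.{digits}f}",
                    f"{float(lon):.{digits}f}",
                    seq,
                    dist,
                ]
            except ValueError:
                pass  # header or unexpected row: copy unchanged
            writer.writerow(row)

# round_shapes("shapes.txt", "shapes_rounded.txt")
```

At 6 decimal places the coordinate resolution is roughly 0.1 m, which is more than enough for visualizing transit shapes.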

rcavaliere commented 3 years ago

@stefanocudini good input. Looking at the GTFS data, I think there is something to be improved; for example, the shapes are not properly visualized in all types of search. But could this be related to the client rather than to the GTFS dataset?

rcavaliere commented 3 years ago

OK, what we are still missing is to start the automation of the new GTFS imports with STA. But let's first close the other issues, such as #8, #11 and #15.

rcavaliere commented 3 years ago

Will be defined in today's Sprint Meeting with OpenMove and STA

rcavaliere commented 3 years ago

Check this: https://github.com/OneBusAway/onebusaway-gtfs-modules. This should allow automatic evaluations and highlight whether a GTFS export has changed.

rcavaliere commented 3 years ago

If needed, we can use the checksum of the ZIP file.

rcavaliere commented 3 years ago

Proposal for STA: upload a checksum file along with the export, which we then check on our side.

rcavaliere commented 3 years ago

@zabuTNT @stefanocudini @bertolla question: shouldn't the checksum already be part of the ZIP file? I don't think we need anything different when the GTFS export is generated. Or should STA do something different / more?
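For context (not part of the original discussion): a ZIP archive does store a CRC-32 for each entry, which can be read without extracting anything. A minimal sketch, with a placeholder file name:

```python
# Sketch: list the per-entry CRC-32 values already stored inside a ZIP archive.
import zipfile

with zipfile.ZipFile("gtfs_export.zip") as zf:   # placeholder file name
    for info in zf.infolist():
        print(f"{info.filename}: CRC-32 = {info.CRC:08x}")
```

Note that these CRCs cover the individual entries rather than the archive as a whole, which may be why a separate whole-file checksum (or the FTP HASH command discussed below) was being considered.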

stefanocudini commented 3 years ago

@bertolla @rcavaliere I am running tests on this. The optimal solution would be to simply enable the HASH command in the FTP server configuration. Depending on the type of server this could be very quick to do, and it would also be convenient for other cases of file distribution via FTP: this option allows any FTP client to quickly verify whether distributed files have been modified, without transferring the data.

If this is not possible, we are already working on the less optimal but working solution.

About the FTP HASH command: https://www.jscape.com/blog/bid/48215/New-FTP-Command-HASH-for-Requesting-Hash-of-a-File
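A minimal sketch of the client-side behaviour described here, with placeholder host, credentials and path: try the HASH command first and, if the server does not support it, fall back to streaming the file and hashing it locally (the "less optimal but working" approach):

```python
# Sketch: ask the FTP server for a file hash via the HASH extension; if the
# command is not supported, download the file in chunks and hash it locally.
import hashlib
from ftplib import FTP, error_perm

def remote_gtfs_hash(host: str, user: str, password: str, path: str) -> str:
    with FTP(host) as ftp:
        ftp.login(user, password)
        try:
            return ftp.sendcmd(f"HASH {path}")   # servers with HASH reply with the digest
        except error_perm:
            digest = hashlib.sha256()            # fallback: transfer and hash locally
            ftp.retrbinary(f"RETR {path}", digest.update)
            return digest.hexdigest()

# print(remote_gtfs_hash("ftp.example.org", "user", "secret", "gtfs_export.zip"))
```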

rcavaliere commented 3 years ago

@stefanocudini @zabuTNT @bertolla STA does not know how to do this and needs support on how to configure it on IIS.

stefanocudini commented 3 years ago

@rcavaliere our latest PR works even without this optimization.

rcavaliere commented 3 years ago

ok, let's discuss this later

rcavaliere commented 3 years ago

Currently implemented: at present the system is configured to download the GTFS export and to start the rebuild of the OTP graph once a day (at 3 AM). It remains to be understood whether we are able to support STA in providing the checksum functionality (HASH command). @rcavaliere will provide some inputs to STA (e.g. https://blogs.iis.net/robert_mcmurray/simple-utility-to-calculate-file-hashes)
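A minimal sketch of the conditional rebuild discussed in this thread (not the project's actual implementation; the URL, paths and rebuild command are placeholders): download the export, compare its checksum with the previous run, and rebuild the OTP graph only when the file has changed:

```python
# Sketch: daily job that rebuilds the OTP graph only when the GTFS export changed.
import hashlib
import pathlib
import subprocess
import urllib.request

GTFS_URL = "https://example.org/gtfs_export.zip"      # placeholder URL
GTFS_ZIP = pathlib.Path("/data/gtfs_export.zip")       # placeholder path
HASH_FILE = pathlib.Path("/data/gtfs_export.sha256")   # placeholder path

def daily_update() -> None:
    urllib.request.urlretrieve(GTFS_URL, str(GTFS_ZIP))
    new_hash = hashlib.sha256(GTFS_ZIP.read_bytes()).hexdigest()
    old_hash = HASH_FILE.read_text().strip() if HASH_FILE.exists() else ""
    if new_hash != old_hash:
        subprocess.run(["./rebuild-otp-graph.sh"], check=True)  # placeholder command
        HASH_FILE.write_text(new_hash)

if __name__ == "__main__":
    daily_update()
```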

rcavaliere commented 3 years ago

@stefanocudini @zabuTNT @bertolla unfortunately STA is not able to implement the checksum functionality... so let's leave everything as it is now