leonardehrenfried closed this 5 months ago
@leonardehrenfried fine for me, also for the work on the documentation. For all deployment aspects, however, I would ask @dulvui for feedback.
@leonardehrenfried Great work, looks fine to me!
We had a lot of issues with the graph build in the past; currently the graph build works event-driven, like this:
This approach is quite complex and often breaks, so we would much prefer your approach. The only thing we noticed is that these graph files change only every few months, so daily builds would be too much. We would keep an event-driven approach, or at least move the "has the GTFS file changed" check to the GitHub Action. What do you think about this?
@rcavaliere I will deploy the new versions directly to https://journey.opendatahub.testingmachine.eu, since I don't think that we will make changes to the old version. Or do you prefer a third, separate endpoint?
I think it would not be hard to add a step to the GitHub CI that compares the age of the container image with the age of the STA feed. If the STA feed is newer, a graph is built; if the container is newer, nothing happens.
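A minimal sketch of that check, with illustrative helper and variable names that are not part of the repo. It assumes the image creation timestamp can be read with `docker inspect` and that the feed exposes some date; as noted further down in the thread, the STA endpoint currently has no `Last-Modified` header, so the feed's date source is an open question:

```shell
# Hypothetical sketch of the CI age comparison, not the actual workflow.
feed_newer_than_image() {
  # $1 = feed timestamp, $2 = image timestamp (both epoch seconds)
  [ "$1" -gt "$2" ]
}

# In CI this could be wired up roughly like this (assumes the feed
# exposed a usable date, which it currently does not):
# image_epoch=$(date -d "$(docker inspect -f '{{.Created}}' otp-image)" +%s)
# feed_epoch=$(date -d "<feed date>" +%s)
# if feed_newer_than_image "$feed_epoch" "$image_epoch"; then
#   echo "STA feed is newer: rebuild the graph"
# fi
```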
@dulvui yes, no problem on this! Reuse the existing end-point
@dulvui BTW, how do you determine the creation date of the STA feed?
It doesn't have a last-modified header:
```
curl -I https://gtfs.api.opendatahub.com/v1/dataset/sta-time-tables/raw

HTTP/2 200
access-control-allow-origin: *
content-disposition: attachment;filename=sta-time-tables.zip
content-type: application/zip
date: Wed, 24 Jan 2024 11:57:28 GMT
server: Caddy
x-powered-by: Express
content-length: 48425508
```
And https://gtfs.api.opendatahub.com/v1/dataset/sta-time-tables/ also doesn't contain any kind of date.
@leonardehrenfried Great, then let's add this additional step to the CI and see if it works. I will try to move all the calculation logic and the process to the GitHub Action before the next sprint.
Changes in the file are detected by downloading it, computing its hash, and then periodically comparing the new hash with the stored one. You can find part of the logic here: https://github.com/noi-techpark/odh-mentor-otp/blob/e134a66135ff615adb9864e932029f0af55eeb08/gtfs-import-task/gtfs-download.sh#L39 But I saw that it is quite a big and complex bash script, so I might refactor that too.
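The core of that check can be sketched as a small shell function, modelled on the linked `gtfs-download.sh`. The function name and the hash-file location are assumptions for illustration, not the repo's actual interface:

```shell
# Hash-based change detection sketch: returns 0 (changed) when the
# feed file's hash differs from the stored one, 1 (unchanged) otherwise.
feed_changed() {
  # $1 = downloaded feed file, $2 = file storing the previous hash
  new_hash=$(sha256sum "$1" | cut -d' ' -f1)
  old_hash=$(cat "$2" 2>/dev/null || echo "none")
  if [ "$new_hash" != "$old_hash" ]; then
    echo "$new_hash" > "$2"   # remember the new hash for the next run
    return 0                  # changed -> trigger a graph build
  fi
  return 1                    # unchanged -> skip the build
}

# In CI, the feed would first be fetched, e.g.:
# curl -sSfL -o sta-time-tables.zip \
#   https://gtfs.api.opendatahub.com/v1/dataset/sta-time-tables/raw
# if feed_changed sta-time-tables.zip last-feed.sha256; then
#   echo "feed changed, rebuilding graph"
# fi
```

Because the endpoint exposes no date header, hashing the downloaded file is the only reliable change signal mentioned in the thread.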
I'm still a bit unsure which parts of the repo are my responsibility and which ones are @dulvui's so I'd be happy if you could guide me towards the parts you expect me to take over.
Basically, other than documentation, what else should I do during this sprint?
@leonardehrenfried I think everything that has to do with deployment and graph building would be my part; the rest is yours. I'll merge your first draft now and adapt the GitHub Actions if needed. I think by the end of the week your changes might already be online on our testingmachine. Would that work for you? I'll merge now.
Yes that sounds good.
This is my first attempt at moving this repo to OTP 2.4. To be precise, it is the snapshot version 2.5.0_2024-01-19T14-50. It introduces a few changes to how things were done previously, and I want to get feedback on whether this is the correct direction or if you want me to stay compatible with how things were done before.
Graph build & deployment
Previously, the graphs were built in the OTP container itself. After having done this for a few years, I now recommend the following for new deployments:
OSM extraction
Furthermore, rather than using a complicated query to get the correct OSM area, I've switched to the following strategy:
This is probably also faster (but I haven't tested it).
Feedback
I would like to hear whether this plan is ok, or if you have already thought about how you would like building and deployment to work.
Further work
If you tell me that my plan is ok, I'm going to improve and clean up the documentation, which at the moment is not very up to date.