TransitTracker / backend

Realtime transit overview for multiple cities across Canada
https://api.transittracker.ca
MIT License

run just the frontend with any GTFS-RT feed? #12

Closed derhuerst closed 3 years ago

derhuerst commented 3 years ago

Hey!

This question is similar to #9: I'd like to use the frontend you've built, but with any GTFS-RT feed, and without running an adapter API server behind.

I was planning to build a general-purpose (as in city-/feed-independent) GTFS-RT visualiser, and came across your project. Would it be possible to restructure this project so that the frontend can directly consume a GTFS-RT feed?

FelixINX commented 3 years ago

Hi @derhuerst, I'm sorry I didn't see your message earlier, I was busy at the end of 2020.

I already have an idea in mind. You would be able to just drag and drop your protobuf file into Transit Tracker. I hope to get this done by the next month.

Again sorry for the delay!

derhuerst commented 3 years ago

By now, I have built gtfs-rt-inspector, a (somewhat barebones) GTFS-RT inspection & visualization thingy.

Let's see if we can collaborate and eventually merge into one project.

FelixINX commented 3 years ago

Looks nice, good job!

I'm open to collaborating on a future merge. This is currently the repo for the backend server, which also hosts the frontend part, but I'm working on creating a dedicated Nuxt app for the frontend.

My idea for the "bring your own gtfs" feature was either to use client-side JavaScript or use a serverless app. What is your idea?

derhuerst commented 3 years ago

I'm open to collaborating on a future merge.

Happy to hear that!

This is currently the repo for the backend server, which also hosts the frontend part, but I'm working on creating a dedicated Nuxt app for the frontend.

I don't care about which specific frontend lib is being used, as long as it's roughly React-like (mostly stateless tree of components, state modelled somewhere else) and not too complex.

derhuerst commented 3 years ago

My idea for the "bring your own gtfs" feature was either to use client-side JavaScript or use a serverless app. What is your idea?

A server-side implementation allows for much more efficient lookups (e.g. reading just one trip with a specific ID). On the other hand, backends always cause some maintenance efforts. From my experience, even "serverless" deployments need to be upgraded, the provider closes down or cancels its free plan, the deployment URL changes, etc.

Both with back-end as well as front-end GTFS handling, there will soon be multi-GB (unzipped) datasets involved, as transit data (finally) gets more integrated across cities, regions & countries.

It might be a little ambitious, but I think it would be cool to follow – or maybe extend but stay true to the idea – one of the projects that try to make transit data more client-queryable by formatting & serving it as Linked Data, sorted & chunked by semantical properties (e.g. departure time). @pietercolpaert's PhD has a high-level and very readable explainer on why & how. Linked Connections evolved the idea into a fully functioning system.

This might seem a little unrelated, but after thinking about it twice, I noticed that we actually try to solve the same problem: How do we make this chunk of GTFS client-accessible, with a good tradeoff between maintenance and efficiency?

Instead of designing yet another system to allow client access to GTFS-like datasets, why not give this concept a twist, so that the data is formatted in a way that it naturally allows for efficient client access?
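
To make the "sorted & chunked by departure time" idea a bit more concrete, here is a minimal sketch. It is not the Linked Connections implementation: the connection shape, the function names `chunkByDeparture` and `pagesForWindow`, and the one-hour page size are all assumptions made up for illustration. The point is only that pages keyed by time can sit on a static file host, and a client can work out which pages it needs without a query API.

```javascript
// Hypothetical sketch: group connections into fixed-size time pages so a
// client can fetch only the pages it needs from a static file host.
const PAGE_SECONDS = 3600; // assumed page size: one hour

// Each connection is assumed to look like { tripId, departure }, with
// `departure` given in seconds since midnight.
function chunkByDeparture(connections, pageSeconds = PAGE_SECONDS) {
  const pages = new Map();
  const sorted = [...connections].sort((a, b) => a.departure - b.departure);
  for (const connection of sorted) {
    const pageStart = Math.floor(connection.departure / pageSeconds) * pageSeconds;
    if (!pages.has(pageStart)) pages.set(pageStart, []);
    pages.get(pageStart).push(connection);
  }
  return pages;
}

// Page keys a client would have to request for a [from, to) time window.
function pagesForWindow(from, to, pageSeconds = PAGE_SECONDS) {
  const keys = [];
  const first = Math.floor(from / pageSeconds) * pageSeconds;
  for (let t = first; t < to; t += pageSeconds) {
    keys.push(t);
  }
  return keys;
}
```

Each page could then be written out as its own file, so that looking up departures in a given window costs a handful of static requests instead of a full dataset download.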

FelixINX commented 3 years ago

This might seem a little unrelated, but after thinking about it twice, I noticed that we actually try to solve the same problem: How do we make this chunk of GTFS client-accessible, with a good tradeoff between maintenance and efficiency?

Good question. Front-end only is feasible if you only want to analyze the realtime feeds, without linking the static data. The Transit Tracker (TT) frontend can already display almost all information from a vehiclePosition feed. The backend also handles large datasets quite well. The data is returned through a single-format API (old version, v2 WIP) and is accessible to other developers who don't want to learn how protobuf works. What Transit Tracker lacks is support for other feeds (alerts and tripUpdates), since my app is not intended for the general public, it's intended for transit enthusiasts.

I'm gonna work on a concept where you could drop a vehiclePosition file and then it would be processed by the backend and returned to the client. I hope to get it ready by next week, I'll keep you updated.

derhuerst commented 3 years ago

This might seem a little unrelated, but after thinking about it twice, I noticed that we actually try to solve the same problem: How do we make this chunk of GTFS client-accessible, with a good tradeoff between maintenance and efficiency?

Good question. Front-end only is feasible if you only want to analyze the realtime feeds, without linking the static data.

I meant that I think it's possible to process GTFS Static data client-side as well, with a bit of pre-processing of course.

[...] my app [TransitTracker] is not intended for the general public, it's intended for transit enthusiasts.

What about transit enthusiasts/professionals in other cities & countries?

Are you planning to stay focused on your city/provider?

I'm gonna work on a concept where you could drop a vehiclePosition file and then it would be processed by the backend and returned to the client. I hope to get it ready by next week, I'll keep you updated.

You could do this client-side, right?

FelixINX commented 3 years ago

I meant that I think it's possible to process GTFS Static data client-side as well, with a bit of pre-processing of course.

I have never tried to load that much data in the browser, but let's try it!

What about transit enthusiasts/professionals in other cities & countries?

I'm open to expansion or to having multiple instances of Transit Tracker for multiple regions. Support for custom GTFS-RT feeds is the first step in this direction.

You could do this client-side, right?

Yes, it would actually be easier. I'm still hoping to make a functional demo with vehiclePosition by the end of the week.

derhuerst commented 3 years ago

What about transit enthusiasts/professionals in other cities & countries?

I'm open to expansion or to having multiple instances of Transit Tracker for multiple regions. Support for custom GTFS-RT feeds is the first step in this direction.

I think we still have a misunderstanding here. 🤔 Why would there be multiple instances, if the GTFS-RT & GTFS loading was entirely client-side?

FelixINX commented 3 years ago

Sorry, I mean that I'll keep both options:

FelixINX commented 3 years ago

Here is the demo: dev.transittracker.ca/byod

Small or medium feeds are running ok, but large feeds (STM for instance, with 40 000+ trips) are less responsive. The app is still not 100% functional, only client-side feeds are working at the moment. Here is what I have planned:

derhuerst commented 3 years ago

Here is the demo: dev.transittracker.ca/byod

Nice to see the progress!

Small or medium feeds are running ok, but large feeds (STM for instance, with 40 000+ trips) are less responsive.

Huh. Why is that? Are there some optimisations you have in mind?

I'm planning to work with very large feeds soon, e.g. the Germany-wide GTFS feed with >1m trips. Obviously, we can't store all of them in localStorage, so figuring out a smart way to serve GTFS data (statically) is needed, so that clients can access subsets of the data.

The app is still not 100% functional, only client-side feeds are working at the moment.

I didn't fully understand this yet... What benefits do the server-side feeds provide?

  • [ ] Handle ZIP files

My very personal opinion: GTFS should gradually move away from zipping feeds: It's a bad archive format, it hinders portability & archivability a lot, and there's lots of tooling for serving a set of files in a compressed way.

  • [ ] Support for alert feed
  • [ ] Support for trip update
  • [ ] Support for URL (for feeds with CORS *)

👍

FelixINX commented 3 years ago

Huh. Why is that? Are there some optimisations you have in mind?

I'll be looking into that. IndexedDB should offer better performance since it has indexes and can be run asynchronously. Maybe I'll optimize the parser as well. I haven't tried it, but is your parser compatible with client-side JavaScript (e.g. File) or just with Node.js?

What benefits do the server-side feeds provide?

To continue to serve current users, who are mostly people who are not even aware that the GTFS format exists. Also, transit agencies in Quebec love to put API keys and restrictive CORS headers on their feeds, so I have to download them through a server. I'll make sure that the app still works without a server, for those who want it. So no need to worry about that 👍

My very personal opinion: GTFS should gradually move away from zipping feeds: It's a bad archive format, it hinders portability & archivability a lot, and there's lots of tooling for serving a set of files in a compressed way.

I agree. It's still the "standard" way though, so for ease of use I'll add ZIP support later on.

derhuerst commented 3 years ago

I haven't tried it, but is your parser compatible with client-side JavaScript (e.g. File) or just with Node.js?

All of the actual GTFS logic is implemented in an environment-independent way: The higher level tools accept async iterators of rows. From the readme:

streaming/iterative on sorted data

Whenever possible, all gtfs-utils tools will only read as little data into memory as possible. As public transportation systems will hopefully become more integrated over time, GTFS datasets will often be multiple GBs large. GTFS processing should work in memory-constrained Raspberry Pis or FaaS environments as well. [...]

data-source-agnostic

gtfs-utils does not make assumptions about where you read the GTFS data from. Although it has a built-in tool to read CSV from files on disk, anything is possible: in-memory buffers, streaming HTTP, dat/IPFS, etc.

gtfs-utils/read-csv, the default CSV reader, uses Node.js' fs API, but you could swap that with e.g. fetch.
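
As an illustration of what "async iterators of rows" can look like in practice, here is a minimal sketch. It does not use or reproduce the gtfs-utils API: `readCsvRows` and `countTripsPerRoute` are hypothetical names, and the CSV parsing is deliberately naive (no quoted fields). The chunk source can be anything that yields text asynchronously, which is what makes the approach independent of Node.js' fs API.

```javascript
// Hypothetical sketch: turn any async iterable of text chunks into rows.
// Naive CSV splitting: no quoted fields, no embedded commas or newlines.
async function* readCsvRows(chunks) {
  let buffer = "";
  let header = null;
  const toRow = (line) => {
    const cells = line.split(",");
    return Object.fromEntries(header.map((name, i) => [name, cells[i]]));
  };
  for await (const chunk of chunks) {
    buffer += chunk;
    const lines = buffer.split(/\r?\n/);
    buffer = lines.pop(); // keep the trailing partial line for the next chunk
    for (const line of lines) {
      if (line === "") continue;
      if (header === null) header = line.split(",");
      else yield toRow(line);
    }
  }
  // Flush a last line that was not terminated by a newline.
  if (buffer !== "" && header !== null) yield toRow(buffer);
}

// A consumer that only ever holds one row in memory at a time.
async function countTripsPerRoute(rows) {
  const counts = new Map();
  for await (const row of rows) {
    counts.set(row.route_id, (counts.get(row.route_id) || 0) + 1);
  }
  return counts;
}
```

In a browser, the chunks could come from a fetch response body or a dropped File, decoded to text first; the consumer would not need to change.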

Please open an Issue over there if you have questions.

FelixINX commented 3 years ago

Sorry for the delay, it took longer than expected to implement IndexedDB. All routes, trips and vehicles are now stored in the local browser database. Importing ~200 000 trips takes a couple of seconds.

I have also improved the UI/UX. You can try everything at: https://dev.transittracker.ca/byod

derhuerst commented 3 years ago

I just tried importing the 2021-02-05 VBB GTFS feed's routes.csv & trips.csv files. Both imports worked, but there are some minor UX issues:

I wonder though, is there a way to use imported GTFS data with a live GTFS-RT feed?

And, in addition, is there a way to consume a GTFS-RT feed that combines both VehiclePositions & TripUpdates? Currently, I have to manually post-process my feed.

FelixINX commented 3 years ago

With trips.csv, the UI stopped showing the spinner, but then it took another ~10s until it showed the "217670 trips saved in your browser." message.

I'll be working on a fix later on.

If I'm importing a local file into the local browser storage, it shouldn't say "upload" but sth. like "import".

Thanks for the suggestion, it makes sense.

If I leave the BYOD tab and open it again, it will again take a while to check the number of locally stored routes/trips.

Hmm, I am not able to reproduce this in either Firefox or Chrome. It takes about a second to fetch the count. What browser are you using?

The city feed picker in the top bar doesn't seem to work.

I simply haven't built it yet.

I wonder though, is there a way to use imported GTFS data with a live GTFS-RT feed?

To be sure, you are talking about a feature that will automatically fetch a GTFS-RT feed on a remote server? It's almost ready, I'll let you know when I push it.

And, in addition, is there a way to consume a GTFS-RT feed that combines both VehiclePositions & TripUpdates? Currently, I have to manually post-process my feed.

Only VehiclePositions are supported for now, but I am working on TripUpdates. When you import a file, any entity whose id already exists will be overwritten. So you can import as many files as you want: VehiclePositions, and in the future TripUpdates and Alerts.
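
The overwrite-by-id behaviour described here can be pictured with a small sketch. This is not the Transit Tracker code: `mergeEntities` is a hypothetical helper and a plain Map stands in for the browser database.

```javascript
// Hypothetical sketch: merge imported feed entities into one store keyed by
// entity id. A Map stands in for the real browser database here.
function mergeEntities(store, entities) {
  for (const entity of entities) {
    // An entity with an id that is already present replaces the older one.
    store.set(entity.id, entity);
  }
  return store;
}
```

Importing several files in a row then leaves exactly one entry per entity id, holding whatever the most recent import said about it.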

derhuerst commented 3 years ago

If I leave the BYOD tab and open it again, it will again take a while to check the number of locally stored routes/trips.

Hmm, I am not able to reproduce this in either Firefox or Chrome. It takes about a second to fetch the count. What browser are you using?

Chromium 89 on macOS 10.15.7.

I wonder though, is there a way to use imported GTFS data with a live GTFS-RT feed?

To be sure, you are talking about a feature that will automatically fetch a GTFS-RT feed on a remote server? It's almost ready, I'll let you know when I push it.

I'm talking about consuming a (CORS-enabled) GTFS-RT feed client-side, like gtfs-rt-inspector does it.

And, in addition, is there a way to consume a GTFS-RT feed that combines both VehiclePositions & TripUpdates? Currently, I have to manually post-process my feed.

Only VehiclePositions are supported for now, but I am working on TripUpdates.

Just to explain why I'm asking: Currently it fails (silently in the console) because it tries to access feedEntity.vehicle.trip, but feedEntity.vehicle is null with mixed-TripUpdates-and-VehiclePositions-feeds.
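
A minimal sketch of the kind of guard that avoids this failure, assuming decoded entities expose optional `vehicle` and `tripUpdate` fields (the camelCase names are an assumption about the decoder's output; `splitFeedEntities` is a hypothetical helper, not the app's code):

```javascript
// Hypothetical sketch: sort the entities of a mixed GTFS-RT feed by what they
// carry, instead of assuming every entity has a `vehicle`.
function splitFeedEntities(entities) {
  const vehiclePositions = [];
  const tripUpdates = [];
  for (const entity of entities) {
    // `!= null` covers both null and undefined, so a missing field is skipped
    // rather than dereferenced.
    if (entity.vehicle != null) vehiclePositions.push(entity.vehicle);
    if (entity.tripUpdate != null) tripUpdates.push(entity.tripUpdate);
  }
  return { vehiclePositions, tripUpdates };
}
```

With this split, code that reads `vehicle.trip` only ever sees entities that actually contain a vehicle position.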

FelixINX commented 3 years ago

Just pushed:

You can try to reset your browser store, it might help. Your browser might have an old version, with different keys, and that would explain the slow queries.

derhuerst commented 3 years ago

Importing my GTFS-RT feed (un-gzipped of course, GitHub didn't allow me to upload a plain .pbf file) failed with this error:

{
    name: "DataError",
    message: "Failed to execute 'put' on 'IDBObjectStore': Evaluating the object store's key path did not yield a value.",
    code: 0,
    stack: `\
Error: Failed to execute 'put' on 'IDBObjectStore': Evaluating the object store's key path did not yield a value.
    at IDBObjectStore.put (<anonymous>)
    at https://dev.transittracker.ca/_nuxt/7c38af6.js:2:379892
    at Jt (https://dev.transittracker.ca/_nuxt/7c38af6.js:2:343670)
    at new Yt (https://dev.transittracker.ca/_nuxt/7c38af6.js:2:343168)
    at Object.mutate (https://dev.transittracker.ca/_nuxt/7c38af6.js:2:379024)
    at Object.mutate (https://dev.transittracker.ca/_nuxt/7c38af6.js:2:395565)
    at https://dev.transittracker.ca/_nuxt/7c38af6.js:2:358113
    at l (https://dev.transittracker.ca/_nuxt/7c38af6.js:2:351963)
    at https://dev.transittracker.ca/_nuxt/7c38af6.js:2:352347
    at Oi (https://dev.transittracker.ca/_nuxt/7c38af6.js:2:350363)`,
}
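
For context: IndexedDB raises this DataError when an object store uses a key path and the record passed to put() has no usable value at that path. The thread does not say what the actual cause or fix was, so the following is only a sketch of a pre-check one could run before storing records. `evaluateKeyPath` and `partitionByKey` are hypothetical helpers, and the validity check is simplified (it only rejects null and undefined, whereas IndexedDB also rejects other non-key types such as booleans).

```javascript
// Hypothetical sketch: resolve a dot-separated key path (e.g. "id" or
// "trip.tripId") on a record, roughly the way an object store would.
function evaluateKeyPath(record, keyPath) {
  let value = record;
  for (const part of keyPath.split(".")) {
    if (value == null) return undefined;
    value = value[part];
  }
  return value;
}

// Separate records that have a key from those that would make put() throw.
// Simplified: only null/undefined keys are treated as unusable.
function partitionByKey(records, keyPath) {
  const storable = [];
  const rejected = [];
  for (const record of records) {
    const key = evaluateKeyPath(record, keyPath);
    (key == null ? rejected : storable).push(record);
  }
  return { storable, rejected };
}
```

Rejected records could then be logged or reported in the UI rather than aborting the whole import.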

I used https://vbb-gtfs.jannisr.de/2021-02-05/routes.csv and https://vbb-gtfs.jannisr.de/2021-02-05/trips.csv as GTFS Static files.

FelixINX commented 3 years ago

It's fixed now! Just tested from scratch and no errors. Also, the icon will now match the route_type field or fallback on the default value specified when creating the agency.
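
The route_type fallback described above could look roughly like the sketch below. The numeric codes are the basic GTFS route_type values (0 tram, 1 subway, 2 rail, 3 bus, 4 ferry); the icon names, the `iconForRoute` function and the `agencyDefaultIcon` parameter are assumptions for illustration and not the app's actual identifiers.

```javascript
// Hypothetical sketch: pick an icon from a route's GTFS route_type, falling
// back to the agency's default when the type is missing or not mapped.
const ICONS_BY_ROUTE_TYPE = {
  0: "tram",
  1: "subway",
  2: "train",
  3: "bus",
  4: "ferry",
};

function iconForRoute(route, agencyDefaultIcon) {
  // CSV values arrive as strings, so parse before the lookup.
  const type = Number.parseInt(route.route_type, 10);
  return ICONS_BY_ROUTE_TYPE[type] ?? agencyDefaultIcon;
}
```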

FelixINX commented 3 years ago

I was busy last week, but I pushed an update yesterday.

derhuerst commented 3 years ago

Importing worked now, using the 2021-02-12 VBB GTFS feed. It still takes ~10s until the nr of imported trips shows up.

When fetching a remote GTFS-RT feed (https://v0.berlin-gtfs-rt.transport.rest/feed), it fails with this error:

Uncaught (in promise) TypeError: d is undefined
    NuxtJS 53
        processCustomFeed
        processCustomFeed
        x
        dispatch
        dispatch
        dispatch
        loadRemote
        promise callback*loadRemote/<
        loadRemote
        x
        dispatch
        dispatch
        fetchRemoteUrl
        saveRemoteUrl
        promise callback*saveRemoteUrl
        Qt
        n
        Qt
        $emit
        click
        Qt
        n
        _wrapper
        jr
        de
        $r
        x
        v
        bo
        _update
        r
        get
        wn
        mount
        $mount
        init
        v
        v
        w
        v
        bo
        _update
        r
        get
        run
        gn
        ce
        ie
        promise callback*ee
        ce
        update
        update
        notify

I have saved a slightly newer (and hence slightly different) version of the feed to my computer; importing that file manually worked fine, I got the correct number of 904 vehicle positions.

FelixINX commented 3 years ago

I have fixed the remote URL feature. Regarding the long waiting time, there is not much I can do because it's just a slow IndexedDB query, but I'll think of something.

FelixINX commented 3 years ago

I will now close this issue since most features I wanted to include are now available and I consider the feature to be stable enough.

However, if you notice any bug or would like to see more features, please let me know!

I will add feature ideas on the new repo TransitTracker/frontend.

Thanks!