mapbox / nepomuk

A public transit router for GTFS feeds (currently only static) written in modern C++
MIT License

Implement Data Provider #154

Open MoKob opened 7 years ago

MoKob commented 7 years ago

Depending on the amount of data and processing required, starting up a server can take a long time. We would like to avoid this unnecessary overhead, especially when testing our server infrastructure.

To do so (and as already outlined in https://github.com/mapbox/nepomuk/issues/3), we can implement a data provider that supplies data to our engine, so that the datasets do not need to be recreated after a failure or on every new startup of the engine.

As outlined in https://github.com/mapbox/nepomuk/issues/3, we should seriously consider a process that does not use shared memory but rather communicates via ZeroMQ. The reasoning is that this would let us avoid the locking problems that come with shared memory regions, and would allow us to distribute workloads onto different machine types.

A data provider should take over the role that is currently handled by master-service::dataset. Instead of owning a dataset and creating the different data structures itself, the master service should hand this responsibility to a data provider that can be located anywhere. The master service handles the communication with the data provider and returns the structures in the same way as it does today.
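A rough sketch of that split, to make the idea concrete. All names here (`DataProvider`, `InMemoryProvider`, `MasterService`, `fetch`, `structure`) are hypothetical and not taken from the current code base. The in-memory provider only stands in for a remote one; a real implementation would presumably forward `fetch` over a ZeroMQ socket.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names throughout; none of these exist in the project as written here.
using Bytes = std::vector<std::uint8_t>;

// Anything that can answer "give me the serialised structure called <name>".
class DataProvider
{
  public:
    virtual ~DataProvider() = default;
    virtual Bytes fetch(std::string const &name) = 0;
};

// Stand-in for a remote provider: keeps pre-built blobs in memory.
class InMemoryProvider final : public DataProvider
{
  public:
    void publish(std::string name, Bytes blob) { blobs[std::move(name)] = std::move(blob); }

    Bytes fetch(std::string const &name) override
    {
        auto const itr = blobs.find(name);
        if (itr == blobs.end())
            throw std::runtime_error("unknown structure: " + name);
        return itr->second;
    }

  private:
    std::map<std::string, Bytes> blobs;
};

// The master service does not build data structures itself. It asks the provider
// once per structure and caches the answer, so callers see one stable interface.
class MasterService
{
  public:
    explicit MasterService(std::shared_ptr<DataProvider> source) : provider(std::move(source))
    {
        assert(provider != nullptr);
    }

    Bytes const &structure(std::string const &name)
    {
        auto itr = cache.find(name);
        if (itr == cache.end())
            itr = cache.emplace(name, provider->fetch(name)).first;
        return itr->second;
    }

  private:
    std::shared_ptr<DataProvider> provider;
    std::map<std::string, Bytes> cache;
};
```

With this shape, swapping the in-memory provider for a networked one would not change any caller of `MasterService::structure`, which is the point of hiding the transport behind the service.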

To do so, we need to serialise all structures into PBF and deserialise them from PBF. The provider should offer the functionality to load a raw GTFS feed from disk and convert it into PBF. The PBF then has to be transferred via ZeroMQ to the MasterService which, in turn, provides access as usual, hiding all the ZeroMQ shenanigans from the rest of the project.
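To illustrate the serialise / deserialise round trip, here is a minimal sketch that hand-rolls the protobuf wire format for a made-up `Stop` record. The record, its field numbers and the function names are assumptions for illustration only. In practice we would use a protobuf library rather than writing varints by hand, and the resulting byte buffer is what would go into a ZeroMQ message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical record; the real GTFS structures are richer than this.
struct Stop
{
    std::uint64_t id;
    std::string name;
};

// Protobuf base-128 varint: 7 payload bits per byte, high bit set on all but the last byte.
void write_varint(Bytes &out, std::uint64_t value)
{
    while (value >= 0x80)
    {
        out.push_back(static_cast<std::uint8_t>(value | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
}

std::uint64_t read_varint(Bytes const &in, std::size_t &pos)
{
    std::uint64_t value = 0;
    for (int shift = 0; shift < 64; shift += 7)
    {
        if (pos >= in.size())
            throw std::runtime_error("truncated varint");
        std::uint8_t const byte = in[pos++];
        value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0)
            return value;
    }
    throw std::runtime_error("varint too long");
}

// Field key = (field number << 3) | wire type; 0 = varint, 2 = length-delimited.
std::uint64_t const ID_KEY = (1u << 3) | 0u;   // assumed: field 1 holds the id
std::uint64_t const NAME_KEY = (2u << 3) | 2u; // assumed: field 2 holds the name

Bytes serialise(Stop const &stop)
{
    Bytes out;
    write_varint(out, ID_KEY);
    write_varint(out, stop.id);
    write_varint(out, NAME_KEY);
    write_varint(out, stop.name.size());
    out.insert(out.end(), stop.name.begin(), stop.name.end());
    return out;
}

Stop deserialise(Bytes const &in)
{
    Stop stop{0, ""};
    std::size_t pos = 0;
    while (pos < in.size())
    {
        std::uint64_t const key = read_varint(in, pos);
        if (key == ID_KEY)
        {
            stop.id = read_varint(in, pos);
        }
        else if (key == NAME_KEY)
        {
            std::uint64_t const length = read_varint(in, pos);
            if (length > in.size() - pos)
                throw std::runtime_error("truncated string");
            stop.name.assign(reinterpret_cast<char const *>(in.data() + pos),
                             static_cast<std::size_t>(length));
            pos += static_cast<std::size_t>(length);
        }
        else
        {
            // This sketch rejects unknown fields instead of skipping them.
            throw std::runtime_error("unexpected field");
        }
    }
    return stop;
}
```

The provider side would call something like `serialise` after parsing the feed, and the MasterService side would call `deserialise` on the received bytes before handing the structure to the rest of the project.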

/cc @daniel-j-h