Can we significantly reduce memory required for cell metrics in MLD?

SiarheiFedartsou commented 1 year ago

Problem

.cell_metrics is one of the main contributors to our memory consumption in case of MLD used. It mainly caused by the fact that we support exclude classes. For each exclude class we duplicate huge array of weights/durations/distances, because due to presence of node of particular class in cells these values can be different. I looked into how these values are actually different from each other for different exclude classes and it was not a big surprise that only <1-2% of values have any actually difference, the vast majority of values are exactly the same, i.e. in other words we duplicate huge amount of data! And each new exclude class makes this problem even more painful(especially if we will want to start supporting combination of exclude classes in the future or smth like this)...

How can we try to solve the problem

We can try to store only "base" cell metrics as we store it now(in a view of huge continuous array) and also store a bit array with 1 for those values which are different for any of exclude classes and only for them allocate separate area where we would store it in "compressed" way(may be sorted by index or we could even try smth smarter like minimal perfect hash).

Potential problems?

The main potential problem here is indirection in data access which we introduce here what can cause performance downgrade.

Sorry for not very good explanation(probably will add some more numbers and text later), just got an idea and was looking forward to discuss :)

TheMarex commented 1 year ago

The slowdown due to indirection will be significant for this code path: MLD spends most of its time traversing these arrays, anything you can do to keep things cache local helps a ton. Depending on your requirement is may be okay to have much slower queries for certain exclude classes (like exclude=motorway,ferry) but the premises of the original implementation was to not have those penalties.

One problem with the data layout in OSRM is that it keeps both duration, weight and distance. A much more space-efficient storage is encoding speed/rate (as uint8) and distance (as uint32) and computing duration = distance / speed / weight = distance / rate at runtime. That saves 50% of memory, but is a bit more annoying to execute since this is so deeply baked in tons of the code.

SiarheiFedartsou commented 1 year ago

Thanks for sharing your thoughts @TheMarex !

A much more space-efficient storage is encoding speed/rate (as uint8) and distance (as uint32) and computing duration = distance / speed / weight = distance / rate at runtime. That saves 50% of memory, but is a bit more annoying to execute since this is so deeply baked in tons of the code.

Do you think it will have less negative performance impact taking into account that we will have to do this computation "on the fly"? May be we can refactor code somehow to have different strategies for people who want blazing speed(but agree to pay for it with memory) and who wants to save memory?

Also I have quite a lot of ideas which can be possible to implement if we can make weights more "dynamic", e.g. we can ease time based restrictions usage a lot if just bake knowledge about it to weights in some way(roughly speaking store not just weight, but vector of weights for different time conditions). While it definitely will make OSRM slower, but it will give more flexibility to it. But agree that it is kind of "against" OSRM philosophy to be as fast as possible.

Project-OSRM / osrm-backend