abrensch / brouter

configurable OSM offline router with elevation awareness, Java + Android
MIT License
476 stars 115 forks source link

Use better named `enums` for voice hints internally #591

Closed rkflx closed 12 months ago

rkflx commented 1 year ago

The re-indexing revert is concise and contained in the first patch. Let me know if you prefer to keep or drop the additional cleanup patches on top of that (see commit messages for why I think those might be beneficial).

Additional notes:

rkflx commented 1 year ago

If and when to support strings in the GeoJSON/JSON API (either mandatory or as an option) can be decided later, depending on @nrenner's plans.

@nrenner indicated moving to a string-based GeoJSON API would be accepted (how exactly is to be clarified). Nevertheless, since reverting the indexing change in BRouter now seems to be an acceptable option too, this reduces pressure to make this a now-or-never migration (to avoid disrupting everything twice).

Therefore I'd like to propose the following procedure:

  1. Revert the re-indexing and move to enums at least internally (this PR). – This makes existing BRouter-Web code compatible to both BRouter 1.7.X and 1.6.3 at the same time again.
  2. Add an option to BRouter to emit strings in the GeoJSON API. – Nothing will change, but it will make BRouter compatible with future BRouter-Web version already.
  3. Wait until bikerouter.de migrated to a BRouter release containing the option for the string-based API.
  4. Switch BRouter-Web (and thus later bikerouter.de too) to actually use the string-based API.
afischerdev commented 1 year ago

Could you split the PR into re-index and add enums? The first two commits 82fecf9 and d98b106 bring the expected changes only on json export.

The enums need more testing. But with a split you get soon a library for your work on brouter-web.

quaelnix commented 1 year ago

The enums need more testing.

Why? The last two commits both only contain non-functional code changes or did I miss something?

afischerdev commented 1 year ago

I would like to test against speed and space requirements.

rkflx commented 1 year ago

Could you split the PR into re-index and add enums?

Done. See #594 for the revert. I kept this PR for the enum change to avoid having to move PR comments around too much. Please keep in mind that GitHub does not support stacked PRs for external contributors, so you will see the other commits here too until the other PR has been merged and I will have been able to rebase this PR.

But with a split you get soon a library for your work on brouter-web.

As in tagging a new release which has the API break fixed? That would be great. I'd recommend clearly stating in the release notes that 1.7.0 and 1.7.1 should not be used anymore due to the API break by anyone relying on the GeoJSON HTTP API or the JSON Java API for voice hints.

I would like to test against speed and space requirements.

I doubt there will be a measureable impact, because enums should be relatively well optimized in modern toolchains, we are mostly comparing or assigning enum constants (vs. creating new objects with type enum Command) and even then voice hints are probably rarely what's dominating in performance profiles compared to way finding.

But sure, better be safe than sorry. Could you share your performance testing setup for others to replicate? I assume you have been testing your routing profile changes wrt. to performance too and already have a baseline.

rkflx commented 1 year ago

Rebased on master.

quaelnix commented 12 months ago

I would like to test against speed and space requirements.

@afischerdev the speed difference seems to be negligible on my machine (less than 0.1%, tested with 1k random routes).

rkflx commented 12 months ago

@quaelnix

less than 0.1%

Thanks for testing. I expected nothing else after establishing the overall impact of voice hint processing in #531. If you want to see some actual performance regressions I'd recommend comparing 1.6.0 against 1.7.2.

@afischerdev

space requirements

It would be interesting to know the number of concurrent users served by brouter.de (min/max/med), the average route length and the amount of free memory on the server. Without those details it's hard to define acceptance criteria for this PR and PRs in general.

Also, could you please detail why you suspect this PR will cause regressions in resource usage. Are there any old PRs to look at how performance testing was handled in the past?

afischerdev commented 12 months ago

Here are my test results. I followed the test idea of 50km Darmstadt - Heidelberg, profile trekking.brf. But I know from other tests there will be a cache issue when running a loop. The first run gets the job done, everyone else benefits. So I put the timode=2 to start, followed by timode=0 And I add a time collector on voicehint generation

The first two block are the enum variante (timode=2, timode=0), then current lib. First columne is runtime for voicehint generation, second is total time

enum_plain

I also did a cold cache test (test after restart, no cache)

Enum 5.4366 - 1816
lib  3.9613 - 1703

But I think this can be ignored, it's not a normal situation.

The size gap between the classes is 3729 bytes. That means in a 1:1 relation:

trekking 164 wpt x 3729 =   611.556
car-fast  13 wpt x 3729 =    48.477
hiking-m 357 wpt x 3729 = 1.331.253

I guess java will find an optimization on this, but I didn't test it.

quaelnix commented 12 months ago

@afischerdev, runtime of https://github.com/abrensch/brouter/blob/0b6608eddbeaad443e54825a27af769cd510557a/brouter-core/src/main/java/btools/router/RoutingEngine.java#L589 measured with System.nanoTime(): processVoiceHints We should stop spending time on this. There is literally no measureable difference.

rkflx commented 12 months ago

@afischerdev Thank you for your effort to test this.

Enum 5.4366 - 1816
lib  3.9613 - 1703

Assuming those are statistically significant results, that effect of the PR looks quite concerning indeed.

The size gap between the classes is 3729 bytes.

How has this been measured? I'd still be interested in this, independent from this PR. That's a lot of memory.

That means in a 1:1 relation but I didn't test it

I thought the whole point of measuring memory consumption was to check how enums scale?

I would like to test against speed and space requirements.

Did I miss where you detailed what those requirements actually are?

@quaelnix

There is literally no measurable difference.

This might be true, and there is plenty of StackOverflow discussion too on ints vs. enums as well as on how to perform statistically sound memory usage and response time measurements, including how to properly report results. But has it ever stopped people from arguing anyway?

From past experience, the moment a PR is blocked on the grounds of int vs. enum performance, any such PR can be considered derailed, with only one sensible way out:

We should stop spending time on this.

Agreed. Let's close this for good.

I'm all for making sure a PR does not breaks things, but the amount of back and forth which ensues each time, even for small changes, is a bit excessive. It's not just me, as this is frequently occurring in PRs of other contributors too.

Therefore I'd like to propose the following procedure

Probably not gonna happen, not worth the trouble.

quaelnix commented 12 months ago

@afischerdev, it was a mistake that we did not merge this pull request without further ado. The improved readability and maintainability would have easily outweighed the possible performance loss of at most a few microseconds per route request.

afischerdev commented 12 months ago

@quaelnix

it was a mistake

May be, but I didn't close it. But I'm with @rkflx there is no sense to discuss again what others have already done. Until we have new facts on better measurement. And I agreed with you the timing results are not significant. So I don't mind on both ways.

@rkflx But I don't understand why you canceled the new idea of string based voice hints. This is possible in both variants.

It was just a file based size measurement. I got three classes: VoiceHint, VoiceHint$1, VoiceHint$Command. When you assume the extra classes are used only in a single way, you get

trekking 164 wpt =    67.136
car-fast  13 wpt =     8.397
hiking-m 357 wpt =   142.213

All of these are guessed values only. The silver bullet would be the size of the ArrayList. But who wants to spend more time doing that?