Restore time measurement on request end

Closes https://github.com/elastic/rally/issues/1860.

Comparison before/after change for Universal Profiling (same ES commit ID, same environment) does not show significant differences:

``` | Metric | Task | Baseline | Contender | Diff | Unit | Diff % | |--------------------------------------------------------------:|-------------------:|----------------:|----------------:|-----------:|-------:|---------:| | Cumulative indexing time of primary shards | | 341.444 | 346.032 | 4.5878 | min | +1.34% | | Min cumulative indexing time across primary shard | | 0 | 0 | 0 | min | 0.00% | | Median cumulative indexing time across primary shard | | 0.0121667 | 0.0117417 | -0.00042 | min | -3.49% | | Max cumulative indexing time across primary shard | | 60.0589 | 60.621 | 0.56207 | min | +0.94% | | Cumulative indexing throttle time of primary shards | | 0 | 0 | 0 | min | 0.00% | | Min cumulative indexing throttle time across primary shard | | 0 | 0 | 0 | min | 0.00% | | Median cumulative indexing throttle time across primary shard | | 0 | 0 | 0 | min | 0.00% | | Max cumulative indexing throttle time across primary shard | | 0 | 0 | 0 | min | 0.00% | | Cumulative merge time of primary shards | | 143.833 | 145.935 | 2.10198 | min | +1.46% | | Cumulative merge count of primary shards | | 4083 | 4025 | -58 | | -1.42% | | Min cumulative merge time across primary shard | | 0 | 0 | 0 | min | 0.00% | | Median cumulative merge time across primary shard | | 0 | 0 | 0 | min | 0.00% | | Max cumulative merge time across primary shard | | 19.9972 | 20.3133 | 0.31615 | min | +1.58% | | Cumulative merge throttle time of primary shards | | 64.1877 | 64.8679 | 0.6802 | min | +1.06% | | Min cumulative merge throttle time across primary shard | | 0 | 0 | 0 | min | 0.00% | | Median cumulative merge throttle time across primary shard | | 0 | 0 | 0 | min | 0.00% | | Max cumulative merge throttle time across primary shard | | 8.96373 | 9.1456 | 0.18187 | min | +2.03% | | Cumulative refresh time of primary shards | | 82.3129 | 83.0968 | 0.78387 | min | +0.95% | | Cumulative refresh count of primary shards | | 51874 | 51250 | -624 | | -1.20% | | Min cumulative refresh time across primary shard | | 0 | 0 | 0 | min | 0.00% | | Median cumulative refresh time across primary shard | | 0.00319167 | 0.00278333 | -0.00041 | min | -12.79% | | Max cumulative refresh time across primary shard | | 13.8886 | 14.0101 | 0.12143 | min | +0.87% | | Cumulative flush time of primary shards | | 18.5045 | 18.7795 | 0.2749 | min | +1.49% | | Cumulative flush count of primary shards | | 7897 | 7809 | -88 | | -1.11% | | Min cumulative flush time across primary shard | | 6.66667e-05 | 5e-05 | -2e-05 | min | -25.00% | | Median cumulative flush time across primary shard | | 0.000225 | 0.0003 | 7e-05 | min | +33.33% | | Max cumulative flush time across primary shard | | 3.05088 | 2.98485 | -0.06603 | min | -2.16% | | Total Young Gen GC time | | 14.792 | 15.021 | 0.229 | s | +1.55% | | Total Young Gen GC count | | 1282 | 1275 | -7 | | -0.55% | | Total Old Gen GC time | | 0 | 0 | 0 | s | 0.00% | | Total Old Gen GC count | | 0 | 0 | 0 | | 0.00% | | Dataset size | | 35.3034 | 34.6148 | -0.68865 | GB | -1.95% | | Store size | | 35.3034 | 34.6148 | -0.68865 | GB | -1.95% | | Translog size | | 6.04428e-06 | 6.04428e-06 | 0 | GB | 0.00% | | Heap used for segments | | 0 | 0 | 0 | MB | 0.00% | | Heap used for doc values | | 0 | 0 | 0 | MB | 0.00% | | Heap used for terms | | 0 | 0 | 0 | MB | 0.00% | | Heap used for norms | | 0 | 0 | 0 | MB | 0.00% | | Heap used for points | | 0 | 0 | 0 | MB | 0.00% | | Heap used for stored fields | | 0 | 0 | 0 | MB | 0.00% | | Segment count | | 1114 | 1047 | -67 | | -6.01% | | Total Ingest Pipeline count | | 0 | 0 | 0 | | 0.00% | | Total Ingest Pipeline time | | 0 | 0 | 0 | ms | 0.00% | | Total Ingest Pipeline failed | | 0 | 0 | 0 | | 0.00% | | Min Throughput | bulk | 71478.7 | 71608.6 | 129.93 | docs/s | +0.18% | | Mean Throughput | bulk | 74485.7 | 75829.4 | 1343.66 | docs/s | +1.80% | | Median Throughput | bulk | 74828.2 | 76156.8 | 1328.63 | docs/s | +1.78% | | Max Throughput | bulk | 75618.9 | 76591.4 | 972.453 | docs/s | +1.29% | | 50th percentile latency | bulk | 133.401 | 131.642 | -1.75928 | ms | -1.32% | | 90th percentile latency | bulk | 149.524 | 147.412 | -2.11223 | ms | -1.41% | | 99th percentile latency | bulk | 179.004 | 179.05 | 0.04653 | ms | +0.03% | | 99.9th percentile latency | bulk | 273.525 | 241.905 | -31.6196 | ms | -11.56% | | 99.99th percentile latency | bulk | 1805.74 | 1887.77 | 82.0235 | ms | +4.54% | | 100th percentile latency | bulk | 2035.3 | 2064.19 | 28.8873 | ms | +1.42% | | 50th percentile service time | bulk | 133.322 | 131.612 | -1.70988 | ms | -1.28% | | 90th percentile service time | bulk | 149.517 | 147.388 | -2.12968 | ms | -1.42% | | 99th percentile service time | bulk | 179.161 | 179.092 | -0.06962 | ms | -0.04% | | 99.9th percentile service time | bulk | 273.621 | 242.772 | -30.8494 | ms | -11.27% | | 99.99th percentile service time | bulk | 1805.74 | 1708.95 | -96.7928 | ms | -5.36% | | 100th percentile service time | bulk | 2035.3 | 2064.19 | 28.8873 | ms | +1.42% | | 50th percentile processing time | bulk | 139.311 | 137.757 | -1.55339 | ms | -1.12% | | 90th percentile processing time | bulk | 156.409 | 153.933 | -2.47576 | ms | -1.58% | | 99th percentile processing time | bulk | 185.832 | 185.482 | -0.35039 | ms | -0.19% | | 99.9th percentile processing time | bulk | 279.467 | 247.158 | -32.309 | ms | -11.56% | | 99.99th percentile processing time | bulk | 1791.71 | 1822.13 | 30.4184 | ms | +1.70% | | 100th percentile processing time | bulk | 2039.99 | 2069.31 | 29.325 | ms | +1.44% | | error rate | bulk | 0 | 0 | 0 | % | 0.00% | | Min Throughput | get-stacktraces | 0.400355 | 0.400335 | -2e-05 | ops/s | -0.00% | | Mean Throughput | get-stacktraces | 0.400417 | 0.400395 | -2e-05 | ops/s | -0.01% | | Median Throughput | get-stacktraces | 0.400414 | 0.400391 | -2e-05 | ops/s | -0.01% | | Max Throughput | get-stacktraces | 0.400495 | 0.400468 | -3e-05 | ops/s | -0.01% | | 50th percentile latency | get-stacktraces | 812.696 | 827.718 | 15.022 | ms | +1.85% | | 90th percentile latency | get-stacktraces | 820.833 | 837.839 | 17.0059 | ms | +2.07% | | 99th percentile latency | get-stacktraces | 851.633 | 849.319 | -2.31428 | ms | -0.27% | | 100th percentile latency | get-stacktraces | 884.832 | 853.278 | -31.5544 | ms | -3.57% | | 50th percentile service time | get-stacktraces | 809.951 | 825.316 | 15.3652 | ms | +1.90% | | 90th percentile service time | get-stacktraces | 818.56 | 835.565 | 17.0056 | ms | +2.08% | | 99th percentile service time | get-stacktraces | 849.907 | 846.485 | -3.42253 | ms | -0.40% | | 100th percentile service time | get-stacktraces | 882.751 | 850.519 | -32.2313 | ms | -3.65% | | 50th percentile processing time | get-stacktraces | 810.889 | 826.343 | 15.4547 | ms | +1.91% | | 90th percentile processing time | get-stacktraces | 819.491 | 836.607 | 17.1157 | ms | +2.09% | | 99th percentile processing time | get-stacktraces | 850.804 | 847.678 | -3.12627 | ms | -0.37% | | 100th percentile processing time | get-stacktraces | 883.665 | 851.555 | -32.1107 | ms | -3.63% | | error rate | get-stacktraces | 0 | 0 | 0 | % | 0.00% | | Min Throughput | get-flamegraph | 0.400589 | 0.400587 | -0 | ops/s | -0.00% | | Mean Throughput | get-flamegraph | 0.400693 | 0.400691 | -0 | ops/s | -0.00% | | Median Throughput | get-flamegraph | 0.400686 | 0.400684 | -0 | ops/s | -0.00% | | Max Throughput | get-flamegraph | 0.400822 | 0.40082 | -0 | ops/s | -0.00% | | 50th percentile latency | get-flamegraph | 934.587 | 950.166 | 15.5781 | ms | +1.67% | | 90th percentile latency | get-flamegraph | 950.912 | 966.365 | 15.4525 | ms | +1.63% | | 99th percentile latency | get-flamegraph | 968.494 | 984.465 | 15.9706 | ms | +1.65% | | 100th percentile latency | get-flamegraph | 968.654 | 990.845 | 22.1913 | ms | +2.29% | | 50th percentile service time | get-flamegraph | 931.756 | 947.396 | 15.6401 | ms | +1.68% | | 90th percentile service time | get-flamegraph | 949.439 | 963.578 | 14.1392 | ms | +1.49% | | 99th percentile service time | get-flamegraph | 966.619 | 982.712 | 16.0939 | ms | +1.66% | | 100th percentile service time | get-flamegraph | 966.749 | 989.329 | 22.5803 | ms | +2.34% | | 50th percentile processing time | get-flamegraph | 932.667 | 948.288 | 15.6215 | ms | +1.67% | | 90th percentile processing time | get-flamegraph | 950.298 | 964.599 | 14.3012 | ms | +1.50% | | 99th percentile processing time | get-flamegraph | 967.456 | 983.625 | 16.169 | ms | +1.67% | | 100th percentile processing time | get-flamegraph | 967.593 | 990.193 | 22.5992 | ms | +2.34% | | error rate | get-flamegraph | 0 | 0 | 0 | % | 0.00% | | Min Throughput | get-topn-functions | 0.400788 | 0.400767 | -2e-05 | ops/s | -0.01% | | Mean Throughput | get-topn-functions | 0.400928 | 0.400902 | -3e-05 | ops/s | -0.01% | | Median Throughput | get-topn-functions | 0.400919 | 0.400894 | -3e-05 | ops/s | -0.01% | | Max Throughput | get-topn-functions | 0.401102 | 0.401071 | -3e-05 | ops/s | -0.01% | | 50th percentile latency | get-topn-functions | 710.66 | 721.021 | 10.3611 | ms | +1.46% | | 90th percentile latency | get-topn-functions | 723.538 | 732.646 | 9.10738 | ms | +1.26% | | 99th percentile latency | get-topn-functions | 736.206 | 753.293 | 17.0872 | ms | +2.32% | | 100th percentile latency | get-topn-functions | 738.872 | 759.158 | 20.2858 | ms | +2.75% | | 50th percentile service time | get-topn-functions | 707.629 | 718.16 | 10.5315 | ms | +1.49% | | 90th percentile service time | get-topn-functions | 721.65 | 730.139 | 8.48897 | ms | +1.18% | | 99th percentile service time | get-topn-functions | 734.073 | 750.279 | 16.2055 | ms | +2.21% | | 100th percentile service time | get-topn-functions | 736.643 | 756.105 | 19.4623 | ms | +2.64% | | 50th percentile processing time | get-topn-functions | 708.433 | 719.031 | 10.5983 | ms | +1.50% | | 90th percentile processing time | get-topn-functions | 722.488 | 730.99 | 8.50196 | ms | +1.18% | | 99th percentile processing time | get-topn-functions | 735.059 | 751.081 | 16.0219 | ms | +2.18% | | 100th percentile processing time | get-topn-functions | 737.465 | 757.134 | 19.6689 | ms | +2.67% | | error rate | get-topn-functions | 0 | 0 | 0 | % | 0.00% | ```

elastic / rally

Restore time measurement on request end #1861