NoAvailableAlias / nano-signal-slot

Pure C++17 Signals and Slots
MIT License

Benchmark build environment and parameters #6

Closed fr00b0 closed 9 years ago

fr00b0 commented 9 years ago

Hello, I'm trying to run your performance benchmarks locally, but I can't find any information about which compiler and which optimization settings were used to build the benchmark binary. Furthermore, no information is available about what parameters were used when running the benchmarks. The benchmark program asks for a time limit in milliseconds, and it would be nice to know what limit was used to produce the result table in the README.

I'm looking at the benchmarks in the FT branch, and I have been unable to reproduce the low scores for the nod library compared to the other libraries. I got curious as to why the library scored so poorly, but I can't reproduce the result without knowing the benchmarking parameters.

NoAvailableAlias commented 9 years ago

Hi, the only input to the benchmark is the maximum accumulated sample time in milliseconds. This parameter only influences the accuracy of the benchmarks, as increasing it causes more samples to be collected. The benchmark results were compiled with Visual Studio 2013 using release build settings and Boost version 1.55.
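The time-limited sampling described above can be sketched roughly as follows. This is an illustrative loop, not the benchmark's actual code; the function name and the stand-in workload are hypothetical:

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical sketch: keep taking timed samples until the accumulated
// elapsed time reaches the caller-supplied limit. A larger limit means
// more samples and therefore a more accurate score.
std::size_t collect_samples(std::chrono::milliseconds limit)
{
    using clock = std::chrono::steady_clock;
    clock::duration elapsed{0};
    std::size_t samples = 0;
    while (elapsed < limit)
    {
        auto start = clock::now();
        volatile long sink = 0;              // stand-in for one benchmark iteration
        for (long i = 0; i < 100000; ++i)
            sink = sink + i;
        elapsed += clock::now() - start;
        ++samples;
    }
    return samples;
}
```

With this structure the limit bounds total run time rather than per-sample time, which matches the description of the parameter above.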

Whenever a new lib is added to the benchmarks I usually run only the newly created benchmarks with a sample size of 400 milliseconds to get a quick result. I then follow this up with a max resolution run (4294 milliseconds for 32 bit) of all the libs, which usually takes ~4 hours. Unfortunately I have not run the max resolution benchmark yet, so I don't have the most accurate results for nod. The full benchmark will be rerun tonight and I will update the FT results once it completes tomorrow.

Also I will pull down your most recent changes as well.

Edit: pulled down changes from nod master and have begun the full benchmark, should be able to update the results tomorrow morning.

NoAvailableAlias commented 9 years ago

Interesting. Not only did the score increase, but nod now sits at the top. The question now becomes why it is so fast, and whether there is a problem with the benchmarks. Unless you changed nod to default to the single-threaded policy, I'm going to be digging into this discrepancy.

+ -------------------------------------------------------------------------------- +
| Library              |  construct |  destruct |  connect |  emission |  combined |
+ -------------------------------------------------------------------------------- +
| * fr00b0 nod         |  820444    |  14872    |  29765   |  174400   |  6073     |
| EvilTwin Observer    |  916892    |  8027     |  4107    |  61124    |  3328     |
| jeffomatic jl_signal |  312303    |  19327    |  345145  |  265978   |  35390    |
| EvilTwin Fork        |  660942    |  34146    |  23174   |  105509   |  7925     |
| pbhogan Signals      |  417216    |  54466    |  23072   |  257013   |  14717    |
| Yassi                |  489976    |  4039     |  1525    |  265313   |  1559     |
| joanrieu signal11    |  574679    |  27260    |  18624   |  44884    |  8131     |
| amc522 Signal11      |  547680    |  12258    |  12896   |  85169    |  4863     |
| mwthinker Signal     |  392750    |  28307    |  9313    |  217309   |  8416     |
| supergrover sigslot  |  15667     |  6382     |  9496    |  97951    |  3163     |
| * winglot Signals    |  23158     |  3032     |  7914    |  88756    |  5991     |
| Nano-signal-slot     |  44321     |  26996    |  15124   |  33202    |  4615     |
| * neosigslot         |  61134     |  12020    |  7721    |  14720    |  1453     |
| Boost Signals        |  46426     |  7351     |  1575    |  11831    |  1838     |
| * Boost Signals2     |  25534     |  8269     |  7501    |  21546    |  621      |
+ -------------------------------------------------------------------------------- +

(MSVC seriously hurts the construct time of signal implementations that use std::map or std::list, because the MSVC implementations of those data structures perform a dynamic allocation on construction.) Also, the full benchmark run took far longer than expected (almost 12 hours), so I will have to update the readme information after work.
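The construction-cost claim can be observed with a counting allocator. This is an illustrative sketch, not the benchmark's code: on libstdc++ and libc++ a default-constructed std::map performs no allocation, while MSVC's implementation has historically allocated a sentinel node up front, which is what penalizes the construct column:

```cpp
#include <cstddef>
#include <map>
#include <new>

static std::size_t g_allocs = 0;

// Minimal allocator that counts every allocation it performs.
template <class T>
struct counting_allocator {
    using value_type = T;
    counting_allocator() = default;
    template <class U> counting_allocator(const counting_allocator<U>&) {}
    T* allocate(std::size_t n) {
        ++g_allocs;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};
template <class T, class U>
bool operator==(const counting_allocator<T>&, const counting_allocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const counting_allocator<T>&, const counting_allocator<U>&) { return false; }

// Returns how many allocations default-constructing a std::map costs on
// this standard library: 0 where construction is lazy, 1 where the
// implementation allocates a sentinel node eagerly (as on MSVC).
std::size_t allocations_on_map_construction()
{
    g_allocs = 0;
    std::map<int, int, std::less<int>,
             counting_allocator<std::pair<const int, int>>> m;
    return g_allocs;
}
```

A signal class with a std::map member pays this cost on every signal construction, which would explain the large spread in the construct column.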

fr00b0 commented 9 years ago

How is the table sorted? I was under the impression that higher scores are better, and nod does not have the highest score in any of the benchmark columns. When I run the benchmark I end up with nod somewhere in the middle.

As long as your benchmark is using nod::signal<>, it should be the multithreaded policy that is used. nod::unsafe_signal<> is the equivalent type for the single-threaded policy.

NoAvailableAlias commented 9 years ago

The table is sorted by the sum of the columns, descending. I'm not sure how this is possible. The benchmark algorithms themselves were recently changed to be completely generic and shielded from any single signal implementation.
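The sort order described above can be sketched like this (the struct and function names are illustrative, not the benchmark's actual code):

```cpp
#include <algorithm>
#include <array>
#include <numeric>
#include <string>
#include <vector>

struct Row {
    std::string library;
    std::array<long, 5> scores;  // construct, destruct, connect, emission, combined
};

// Order rows so the library with the largest column sum comes first,
// matching the "sum of the columns, descending" rule.
void sort_by_total_descending(std::vector<Row>& rows)
{
    auto total = [](const Row& r) {
        return std::accumulate(r.scores.begin(), r.scores.end(), 0L);
    };
    std::sort(rows.begin(), rows.end(),
              [&](const Row& a, const Row& b) { return total(a) > total(b); });
}
```

This explains why a library can top the table without leading any single column: a very large construct score alone can dominate the sum.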

fr00b0 commented 9 years ago

It's highly unlikely that nod should be the fastest of these libraries. It does mutex locking with the multithreaded policy, and I know that some of the other libraries go to great lengths to provide good performance.
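The locking cost being referred to can be seen in a minimal sketch of a mutex-guarded signal. This is a simplified illustration of a multithreaded policy in general, not nod's actual implementation:

```cpp
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Simplified multithreaded-policy signal: every connect and every emission
// acquires the mutex, so each emit pays a lock/unlock on top of invoking
// the slots. A single-threaded (unsafe) policy would be the same class
// without the mutex, which is why it benchmarks faster.
template <class... Args>
class locked_signal {
public:
    void connect(std::function<void(Args...)> slot) {
        std::lock_guard<std::mutex> lock(mutex_);
        slots_.push_back(std::move(slot));
    }
    void emit(Args... args) {
        std::lock_guard<std::mutex> lock(mutex_);
        for (auto& slot : slots_)
            slot(args...);
    }
private:
    std::mutex mutex_;
    std::vector<std::function<void(Args...)>> slots_;
};
```

Given that per-emission overhead, a mutex-locking implementation outscoring lock-free ones would indeed be suspicious.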

NoAvailableAlias commented 9 years ago

Not only that but the results have deviated greatly since the last full benchmark results were posted. I'm going to rerun the full benchmark with an input of 4000 milliseconds to see if the problem is an overflow with the elapsed time before I start investigating further.

4294 should be the maximum time in milliseconds for 32-bit architectures, but it is possible that the elapsed time variables are overflowing right before satisfying the stop condition.
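The arithmetic behind that 4294 ms ceiling: 2^32 nanoseconds is about 4294.97 milliseconds, so if elapsed time is accumulated in a 32-bit nanosecond counter (an assumption about the benchmark's internals), a limit near that value lets the counter wrap just before the stop condition is met:

```cpp
#include <cstdint>

// Wrapping addition on a 32-bit nanosecond counter. With a limit close to
// 2^32 ns (~4294.97 ms), the accumulated value can overflow and wrap back
// to a small number, so an `elapsed < limit` check stays true and the
// sampling loop never terminates normally.
std::uint32_t accumulate_ns(std::uint32_t elapsed_ns, std::uint32_t sample_ns)
{
    return elapsed_ns + sample_ns;  // unsigned arithmetic wraps modulo 2^32
}
```

For example, adding one more 100 ms sample to a counter already near 4294 ms of accumulated nanoseconds wraps it to a value far below the limit.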

NoAvailableAlias commented 9 years ago

That is exactly what was occurring. I will make a note of that particular detail (don't even get close to the overflow limit). Here are the results for 4000 milliseconds, probably more like what you are seeing.

+ -------------------------------------------------------------------------------- +
| Library              |  construct |  destruct |  connect |  emission |  combined |
+ -------------------------------------------------------------------------------- +
| jeffomatic jl_signal |  123557    |  9645     |  44346   |  39051    |  6352     |
| Yassi                |  149941    |  2250     |  1610    |  37176    |  845      |
| amc522 Signal11      |  125020    |  5661     |  3730    |  32403    |  2092     |
| mwthinker Signal     |  109900    |  5181     |  1857    |  38771    |  1306     |
| pbhogan Signals      |  107762    |  5195     |  4859    |  31997    |  2275     |
| * fr00b0 nod         |  95490     |  4150     |  2642    |  30786    |  1493     |
| EvilTwin Fork        |  105054    |  3801     |  2112    |  18928    |  1252     |
| EvilTwin Observer    |  96725     |  2517     |  1210    |  19295    |  813      |
| joanrieu signal11    |  90288     |  6119     |  4336    |  7796     |  1744     |
| supergrover sigslot  |  11086     |  1395     |  2244    |  38451    |  759      |
| Nano-signal-slot     |  12449     |  4060     |  3761    |  29266    |  1655     |
| * winglot Signals    |  5780      |  2041     |  2379    |  31247    |  900      |
| * neosigslot         |  13778     |  2541     |  2343    |  6345     |  928      |
| Boost Signals        |  7844      |  1653     |  571     |  4483     |  354      |
| * Boost Signals2     |  6367      |  1821     |  865     |  3118     |  459      |
+ -------------------------------------------------------------------------------- +

fr00b0 commented 9 years ago

Yes, this table is very similar to what I have been experiencing when running your benchmarks.