thesadrogue / TheSadRogue.Primitives

A collection of primitive data structures for working with a 2-dimensional grid.
MIT License

Test a wider range of Point hashes. #100

Closed tommyettinger closed 1 year ago

tommyettinger commented 1 year ago

I left comments on each commit that might be helpful. The important thing is that the simplest hash here, BareMinimum, gets very close to KnownSize in terms of speed, and like KnownSize, it is not very random-seeming. Here are the speed benchmarks for PointDictionaryAdd:

|                            Method | Size |         Mean |      Error |     StdDev |       Median |
|---------------------------------- |----- |-------------:|-----------:|-----------:|-------------:|
|                 CurrentPrimitives |   10 |     2.473 us |  0.0290 us |  0.0257 us |     2.479 us |
|                   OriginalGoRogue |   10 |     3.172 us |  0.0610 us |  0.0571 us |     3.172 us |
|                         KnownSize |   10 |     2.735 us |  0.0208 us |  0.0173 us |     2.736 us |
|                        KnownRange |   10 |     2.811 us |  0.0560 us |  0.1464 us |     2.759 us |
|              RosenbergStrongBased |   10 |     3.144 us |  0.0327 us |  0.0273 us |     3.141 us |
| RosenbergStrongBasedMinusMultiply |   10 |     2.610 us |  0.0511 us |  0.0502 us |     2.599 us |
|               RosenbergStrongPure |   10 |     2.363 us |  0.0213 us |  0.0199 us |     2.359 us |
|                        CantorPure |   10 |     2.565 us |  0.0437 us |  0.0788 us |     2.576 us |
|                       BareMinimum |   10 |     2.338 us |  0.0465 us |  0.0571 us |     2.313 us |
|                       MultiplySum |   10 |     2.590 us |  0.0446 us |  0.0417 us |     2.602 us |
|                 CurrentPrimitives |   50 |    72.201 us |  1.4406 us |  1.4149 us |    72.200 us |
|                   OriginalGoRogue |   50 |    74.035 us |  1.4328 us |  2.1881 us |    73.053 us |
|                         KnownSize |   50 |    51.915 us |  1.0200 us |  1.8906 us |    50.856 us |
|                        KnownRange |   50 |    53.373 us |  1.0602 us |  1.5540 us |    54.220 us |
|              RosenbergStrongBased |   50 |    79.562 us |  1.5893 us |  1.8303 us |    78.729 us |
| RosenbergStrongBasedMinusMultiply |   50 |    77.804 us |  1.5400 us |  1.5125 us |    78.168 us |
|               RosenbergStrongPure |   50 |    61.285 us |  1.1992 us |  2.7553 us |    62.390 us |
|                        CantorPure |   50 |    65.949 us |  0.8372 us |  0.7422 us |    65.753 us |
|                       BareMinimum |   50 |    57.722 us |  0.7113 us |  0.5940 us |    57.565 us |
|                       MultiplySum |   50 |    64.055 us |  0.8987 us |  0.7504 us |    63.891 us |
|                 CurrentPrimitives |  100 |   379.739 us |  7.5152 us |  7.3809 us |   376.895 us |
|                   OriginalGoRogue |  100 |   385.661 us |  6.4697 us |  6.0518 us |   388.087 us |
|                         KnownSize |  100 |   233.832 us |  1.8599 us |  1.7398 us |   234.354 us |
|                        KnownRange |  100 |   238.968 us |  2.1470 us |  1.9033 us |   238.501 us |
|              RosenbergStrongBased |  100 |   365.246 us |  7.2428 us |  6.7749 us |   362.101 us |
| RosenbergStrongBasedMinusMultiply |  100 |   366.542 us |  7.2905 us | 16.0028 us |   370.190 us |
|               RosenbergStrongPure |  100 |   268.963 us |  5.3067 us |  6.7112 us |   270.808 us |
|                        CantorPure |  100 |   278.829 us |  5.5601 us |  5.9493 us |   276.739 us |
|                       BareMinimum |  100 |   249.940 us |  4.8344 us |  4.9645 us |   247.455 us |
|                       MultiplySum |  100 |   277.260 us |  5.4827 us |  5.8664 us |   274.367 us |
|                 CurrentPrimitives |  175 | 1,138.365 us |  6.2221 us | 11.5330 us | 1,136.530 us |
|                   OriginalGoRogue |  175 | 1,178.871 us |  8.9898 us |  8.4090 us | 1,178.890 us |
|                         KnownSize |  175 |   668.405 us |  5.8477 us |  5.1839 us |   667.583 us |
|                        KnownRange |  175 |   676.898 us |  8.2646 us |  7.7307 us |   678.762 us |
|              RosenbergStrongBased |  175 | 1,218.810 us | 10.7955 us | 10.0981 us | 1,215.942 us |
| RosenbergStrongBasedMinusMultiply |  175 | 1,179.085 us | 10.0739 us |  8.9303 us | 1,179.886 us |
|               RosenbergStrongPure |  175 |   760.599 us |  6.3940 us |  4.9920 us |   760.856 us |
|                        CantorPure |  175 |   834.900 us |  9.0003 us |  8.4189 us |   834.617 us |
|                       BareMinimum |  175 |   697.848 us | 13.1751 us | 12.3240 us |   696.240 us |
|                       MultiplySum |  175 |   785.984 us | 13.6117 us | 12.7324 us |   785.394 us |
|                 CurrentPrimitives |  256 | 2,934.451 us | 27.1371 us | 25.3841 us | 2,925.226 us |
|                   OriginalGoRogue |  256 | 3,019.435 us | 17.9357 us | 27.3897 us | 3,017.546 us |
|                         KnownSize |  256 | 1,660.725 us | 26.2639 us | 24.5672 us | 1,667.402 us |
|                        KnownRange |  256 | 1,681.399 us | 32.6702 us | 34.9567 us | 1,679.994 us |
|              RosenbergStrongBased |  256 | 3,069.717 us | 23.6231 us | 20.9413 us | 3,073.564 us |
| RosenbergStrongBasedMinusMultiply |  256 | 3,050.202 us | 33.6965 us | 31.5197 us | 3,051.733 us |
|               RosenbergStrongPure |  256 | 1,896.973 us | 24.1849 us | 26.8815 us | 1,895.791 us |
|                        CantorPure |  256 | 2,168.527 us | 42.9730 us | 44.1301 us | 2,166.936 us |
|                       BareMinimum |  256 | 1,759.347 us | 24.8700 us | 23.2635 us | 1,747.685 us |
|                       MultiplySum |  256 | 2,210.841 us | 25.1394 us | 20.9926 us | 2,215.512 us |

tommyettinger commented 1 year ago

So, some key things to consider: will users ever have negative x and/or y, and if so, how far below 0 can they go? Some very unusual grid shapes could be trouble for some of these hashes, even if x and y are always non-negative. Cantor forms diagonal stripes as it assigns higher and higher numbers to grid cells; Rosenberg-Strong forms square 'L' shapes. Grids that are very tall and thin, or very short and wide, might make those shapes require much higher values for even a small-ish number of cells. This might not be a problem, because .NET's Dictionary uses a prime modulus rather than just a bitmask or shift... I have a feeling the main requirement is just that the hash be fast, since Dictionary takes care of the rest pretty thoroughly.

BareMinimum with a left rotation by 8 is probably my suggestion for now, but rotating left by 16 should also be fine. There are bitwise rotation methods in some of the more recent .NET versions, and I don't know if you can use them, or if the JIT compiler can figure out that `(x << shift | x >> 32 - shift)` is a bitwise rotation operation (making it about as fast as an addition).
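To make the stripe and 'L' shapes concrete, here is a minimal sketch of the textbook Cantor and Rosenberg-Strong pairing functions for non-negative coordinates. These are the standard formulas, not necessarily the exact code behind the CantorPure and RosenbergStrongPure benchmark entries, and the class and method names are made up for illustration.

```csharp
using System;

// Hypothetical helper; names are illustrative, not from TheSadRogue.Primitives.
public static class PairingSketch
{
    // Cantor pairing: numbers cells along anti-diagonals (x + y constant),
    // which produces the diagonal stripes. Intended for x >= 0 and y >= 0.
    public static int Cantor(int x, int y) => (x + y) * (x + y + 1) / 2 + y;

    // Rosenberg-Strong pairing: numbers cells along square shells
    // (max(x, y) constant), which produces the 'L' shapes.
    // Intended for x >= 0 and y >= 0.
    public static int RosenbergStrong(int x, int y)
    {
        int m = Math.Max(x, y);
        return m * m + m + x - y;
    }
}
```

With these formulas, the 100 cells of a 10 by 10 grid get Rosenberg-Strong values 0 through 99, while the last cell of a 1 by 100 column, (0, 99), gets 9801 (and 5049 under Cantor). Inputs outside the intended range can also collide: `Cantor(-1, 0)` and `Cantor(0, 0)` are both 0, and so are `RosenbergStrong(-1, -1)` and `RosenbergStrong(0, 0)`.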
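For the rotation itself, here is a rough sketch of the two options mentioned: the hand-written shift-and-OR form and `System.Numerics.BitOperations.RotateLeft`, which exists in .NET Core 3.0 and later. The `XorRotateHash` method is only an assumed example of an "x combined with rotated y" hash, since the exact BareMinimum formula is not spelled out in this thread; all names here are hypothetical.

```csharp
using System;
using System.Numerics;

// Hypothetical helper; names are illustrative, not from TheSadRogue.Primitives.
public static class RotationSketch
{
    // Hand-written rotate-left, intended for shift in 1..31.
    // The operand is unsigned on purpose: on a signed int, C#'s >> is an
    // arithmetic shift and would copy the sign bit instead of rotating.
    public static uint RotateLeftManual(uint x, int shift) =>
        (x << shift) | (x >> (32 - shift));

    // Built-in equivalent from System.Numerics.
    public static uint RotateLeftBuiltIn(uint x, int shift) =>
        BitOperations.RotateLeft(x, shift);

    // ASSUMPTION: one plausible way to mix x with y rotated left by 8.
    // This is not confirmed to be the BareMinimum benchmark's formula.
    public static int XorRotateHash(int x, int y) =>
        unchecked(x ^ (int)RotateLeftManual((uint)y, 8));
}
```

Note that in C# the unparenthesized expression `x << shift | x >> 32 - shift` already parses as `(x << shift) | (x >> (32 - shift))`, because subtraction binds tighter than the shifts and the shifts bind tighter than `|`; the explicit parentheses above are only for readability.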

tommyettinger commented 1 year ago

OK, here are the latest benchmark results, with x and y equally likely to be positive or negative:

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19045
Intel Core i7-10750H CPU 2.60GHz, 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=7.0.102
  [Host]     : .NET Core 6.0.13 (CoreCLR 6.0.1322.58009, CoreFX 6.0.1322.58009), X64 RyuJIT
  DefaultJob : .NET Core 6.0.13 (CoreCLR 6.0.1322.58009, CoreFX 6.0.1322.58009), X64 RyuJIT

|                            Method | Size |         Mean |      Error |     StdDev |       Median |
|---------------------------------- |----- |-------------:|-----------:|-----------:|-------------:|
|                 CurrentPrimitives |   10 |     2.548 us |  0.0504 us |  0.1197 us |     2.494 us |
|                   OriginalGoRogue |   10 |     2.624 us |  0.0515 us |  0.1076 us |     2.658 us |
|                         KnownSize |   10 |     2.391 us |  0.0462 us |  0.0678 us |     2.374 us |
|                        KnownRange |   10 |     2.418 us |  0.0483 us |  0.0834 us |     2.424 us |
|              RosenbergStrongBased |   10 |     2.688 us |  0.0529 us |  0.0588 us |     2.668 us |
| RosenbergStrongBasedMinusMultiply |   10 |     2.677 us |  0.0480 us |  0.0449 us |     2.698 us |
|               RosenbergStrongPure |   10 |     2.376 us |  0.0446 us |  0.0458 us |     2.367 us |
|                        CantorPure |   10 |     2.495 us |  0.0492 us |  0.0547 us |     2.472 us |
|                       BareMinimum |   10 |     2.410 us |  0.0477 us |  0.0603 us |     2.448 us |
|                       MultiplySum |   10 |     2.457 us |  0.0482 us |  0.0737 us |     2.507 us |
|                 CurrentPrimitives |   50 |    69.881 us |  1.3711 us |  1.9220 us |    70.744 us |
|                   OriginalGoRogue |   50 |    73.764 us |  1.4452 us |  1.6063 us |    74.315 us |
|                         KnownSize |   50 |    70.104 us |  1.3648 us |  2.0006 us |    69.372 us |
|                        KnownRange |   50 |    67.817 us |  0.4064 us |  0.3173 us |    67.873 us |
|              RosenbergStrongBased |   50 |    80.669 us |  1.0561 us |  0.8245 us |    80.393 us |
| RosenbergStrongBasedMinusMultiply |   50 |    80.356 us |  1.5866 us |  2.5621 us |    79.330 us |
|               RosenbergStrongPure |   50 |    75.738 us |  1.4949 us |  2.6957 us |    76.809 us |
|                        CantorPure |   50 |    71.955 us |  0.8454 us |  0.8303 us |    72.036 us |
|                       BareMinimum |   50 |    69.732 us |  1.3894 us |  2.2436 us |    69.691 us |
|                       MultiplySum |   50 |    68.396 us |  0.1718 us |  0.1607 us |    68.348 us |
|                 CurrentPrimitives |  100 |   335.781 us |  5.8695 us |  6.2803 us |   335.135 us |
|                   OriginalGoRogue |  100 |   350.627 us |  6.9640 us |  8.0197 us |   353.581 us |
|                         KnownSize |  100 |   335.773 us |  6.6714 us |  7.6828 us |   333.326 us |
|                        KnownRange |  100 |   342.423 us |  6.7300 us |  9.2121 us |   348.710 us |
|              RosenbergStrongBased |  100 |   391.549 us |  7.8304 us | 10.7184 us |   385.444 us |
| RosenbergStrongBasedMinusMultiply |  100 |   392.421 us |  7.7475 us | 10.0739 us |   395.919 us |
|               RosenbergStrongPure |  100 |   363.196 us |  7.2237 us | 10.8121 us |   357.511 us |
|                        CantorPure |  100 |   353.982 us |  6.9095 us | 10.7572 us |   357.089 us |
|                       BareMinimum |  100 |   327.382 us |  5.5715 us |  6.1927 us |   326.908 us |
|                       MultiplySum |  100 |   329.329 us |  6.5413 us | 11.1077 us |   330.779 us |
|                 CurrentPrimitives |  175 | 1,095.239 us | 19.2533 us | 18.0096 us | 1,090.769 us |
|                   OriginalGoRogue |  175 | 1,153.031 us | 22.7875 us | 25.3283 us | 1,152.057 us |
|                         KnownSize |  175 | 1,156.251 us | 15.1020 us | 14.1264 us | 1,157.790 us |
|                        KnownRange |  175 | 1,129.606 us | 11.8781 us | 11.1108 us | 1,131.388 us |
|              RosenbergStrongBased |  175 | 1,451.467 us |  8.9499 us |  8.3717 us | 1,452.237 us |
| RosenbergStrongBasedMinusMultiply |  175 | 1,410.826 us | 12.0894 us | 10.7170 us | 1,413.116 us |
|               RosenbergStrongPure |  175 | 1,285.783 us |  7.9979 us |  7.4813 us | 1,288.821 us |
|                        CantorPure |  175 | 1,251.229 us |  6.6537 us |  5.5561 us | 1,250.956 us |
|                       BareMinimum |  175 | 1,088.421 us | 13.0138 us | 12.1731 us | 1,089.591 us |
|                       MultiplySum |  175 | 1,093.047 us | 11.9116 us | 11.1421 us | 1,091.695 us |
|                 CurrentPrimitives |  256 | 2,924.071 us | 33.0242 us | 30.8908 us | 2,914.341 us |
|                   OriginalGoRogue |  256 | 2,970.857 us | 21.7161 us | 20.3133 us | 2,968.653 us |
|                         KnownSize |  256 | 2,923.818 us | 18.7399 us | 17.5293 us | 2,926.333 us |
|                        KnownRange |  256 | 2,935.779 us | 42.8776 us | 38.0098 us | 2,922.961 us |
|              RosenbergStrongBased |  256 | 3,978.849 us | 27.0928 us | 25.3426 us | 3,979.130 us |
| RosenbergStrongBasedMinusMultiply |  256 | 3,966.713 us | 43.3875 us | 38.4619 us | 3,955.923 us |
|               RosenbergStrongPure |  256 | 3,533.416 us | 24.9279 us | 23.3176 us | 3,525.887 us |
|                        CantorPure |  256 | 3,488.997 us | 27.4341 us | 25.6619 us | 3,482.179 us |
|                       BareMinimum |  256 | 2,821.189 us | 29.4691 us | 27.5655 us | 2,823.807 us |
|                       MultiplySum |  256 | 2,839.916 us | 34.9376 us | 30.9712 us | 2,830.097 us |

tommyettinger commented 1 year ago

The main thing to note here: negative inputs are more likely to collide with positive ones when the hash doesn't handle negative x or y very well, so the Rosenberg-Strong and Cantor-based hashes start to take much more time with larger Dictionary sizes like 256 (20% to 35% more than most of the others, I think). BareMinimum still does surprisingly well, as does MultiplySum. MultiplySum is also amenable to random hashing if you want to try that; you just choose two large odd numbers pseudo-randomly when some condition is detected that suggests the hash may be facing trouble. Pseudo-random hashing is what jdkgdxds uses for its Set and Map classes, as a way of mitigating worst-case performance issues. If you find a case that does call for random hashing, my suggestion is to have a medium-large array of tested, known-good multipliers that the algorithm selects from without modification, with probably a different (non-overlapping) array for x and for y.
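As a rough illustration of that last suggestion, the sketch below assumes MultiplySum means "multiply x and y by separate odd constants and add", and swaps in a new pair of multipliers from two non-overlapping tables on request. Everything here is hypothetical: the class name, the trigger for switching, and especially the constants, which are arbitrary odd placeholders rather than tested, known-good multipliers. It also steps through the tables in order instead of picking pseudo-randomly, just to stay deterministic.

```csharp
using System;

// Hypothetical sketch; names and constants are illustrative only.
public sealed class MultiplySumSketch
{
    // PLACEHOLDERS: arbitrary large odd values, NOT vetted multipliers.
    // The two tables deliberately share no entries.
    private static readonly uint[] XMultipliers = { 0x9E3779B9u, 0x85EBCA6Bu, 0xC2B2AE35u };
    private static readonly uint[] YMultipliers = { 0x27D4EB2Fu, 0x165667B1u, 0x7FEB352Du };

    private int _index;
    private uint _mx = XMultipliers[0];
    private uint _my = YMultipliers[0];

    // Multiply each coordinate by its own odd constant and add.
    // unchecked makes overflow wrap around instead of throwing.
    public int Hash(int x, int y) =>
        unchecked((int)((uint)x * _mx + (uint)y * _my));

    // Meant to be called when the container decides the current hash is
    // struggling; how that is detected is left open in this sketch.
    // Entries hashed with the old pair would need to be rehashed.
    public void NextMultipliers()
    {
        _index = (_index + 1) % XMultipliers.Length;
        _mx = XMultipliers[_index];
        _my = YMultipliers[_index];
    }
}
```

Because the multipliers are taken from the tables unchanged, the set of hash functions that can ever be in use is small and can be tested ahead of time, which is the point of the suggestion above.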