RobThree / HumanoID

Friendly ID generator. Converts integers to words and back.
MIT License

Benchmarks & How to properly bind a Singleton in the Laravel service container #10

Open caendesilva opened 2 years ago

caendesilva commented 2 years ago

Benchmarks using Laravel 9 and PHP 8 (16 cores @ 3.60GHz)

About:

Sorry for any bad formatting. This was just a quick test I did and thought I would share :)

Test 1: Initializing generator outside the loop

$i = 1;
$zooIdGen = RobThree\HumanoID\HumanoIDs::zooIdGenerator();
while ($i <= 10000) {
  $zooId = $zooIdGen->create($i);
  echo $zooId . '<br>';
  $i++;
}

Result: 10,000 loops took 1.3485219478607 seconds. Average: 0.13485219 milliseconds per iteration. Peak CPU utilization: ~12%

Test 2: Initializing the generator in each loop

$i = 1;
while ($i <= 1000) {
  $zooIdGen = RobThree\HumanoID\HumanoIDs::zooIdGenerator();
  $zooId = $zooIdGen->create($i);
  echo $zooId . '<br>';
  $i++;
}

Result: Crashed after 615 iterations. Peak CPU utilization: ~27%

It eventually finished with the following results. Result: 1000 loops took 22.121620178223 seconds. Average: 22.12162018 milliseconds per iteration

Test 3: Initializing the generator as a singleton in the service container

// In AppServiceProvider
// Note: this binds the wrapper class itself, not the generator instance
$this->app->singleton('HumanoIDs', function ($app) {
  return new HumanoIDs($app->make(HumanoIDs::class));
});

// Benchmark
$i = 1;
while ($i <= 10000) {
  $zooId = app('HumanoIDs')->generator->create($i);
  echo $zooId . '<br>';
  $i++;
}

Result: 10,000 loops took 22.320996999741 seconds. Average: 2.2320997 milliseconds per iteration. Peak CPU utilization: ~21%

This actually surprised me, as I thought the results of tests 1 and 3 would be much closer. This makes me think that I need to bind the actual generator to the service container instead.

Test 4: Initializing the generator in the service container by binding the generator method

// In AppServiceProvider
$this->app->singleton('HumanoIDGenerator', function () {
  return new HumanoIDGenerator();
});

// Intermediary class:
class HumanoIDGenerator
{
  public \RobThree\HumanoID\HumanoID $generator;

  public function __construct()
  {
    $this->generator = \RobThree\HumanoID\HumanoIDs::zooIdGenerator();
  }
}

// Benchmark
$i = 1;
while ($i <= 10000) {
  $zooId = app('HumanoIDGenerator')->generator->create($i);
  echo $zooId . '<br>';
  $i++;
}

Result: 10,000 loops took 1.5630679130554 seconds. Average: 0.15630679 milliseconds per iteration. Peak CPU utilization: ~12% in a short spike

This is only a tiny bit slower than the first test, which is what I initially expected.

Conclusion

When using Laravel, make sure to bind the actual generator to the service container and not just the main class.
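For completeness, here's a minimal sketch of such a binding without any intermediary class (the class names are from this package; the binding style itself is illustrative):

// In AppServiceProvider
use RobThree\HumanoID\HumanoID;
use RobThree\HumanoID\HumanoIDs;

$this->app->singleton(HumanoID::class, function () {
    // Construct the generator once; the container hands back the same instance afterwards.
    return HumanoIDs::zooIdGenerator();
});

// Usage anywhere in the app:
$zooId = app(HumanoID::class)->create($id);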

MueR commented 2 years ago

Without in-depth knowledge of Laravel (I'm a Symfony developer), test cases 2 and 3 are, excuse my French, stupid. You are creating new instances of a singleton in a loop and expecting it to perform the same as a singleton where you simply request a value. Your test cases 1 and 4 prove that the code itself performs well within acceptable ranges.

Wontfix imo.

caendesilva commented 2 years ago

This wasn't really a bug/issue. I just wanted to share my benchmarks and a snippet showing Laravel developers how to properly bind into the service container :)

Edit: I see now that there is a discussion page, which I should probably have posted this in instead.

RobThree commented 2 years ago

Benchmarks are appreciated. Case 2 doesn't make sense though; you should instantiate a generator only once and keep it around as long as possible as mentioned (twice 😉) in the documentation.

I do think, though, that the generators in the HumanoIDs class could be implemented in such a way that the same instance is always returned (singleton). I'll fix that.

Not sure what's going on in case 3; I'm not familiar with Laravel.
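On the singleton idea: something along these lines, perhaps (a rough sketch; buildZooGenerator() is a hypothetical stand-in for however the instance is actually constructed):

class HumanoIDs
{
    private static ?HumanoID $zooGenerator = null;

    public static function zooIdGenerator(): HumanoID
    {
        // ??= assigns only while the property is still null, so the expensive
        // construction happens at most once per process.
        return self::$zooGenerator ??= self::buildZooGenerator();
    }
}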

caendesilva commented 2 years ago

Yeah, there is absolutely no use case where one would use case 2. I just wanted to benchmark the actual instantiation time, as the documentation mentioned it was an expensive operation. In fact, it was because it was specified so clearly in the documentation that I got curious and decided to benchmark just how expensive it was.

Basically, what I'm doing in the Laravel case is binding the HumanoIDs class to the Laravel app instance as a way to keep the HumanoIDs instance alive for as long as possible.

RobThree commented 2 years ago

@caendesilva Does this make sense to you? Suggestions or improvements (besides my two fixes)?

caendesilva commented 2 years ago

I've only ever worked with the singleton pattern in the Laravel context, where it's abstracted away, so I don't have any practical knowledge of actually implementing one and could be wrong; but based on the theory of singletons, it looks like it's implemented the right way.

mallardduck is more experienced than I am and may be a better authority on this, though.

caendesilva commented 2 years ago

@RobThree I think that you could use isset() instead of checking for and allowing null. Might make it more readable. Not sure if there is a practical difference, and I don't think it matters much anyway but thought I would mention it.

RobThree commented 2 years ago

@RobThree I think that you could use isset() instead of checking for and allowing null. Might make it more readable. Not sure if there is a practical difference, and I don't think it matters much anyway but thought I would mention it.

I'll have to look at @mallardduck and/or @MueR on this matter.

About your benchmark: aren't your echoes in the loop also counted? I can imagine this makes a difference in the final results. Not sure if it would be of any significance, but hey...

caendesilva commented 2 years ago

About your benchmark: aren't your echoes in the loop also counted? I can imagine this makes a difference in the final results. Not sure if it would be of any significance, but hey...

They are probably being counted. I always add echoes to benchmarks to make sure that the generated variable is actually used; I've found that if you don't, the results can be rather unreliable.

This test was also mainly focused on comparing results for different ways to initialize the main class, as I wanted to see the impact of the CPU costs mentioned in the readme. And since all loops have the same echo statement, any additional processing time is the same across all the benchmark tests, so for the purpose of comparing relative speeds between implementation methods it does what I wanted it to.

The test is also pretty crude, and running it through a browser and Laravel has a much higher impact. Again, that's fine for getting the information that I specifically needed. A fairer test to measure the performance of the actual script would be to use a minimal install and run it on several different pieces of hardware, operating systems, and PHP versions. Could be fun to try, but I'm not sure it's needed since it is already so fast when you have an instantiated instance of the class. Data is always fun to have though.

Thank you for bringing it up though, as it gave me a chance to explain my thought process, which I always think is useful!

RobThree commented 2 years ago

I just added some benchmarks for the create() method using PHPBench and I think the results are rather interesting:

⚠️ Note: These benchmarks were run with xdebug enabled and opcache disabled. See here for many more benchmarks with/without xdebug/opcache.

$ ./vendor/bin/phpbench run tests/Benchmark --report=aggregate
PHPBench (1.2.5) running benchmarks... 
with configuration file: /Users/rob/Code/HumanoID/phpbench.json
with PHP version 8.1.5, xdebug ✔, opcache ❌

\RobThree\HumanoID\Tests\Benchmark\HumanoIDCreateBench

    benchCreate # 0,0.......................I4 - Mo101,182.284ops/s (±0.54%)
    benchCreate # 1,0.......................I4 - Mo101,164.318ops/s (±0.76%)
    benchCreate # 2,0.......................I4 - Mo100,581.915ops/s (±1.52%)
    benchCreate # 0,1.......................I4 - Mo100,995.104ops/s (±0.31%)
    benchCreate # 1,1.......................I4 - Mo86,917.928ops/s (±0.72%)
    benchCreate # 2,1.......................I4 - Mo101,122.998ops/s (±0.14%)
    benchCreate # 0,2.......................I4 - Mo58,584.259ops/s (±1.28%)
    benchCreate # 1,2.......................I4 - Mo37,540.056ops/s (±0.72%)
    benchCreate # 2,2.......................I4 - Mo37,549.949ops/s (±0.78%)
    benchCreate # 0,3.......................I4 - Mo36,238.651ops/s (±0.49%)
    benchCreate # 1,3.......................I4 - Mo18,434.524ops/s (±0.17%)
    benchCreate # 2,3.......................I4 - Mo18,876.591ops/s (±0.34%)

Subjects: 1, Assertions: 0, Failures: 0, Errors: 0
+---------------------+-------------+-----+-------+-----+----------+------------------+--------+
| benchmark           | subject     | set | revs  | its | mem_peak | mode             | rstdev |
+---------------------+-------------+-----+-------+-----+----------+------------------+--------+
| HumanoIDCreateBench | benchCreate | 0,0 | 10000 | 5   | 6.987mb  | 101,182.284ops/s | ±0.54% |
| HumanoIDCreateBench | benchCreate | 1,0 | 10000 | 5   | 4.761mb  | 101,164.318ops/s | ±0.76% |
| HumanoIDCreateBench | benchCreate | 2,0 | 10000 | 5   | 4.540mb  | 100,581.915ops/s | ±1.52% |
| HumanoIDCreateBench | benchCreate | 0,1 | 10000 | 5   | 6.987mb  | 100,995.104ops/s | ±0.31% |
| HumanoIDCreateBench | benchCreate | 1,1 | 10000 | 5   | 4.761mb  | 86,917.928ops/s  | ±0.72% |
| HumanoIDCreateBench | benchCreate | 2,1 | 10000 | 5   | 4.540mb  | 101,122.998ops/s | ±0.14% |
| HumanoIDCreateBench | benchCreate | 0,2 | 10000 | 5   | 6.987mb  | 58,584.259ops/s  | ±1.28% |
| HumanoIDCreateBench | benchCreate | 1,2 | 10000 | 5   | 4.761mb  | 37,540.056ops/s  | ±0.72% |
| HumanoIDCreateBench | benchCreate | 2,2 | 10000 | 5   | 4.540mb  | 37,549.949ops/s  | ±0.78% |
| HumanoIDCreateBench | benchCreate | 0,3 | 10000 | 5   | 6.987mb  | 36,238.651ops/s  | ±0.49% |
| HumanoIDCreateBench | benchCreate | 1,3 | 10000 | 5   | 4.761mb  | 18,434.524ops/s  | ±0.17% |
| HumanoIDCreateBench | benchCreate | 2,3 | 10000 | 5   | 4.540mb  | 18,876.591ops/s  | ±0.34% |
+---------------------+-------------+-----+-------+-----+----------+------------------+--------+

I ran these benchmarks on my Macbook Air (M1, 2020, 16GB); the benchmarks seem to peg only one core (and not even at 100%). I will run these later on my Windows i9 10900X and maybe even have a go on an old Xeon e3-1245v3 linux VM.

About these results

I'm (also) new to PHPBench so I haven't figured out (yet) how to make the above report a bit more readable, especially regarding the sets. What the above report shows, in essence, is 3 HumanoID generators being benchmarked over several magnitudes of int -> string conversions. The generators are the two out-of-the-box zoo and space generators plus a custom generator. Next are the ranges 0, 10, 1000 and 1000000 respectively. So the above results should read something like this:

+----------------+-------+-----+----------+---------------+
| set            | revs  | its | mem_peak | mode          |
+----------------+-------+-----+----------+---------------+
| zoo   ,0       | 10000 | 5   | 6.987mb  | 101,182 ops/s |
| space ,0       | 10000 | 5   | 4.761mb  | 101,164 ops/s |
| custom,0       | 10000 | 5   | 4.540mb  | 100,581 ops/s |
| zoo   ,10      | 10000 | 5   | 6.987mb  | 100,995 ops/s |
| space ,10      | 10000 | 5   | 4.761mb  |  86,917 ops/s |
| custom,10      | 10000 | 5   | 4.540mb  | 101,122 ops/s |
| zoo   ,1000    | 10000 | 5   | 6.987mb  |  58,584 ops/s |
| space ,1000    | 10000 | 5   | 4.761mb  |  37,540 ops/s |
| custom,1000    | 10000 | 5   | 4.540mb  |  37,549 ops/s |
| zoo   ,1000000 | 10000 | 5   | 6.987mb  |  36,238 ops/s |
| space ,1000000 | 10000 | 5   | 4.761mb  |  18,434 ops/s |
| custom,1000000 | 10000 | 5   | 4.540mb  |  18,876 ops/s |
+----------------+-------+-----+----------+---------------+

The first set (where the id is 'fixed' to a single id, 0) shows that there's no real difference; all three generators produce the same HumanoID at the same rate: about 100,000 per second (which is quite mind-blowing IMO).

Where it gets interesting is in the next sets. My first instinct was that the larger wordsets would be slower, but it actually turns out to be the opposite. Yes, for generating a single HumanoID the smaller wordsets will be (much) more efficient since you don't need to load in a huge dataset and prepare it for use. But as you generate more and more HumanoIDs, the larger wordsets become (much) more efficient. This is because the resulting HumanoID can be much shorter, because there are a lot more 'digits' (words) available. For a small wordset a large integer may convert to red-blue-pink-yellow-sad-cow, for example, whereas with a huge wordset that same integer may convert into smart-lobster, which means fewer iterations, fewer string concatenations etc.
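A quick back-of-envelope sketch of that effect, treating the conversion as a plain change of base (an approximation of what create() does; the helper below is illustrative, not part of the library):

function approxWordCount(int $id, int $wordSetSize): int
{
    // Each 'digit' of the id in base $wordSetSize becomes one word.
    $words = 1;
    while ($id >= $wordSetSize) {
        $id = intdiv($id, $wordSetSize);
        $words++;
    }
    return $words;
}

echo approxWordCount(1_000_000, 100);    // 4 words from a 100-word set
echo approxWordCount(1_000_000, 10_000); // 2 words from a 10,000-word set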

Does this mean that large wordsets should be preferred? Well, yes, no, it... depends. There's a tradeoff; when you generate a single HumanoID in a request you're better off with a small(er) wordset so you don't incur the cost of loading a huge one. But you also don't want the wordset to be too small, since that will result in longer HumanoIDs. On the other end of the spectrum: if you're generating huge batches of HumanoIDs, a huge wordset will be more efficient since the resulting strings will be much shorter, whereas a small(er) wordset will result in slower operations due to longer resulting strings.

Since I'm primarily a .Net developer I do have a somewhat different take on this matter I guess; In ".Net world" I would load the wordset at application start, once, and be done with it. Then the size of the wordset wouldn't really matter (unless there are some tight memory restrictions maybe) and the cost of loading the wordset would only be incurred once. But since PHP doesn't really have an "application scope" (as far as I'm aware) the wordset would have to be loaded every request (be it from disk, memory cache, database, whatever). I don't think you can "keep a HumanoID-generator around" across multiple requests so it can be (re)used every time. But I may be mistaken and would be happy to hear how to get around this "limitation".
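One possible workaround is a shared-memory cache such as APCu (a sketch; it assumes the APCu extension is available and that the generator object serializes cleanly, neither of which I've verified):

use RobThree\HumanoID\HumanoID;
use RobThree\HumanoID\HumanoIDs;

function sharedZooGenerator(): HumanoID
{
    // APCu keeps values in shared memory, so within one PHP-FPM pool the
    // (serialized) generator survives across requests.
    $gen = apcu_fetch('humanoid.zoo');
    if ($gen === false) {
        $gen = HumanoIDs::zooIdGenerator();
        apcu_store('humanoid.zoo', $gen);
    }
    return $gen;
}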

Conclusion

So now what? Do you need a huge wordset or a small one? Well, as you can read above: it depends. I think the only conclusion we can draw is: use a decently sized wordset for your application and make sure there's enough 'headroom' to grow, but don't go overboard loading huge wordsets of multiple megabytes.

In respect to speed/performance: you should easily be able to generate over 10,000 HumanoIDs per second (on a Macbook M1) and, with some effort and a bit more CPU power, I think an order of magnitude (maybe even two) more should be achievable on a single machine using multiple threads and other optimisations. So, depending on context of course, I think the cost of generating a HumanoID is negligible in most situations. Things like OPcache and whathaveyou will skew things even more, of course.

What's next

Well, I'd like to test variations where different WordOptions are used in combination with different (or no) separators. And whenever #5 is implemented I'm sure there will be some more interesting benchmarks to be done. But, for now, what I'm more interested in is benchmarking the parse() method. I'll try to create a basic benchmark for this too sometime soon, but maybe someone beats me to it. This will involve a little more thought and setup, since we'd first need to get a (huge) bunch of HumanoIDs to parse from somewhere (they can easily be generated before running the test, but since I'd like to benchmark this, too, with small, medium and large wordsets, it'll be a little heavier on memory etc.).


Offtopic:

I am new to PHPBench, so it's probably me, but I do think a few... interesting choices were made. For example: why would I care, to 3 decimal places, how many ops/s were run? Why aren't the numbers right-aligned? Although the documentation at first glance seems nice, I found it hard to find answers to questions like "how do I format the numbers in the results", "how do I show the actual values used for a test" (in the sets column for example), "how do I show more useful names in the benchmark/subject columns", "how many bytes were actually allocated for each iteration instead of an overall measure of total memory use", etc.

caendesilva commented 2 years ago

Okay, wow, this is extensive, so I'm breaking it down both to give feedback and to understand it better. I've never used PHPBench before, so it's cool to learn!

At about 100,000 per second (which is quite mind-blowing IMO).

Holy cow! That is incredible

I also think it's interesting that larger word sets can be faster. It makes full sense, of course, given your explanation, but it's something I'm not sure I would have spontaneously assumed without these benchmarks.

But since PHP doesn't really have an "application scope" (as far as I'm aware) the wordset would have to be loaded every request (be it from disk, memory cache, database, whatever). I don't think you can "keep a HumanoID-generator around" across multiple requests so it can be (re)used every time. But I may be mistaken and would be happy to hear how to get around this "limitation".

This is a really interesting topic. I don't actually know exactly how the PHP lifecycle works and whether we can persist something in a memory cache or similar. @mallardduck might know? Hope it's alright that I pinged you.

I think from this thread we can conclude that the HumanoID-integer conversion is already mind-blowingly fast, so any performance improvements should focus on ways to keep the generator instance around, if that is possible; it's something I'd be really interested in learning about.

Offtopic:

These are interesting concerns. I've never used PHPBench so I can't answer any of them, but I had a quick glance at the readme and I see that they have report outputting as Markdown and HTML. Thinking that the Markdown format could be useful for us here? Markdown is pretty easy to reformat using multi-cursor editing in VSCode/Sublime.

Thank you for making this benchmark, it's really interesting! Great analysis as well :)

caendesilva commented 2 years ago

I had a thought: could results be affected in any way by using sequential IDs in the ranges? AFAIK PHP uses some black-magic trickery with variables that are similar (which is related to why I need to use echo statements). I could be extremely wrong about this though. Take this comment with a heap of salt.

Still, it could be cool to test by pre-generating an array of test values and then scrambling it.

RobThree commented 2 years ago

could results be affected in any way by using sequential IDs in the ranges

I'm not sure what you mean; I'm generating random IDs. If you're asking whether sequential IDs could perform differently: possibly, likely even. I think it depends on how many integers are converted in the sequence, how that specific sequence converts to HumanoIDs etc., since the number of iterations and calculations will differ. That is why I chose random IDs spread over a range: to get a decent average and avoid results being skewed by optimal, or worst-case, cases. I think there are too many variables in play (size of wordset, sequence length, how efficiently the (specific) integers map to the wordset etc.) to make hard rules. But I do think we could describe the create() method in terms of Big-O notation, and I guess that's close to O(log n) or somewhere in that neighbourhood? Correct me if I'm wrong. But this is why I don't think it's worth testing all kinds of micro-optimizations or variations that may or may not involve PHP using black magic trickery etc. I think we're benching for the general, run-of-the-mill usage here.
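As a rough sanity check on that guess: if create() behaves like a plain change of base over a wordset of $k$ words (an approximation of the actual algorithm), then

$$\text{words}(n) \approx \lfloor \log_k n \rfloor + 1 \quad\Rightarrow\quad \text{create}() \in O(\log_k n) = O(\log n / \log k)$$

which is consistent both with the O(log n) estimate and with larger wordsets being relatively faster for large inputs.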

caendesilva commented 2 years ago

Oh, sorry, I misunderstood the benchmarking. I was using sequential input IDs, and assumed you did too. Good call on using random IDs!

RobThree commented 2 years ago

but had a quick glance at the readme and I see that they have report outputting as Markdown

Again, yes, there's mention of markdown in the documentation, but HOW to do it is nowhere to be found in the docs. I sort-of expected something like --output=markdown, but that results in:

No renderer configuration or service named "markdown" exists. Known configurations: "csv", known services: "console ", "delimited", "html"

So... yeah

RobThree commented 2 years ago

Anyway, I will be using this comment to post the results from a few different machines. Just for shits'n'giggles.

Macbook M1 2020 / PHP 8.1.5 / xdebug ✔, opcache ❌:

+---------------------+-------------+-----+-------+-----+----------+------------------+--------+
| benchmark           | subject     | set | revs  | its | mem_peak | mode             | rstdev |
+---------------------+-------------+-----+-------+-----+----------+------------------+--------+
| HumanoIDCreateBench | benchCreate | 0,0 | 10000 | 5   | 6.987mb  | 101,732.351ops/s | ±0.19% |
| HumanoIDCreateBench | benchCreate | 1,0 | 10000 | 5   | 4.761mb  | 101,819.018ops/s | ±0.42% |
| HumanoIDCreateBench | benchCreate | 2,0 | 10000 | 5   | 4.540mb  | 101,698.136ops/s | ±0.19% |
| HumanoIDCreateBench | benchCreate | 0,1 | 10000 | 5   | 6.987mb  | 101,272.438ops/s | ±4.98% |
| HumanoIDCreateBench | benchCreate | 1,1 | 10000 | 5   | 4.761mb  | 87,125.065ops/s  | ±1.57% |
| HumanoIDCreateBench | benchCreate | 2,1 | 10000 | 5   | 4.540mb  | 101,291.283ops/s | ±0.14% |
| HumanoIDCreateBench | benchCreate | 0,2 | 10000 | 5   | 6.987mb  | 59,188.797ops/s  | ±0.24% |
| HumanoIDCreateBench | benchCreate | 1,2 | 10000 | 5   | 4.761mb  | 37,830.427ops/s  | ±0.25% |
| HumanoIDCreateBench | benchCreate | 2,2 | 10000 | 5   | 4.540mb  | 37,590.875ops/s  | ±0.07% |
| HumanoIDCreateBench | benchCreate | 0,3 | 10000 | 5   | 6.987mb  | 36,214.954ops/s  | ±0.13% |
| HumanoIDCreateBench | benchCreate | 1,3 | 10000 | 5   | 4.761mb  | 18,369.065ops/s  | ±1.07% |
| HumanoIDCreateBench | benchCreate | 2,3 | 10000 | 5   | 4.540mb  | 18,857.328ops/s  | ±0.33% |
+---------------------+-------------+-----+-------+-----+----------+------------------+--------+

Macbook M1 2020 / PHP 8.1.5 / xdebug ❌, opcache ❌:

+---------------------+-------------+-----+-------+-----+----------+------------------+--------+
| benchmark           | subject     | set | revs  | its | mem_peak | mode             | rstdev |
+---------------------+-------------+-----+-------+-----+----------+------------------+--------+
| HumanoIDCreateBench | benchCreate | 0,0 | 10000 | 5   | 6.682mb  | 357,966.420ops/s | ±0.25% |
| HumanoIDCreateBench | benchCreate | 1,0 | 10000 | 5   | 4.455mb  | 353,948.122ops/s | ±0.57% |
| HumanoIDCreateBench | benchCreate | 2,0 | 10000 | 5   | 4.234mb  | 357,998.571ops/s | ±0.40% |
| HumanoIDCreateBench | benchCreate | 0,1 | 10000 | 5   | 6.682mb  | 358,896.251ops/s | ±0.52% |
| HumanoIDCreateBench | benchCreate | 1,1 | 10000 | 5   | 4.455mb  | 305,371.010ops/s | ±0.52% |
| HumanoIDCreateBench | benchCreate | 2,1 | 10000 | 5   | 4.234mb  | 355,554.962ops/s | ±0.35% |
| HumanoIDCreateBench | benchCreate | 0,2 | 10000 | 5   | 6.682mb  | 208,591.164ops/s | ±0.25% |
| HumanoIDCreateBench | benchCreate | 1,2 | 10000 | 5   | 4.455mb  | 131,375.717ops/s | ±0.82% |
| HumanoIDCreateBench | benchCreate | 2,2 | 10000 | 5   | 4.234mb  | 131,978.690ops/s | ±0.32% |
| HumanoIDCreateBench | benchCreate | 0,3 | 10000 | 5   | 6.682mb  | 126,751.553ops/s | ±0.32% |
| HumanoIDCreateBench | benchCreate | 1,3 | 10000 | 5   | 4.455mb  | 63,858.228ops/s  | ±0.70% |
| HumanoIDCreateBench | benchCreate | 2,3 | 10000 | 5   | 4.234mb  | 66,227.587ops/s  | ±0.27% |
+---------------------+-------------+-----+-------+-----+----------+------------------+--------+

Macbook M1 2020 / PHP 8.1.5 / xdebug ❌, opcache ✔:

+---------------------+-------------+-----+-------+-----+----------+------------------+--------+
| benchmark           | subject     | set | revs  | its | mem_peak | mode             | rstdev |
+---------------------+-------------+-----+-------+-----+----------+------------------+--------+
| HumanoIDCreateBench | benchCreate | 0,0 | 10000 | 5   | 3.443mb  | 379,830.594ops/s | ±0.94% |
| HumanoIDCreateBench | benchCreate | 1,0 | 10000 | 5   | 1.336mb  | 379,692.443ops/s | ±0.31% |
| HumanoIDCreateBench | benchCreate | 2,0 | 10000 | 5   | 1.315mb  | 382,114.299ops/s | ±0.94% |
| HumanoIDCreateBench | benchCreate | 0,1 | 10000 | 5   | 3.443mb  | 380,130.045ops/s | ±0.23% |
| HumanoIDCreateBench | benchCreate | 1,1 | 10000 | 5   | 1.336mb  | 324,100.630ops/s | ±0.42% |
| HumanoIDCreateBench | benchCreate | 2,1 | 10000 | 5   | 1.315mb  | 379,479.276ops/s | ±0.95% |
| HumanoIDCreateBench | benchCreate | 0,2 | 10000 | 5   | 3.443mb  | 223,198.023ops/s | ±0.46% |
| HumanoIDCreateBench | benchCreate | 1,2 | 10000 | 5   | 1.336mb  | 142,190.323ops/s | ±0.53% |
| HumanoIDCreateBench | benchCreate | 2,2 | 10000 | 5   | 1.315mb  | 141,080.993ops/s | ±0.35% |
| HumanoIDCreateBench | benchCreate | 0,3 | 10000 | 5   | 3.443mb  | 135,476.582ops/s | ±0.34% |
| HumanoIDCreateBench | benchCreate | 1,3 | 10000 | 5   | 1.336mb  | 69,336.787ops/s  | ±0.20% |
| HumanoIDCreateBench | benchCreate | 2,3 | 10000 | 5   | 1.315mb  | 71,181.398ops/s  | ±0.32% |
+---------------------+-------------+-----+-------+-----+----------+------------------+--------+

Windows i9-10900X @3.7GHz / PHP 8.1.5 / xdebug ❌, opcache ❌:

+---------------------+-------------+-----+-------+-----+----------+------------------+--------+
| benchmark           | subject     | set | revs  | its | mem_peak | mode             | rstdev |
+---------------------+-------------+-----+-------+-----+----------+------------------+--------+
| HumanoIDCreateBench | benchCreate | 0,0 | 10000 | 5   | 6.712mb  | 258,612.845ops/s | ±2.29% |
| HumanoIDCreateBench | benchCreate | 1,0 | 10000 | 5   | 4.485mb  | 257,324.884ops/s | ±5.29% |
| HumanoIDCreateBench | benchCreate | 2,0 | 10000 | 5   | 4.265mb  | 260,409.434ops/s | ±1.71% |
| HumanoIDCreateBench | benchCreate | 0,1 | 10000 | 5   | 6.712mb  | 260,758.548ops/s | ±2.65% |
| HumanoIDCreateBench | benchCreate | 1,1 | 10000 | 5   | 4.485mb  | 222,673.408ops/s | ±2.68% |
| HumanoIDCreateBench | benchCreate | 2,1 | 10000 | 5   | 4.265mb  | 260,095.815ops/s | ±1.18% |
| HumanoIDCreateBench | benchCreate | 0,2 | 10000 | 5   | 6.712mb  | 150,962.080ops/s | ±1.55% |
| HumanoIDCreateBench | benchCreate | 1,2 | 10000 | 5   | 4.485mb  | 98,532.361ops/s  | ±1.19% |
| HumanoIDCreateBench | benchCreate | 2,2 | 10000 | 5   | 4.265mb  | 97,277.396ops/s  | ±2.15% |
| HumanoIDCreateBench | benchCreate | 0,3 | 10000 | 5   | 6.712mb  | 94,044.416ops/s  | ±2.09% |
| HumanoIDCreateBench | benchCreate | 1,3 | 10000 | 5   | 4.485mb  | 48,274.717ops/s  | ±1.01% |
| HumanoIDCreateBench | benchCreate | 2,3 | 10000 | 5   | 4.265mb  | 49,549.676ops/s  | ±0.72% |
+---------------------+-------------+-----+-------+-----+----------+------------------+--------+

WSL2 Debian on Win / i9-10900X @3.7GHz / PHP 7.4.29 / xdebug ❌, opcache ❌:

+---------------------+-------------+-----+-------+-----+----------+------------------+--------+
| benchmark           | subject     | set | revs  | its | mem_peak | mode             | rstdev |
+---------------------+-------------+-----+-------+-----+----------+------------------+--------+
| HumanoIDCreateBench | benchCreate | 0,0 | 10000 | 5   | 6.502mb  | 376,417.329ops/s | ±1.71% |
| HumanoIDCreateBench | benchCreate | 1,0 | 10000 | 5   | 4.386mb  | 381,603.913ops/s | ±1.85% |
| HumanoIDCreateBench | benchCreate | 2,0 | 10000 | 5   | 4.174mb  | 380,858.911ops/s | ±1.76% |
| HumanoIDCreateBench | benchCreate | 0,1 | 10000 | 5   | 6.502mb  | 381,899.318ops/s | ±2.39% |
| HumanoIDCreateBench | benchCreate | 1,1 | 10000 | 5   | 4.386mb  | 326,291.920ops/s | ±3.63% |
| HumanoIDCreateBench | benchCreate | 2,1 | 10000 | 5   | 4.174mb  | 376,370.835ops/s | ±1.77% |
| HumanoIDCreateBench | benchCreate | 0,2 | 10000 | 5   | 6.502mb  | 222,462.970ops/s | ±1.46% |
| HumanoIDCreateBench | benchCreate | 1,2 | 10000 | 5   | 4.386mb  | 142,371.769ops/s | ±1.55% |
| HumanoIDCreateBench | benchCreate | 2,2 | 10000 | 5   | 4.174mb  | 139,354.829ops/s | ±0.82% |
| HumanoIDCreateBench | benchCreate | 0,3 | 10000 | 5   | 6.502mb  | 134,130.487ops/s | ±1.01% |
| HumanoIDCreateBench | benchCreate | 1,3 | 10000 | 5   | 4.386mb  | 67,908.638ops/s  | ±1.88% |
| HumanoIDCreateBench | benchCreate | 2,3 | 10000 | 5   | 4.174mb  | 72,014.804ops/s  | ±0.54% |
+---------------------+-------------+-----+-------+-----+----------+------------------+--------+

WSL2 Debian on Win / i9-10900X @3.7GHz / PHP 8.1.5 / xdebug ✔, opcache ❌:

+---------------------+-------------+-----+-------+-----+----------+-----------------+--------+
| benchmark           | subject     | set | revs  | its | mem_peak | mode            | rstdev |
+---------------------+-------------+-----+-------+-----+----------+-----------------+--------+
| HumanoIDCreateBench | benchCreate | 0,0 | 10000 | 5   | 6.898mb  | 62,400.741ops/s | ±2.95% |
| HumanoIDCreateBench | benchCreate | 1,0 | 10000 | 5   | 4.671mb  | 63,132.922ops/s | ±1.19% |
| HumanoIDCreateBench | benchCreate | 2,0 | 10000 | 5   | 4.451mb  | 62,192.076ops/s | ±6.93% |
| HumanoIDCreateBench | benchCreate | 0,1 | 10000 | 5   | 6.898mb  | 62,119.145ops/s | ±2.02% |
| HumanoIDCreateBench | benchCreate | 1,1 | 10000 | 5   | 4.671mb  | 54,194.631ops/s | ±0.58% |
| HumanoIDCreateBench | benchCreate | 2,1 | 10000 | 5   | 4.451mb  | 63,236.183ops/s | ±7.33% |
| HumanoIDCreateBench | benchCreate | 0,2 | 10000 | 5   | 6.898mb  | 35,811.597ops/s | ±3.09% |
| HumanoIDCreateBench | benchCreate | 1,2 | 10000 | 5   | 4.671mb  | 23,394.261ops/s | ±3.07% |
| HumanoIDCreateBench | benchCreate | 2,2 | 10000 | 5   | 4.451mb  | 23,241.164ops/s | ±3.15% |
| HumanoIDCreateBench | benchCreate | 0,3 | 10000 | 5   | 6.898mb  | 22,761.173ops/s | ±3.42% |
| HumanoIDCreateBench | benchCreate | 1,3 | 10000 | 5   | 4.671mb  | 11,243.621ops/s | ±2.46% |
| HumanoIDCreateBench | benchCreate | 2,3 | 10000 | 5   | 4.451mb  | 11,956.548ops/s | ±2.35% |
+---------------------+-------------+-----+-------+-----+----------+-----------------+--------+

WSL2 Debian on Win / i9-10900X @3.7GHz / PHP 8.1.5 / xdebug ❌, opcache ❌:

+---------------------+-------------+-----+-------+-----+----------+------------------+---------+
| benchmark           | subject     | set | revs  | its | mem_peak | mode             | rstdev  |
+---------------------+-------------+-----+-------+-----+----------+------------------+---------+
| HumanoIDCreateBench | benchCreate | 0,0 | 10000 | 5   | 6.594mb  | 395,485.154ops/s | ±1.52%  |
| HumanoIDCreateBench | benchCreate | 1,0 | 10000 | 5   | 4.368mb  | 404,264.668ops/s | ±2.29%  |
| HumanoIDCreateBench | benchCreate | 2,0 | 10000 | 5   | 4.147mb  | 410,465.835ops/s | ±1.80%  |
| HumanoIDCreateBench | benchCreate | 0,1 | 10000 | 5   | 6.594mb  | 395,688.743ops/s | ±1.73%  |
| HumanoIDCreateBench | benchCreate | 1,1 | 10000 | 5   | 4.368mb  | 350,201.222ops/s | ±2.27%  |
| HumanoIDCreateBench | benchCreate | 2,1 | 10000 | 5   | 4.147mb  | 407,192.534ops/s | ±3.44%  |
| HumanoIDCreateBench | benchCreate | 0,2 | 10000 | 5   | 6.594mb  | 234,446.451ops/s | ±1.06%  |
| HumanoIDCreateBench | benchCreate | 1,2 | 10000 | 5   | 4.368mb  | 149,123.929ops/s | ±14.93% |
| HumanoIDCreateBench | benchCreate | 2,2 | 10000 | 5   | 4.147mb  | 149,825.771ops/s | ±0.76%  |
| HumanoIDCreateBench | benchCreate | 0,3 | 10000 | 5   | 6.594mb  | 143,820.017ops/s | ±1.33%  |
| HumanoIDCreateBench | benchCreate | 1,3 | 10000 | 5   | 4.368mb  | 73,775.864ops/s  | ±5.77%  |
| HumanoIDCreateBench | benchCreate | 2,3 | 10000 | 5   | 4.147mb  | 73,853.732ops/s  | ±10.62% |
+---------------------+-------------+-----+-------+-----+----------+------------------+---------+

WSL2 Debian on Win / i9-10900X @3.7GHz / PHP 8.1.5 / xdebug ❌, opcache ✔:

+---------------------+-------------+-----+-------+-----+----------+------------------+--------+
| benchmark           | subject     | set | revs  | its | mem_peak | mode             | rstdev |
+---------------------+-------------+-----+-------+-----+----------+------------------+--------+
| HumanoIDCreateBench | benchCreate | 0,0 | 10000 | 5   | 3.440mb  | 421,017.663ops/s | ±1.21% |
| HumanoIDCreateBench | benchCreate | 1,0 | 10000 | 5   | 1.333mb  | 421,570.483ops/s | ±2.49% |
| HumanoIDCreateBench | benchCreate | 2,0 | 10000 | 5   | 1.313mb  | 427,623.257ops/s | ±0.90% |
| HumanoIDCreateBench | benchCreate | 0,1 | 10000 | 5   | 3.440mb  | 409,153.345ops/s | ±1.44% |
| HumanoIDCreateBench | benchCreate | 1,1 | 10000 | 5   | 1.333mb  | 362,956.219ops/s | ±1.24% |
| HumanoIDCreateBench | benchCreate | 2,1 | 10000 | 5   | 1.313mb  | 420,604.830ops/s | ±2.71% |
| HumanoIDCreateBench | benchCreate | 0,2 | 10000 | 5   | 3.440mb  | 246,283.267ops/s | ±0.59% |
| HumanoIDCreateBench | benchCreate | 1,2 | 10000 | 5   | 1.333mb  | 159,736.299ops/s | ±0.72% |
| HumanoIDCreateBench | benchCreate | 2,2 | 10000 | 5   | 1.313mb  | 157,265.605ops/s | ±8.46% |
| HumanoIDCreateBench | benchCreate | 0,3 | 10000 | 5   | 3.440mb  | 151,044.918ops/s | ±0.93% |
| HumanoIDCreateBench | benchCreate | 1,3 | 10000 | 5   | 1.333mb  | 77,679.635ops/s  | ±0.77% |
| HumanoIDCreateBench | benchCreate | 2,3 | 10000 | 5   | 1.313mb  | 80,198.735ops/s  | ±9.18% |
+---------------------+-------------+-----+-------+-----+----------+------------------+--------+

Debian VM / Xeon E3-1226v2 @3.3Ghz / PHP 8.1.5 / xdebug ✔, opcache ✔

+---------------------+-------------+-----+-------+-----+----------+-----------------+--------+
| benchmark           | subject     | set | revs  | its | mem_peak | mode            | rstdev |
+---------------------+-------------+-----+-------+-----+----------+-----------------+--------+
| HumanoIDCreateBench | benchCreate | 0,0 | 10000 | 5   | 3.442mb  | 48,281.169ops/s | ±1.32% |
| HumanoIDCreateBench | benchCreate | 1,0 | 10000 | 5   | 1.353mb  | 48,128.458ops/s | ±2.31% |
| HumanoIDCreateBench | benchCreate | 2,0 | 10000 | 5   | 1.333mb  | 47,287.697ops/s | ±1.97% |
| HumanoIDCreateBench | benchCreate | 0,1 | 10000 | 5   | 3.442mb  | 48,262.129ops/s | ±1.38% |
| HumanoIDCreateBench | benchCreate | 1,1 | 10000 | 5   | 1.353mb  | 41,187.034ops/s | ±1.34% |
| HumanoIDCreateBench | benchCreate | 2,1 | 10000 | 5   | 1.333mb  | 48,478.621ops/s | ±0.60% |
| HumanoIDCreateBench | benchCreate | 0,2 | 10000 | 5   | 3.442mb  | 28,236.326ops/s | ±2.92% |
| HumanoIDCreateBench | benchCreate | 1,2 | 10000 | 5   | 1.353mb  | 17,033.768ops/s | ±6.40% |
| HumanoIDCreateBench | benchCreate | 2,2 | 10000 | 5   | 1.333mb  | 17,850.121ops/s | ±3.21% |
| HumanoIDCreateBench | benchCreate | 0,3 | 10000 | 5   | 3.442mb  | 17,541.164ops/s | ±0.60% |
| HumanoIDCreateBench | benchCreate | 1,3 | 10000 | 5   | 1.353mb  | 8,813.912ops/s  | ±0.44% |
| HumanoIDCreateBench | benchCreate | 2,3 | 10000 | 5   | 1.333mb  | 9,057.198ops/s  | ±1.97% |
+---------------------+-------------+-----+-------+-----+----------+-----------------+--------+

Debian VM / Xeon E3-1226v2 @3.3Ghz / PHP 8.1.5 / xdebug ❌, opcache ✔

+---------------------+-------------+-----+-------+-----+----------+------------------+--------+
| benchmark           | subject     | set | revs  | its | mem_peak | mode             | rstdev |
+---------------------+-------------+-----+-------+-----+----------+------------------+--------+
| HumanoIDCreateBench | benchCreate | 0,0 | 10000 | 5   | 3.435mb  | 298,358.009ops/s | ±0.65% |
| HumanoIDCreateBench | benchCreate | 1,0 | 10000 | 5   | 1.329mb  | 298,419.045ops/s | ±6.30% |
| HumanoIDCreateBench | benchCreate | 2,0 | 10000 | 5   | 1.309mb  | 299,664.722ops/s | ±1.96% |
| HumanoIDCreateBench | benchCreate | 0,1 | 10000 | 5   | 3.435mb  | 295,381.920ops/s | ±2.34% |
| HumanoIDCreateBench | benchCreate | 1,1 | 10000 | 5   | 1.329mb  | 257,759.247ops/s | ±1.82% |
| HumanoIDCreateBench | benchCreate | 2,1 | 10000 | 5   | 1.309mb  | 301,517.939ops/s | ±0.61% |
| HumanoIDCreateBench | benchCreate | 0,2 | 10000 | 5   | 3.435mb  | 176,360.324ops/s | ±2.13% |
| HumanoIDCreateBench | benchCreate | 1,2 | 10000 | 5   | 1.329mb  | 113,560.906ops/s | ±1.55% |
| HumanoIDCreateBench | benchCreate | 2,2 | 10000 | 5   | 1.309mb  | 112,580.863ops/s | ±1.45% |
| HumanoIDCreateBench | benchCreate | 0,3 | 10000 | 5   | 3.435mb  | 108,314.841ops/s | ±1.55% |
| HumanoIDCreateBench | benchCreate | 1,3 | 10000 | 5   | 1.329mb  | 54,451.406ops/s  | ±1.00% |
| HumanoIDCreateBench | benchCreate | 2,3 | 10000 | 5   | 1.309mb  | 56,710.788ops/s  | ±2.52% |
+---------------------+-------------+-----+-------+-----+----------+------------------+--------+

caendesilva commented 2 years ago

but had a quick glance at the readme and I see that they have report outputting as Markdown

Again, yes, there's mention of markdown in the documentation, but HOW to do it is nowhere to be found in the docs. I sort-of expected something like --output=markdown, but that results in:

No renderer configuration or service named "markdown" exists. Known configurations: "csv", known services: "console ", "delimited", "html"

So... yeah

Oh, wow. That's... not great 🤦‍♂️

Well, the output we are getting now is good enough :)

caendesilva commented 2 years ago

Okay I downloaded and ran your benchmarking script but I'm getting different output.

I don't think we are using the same run command. I used ./vendor/bin/phpbench run tests/Benchmark. What are you using?

Edit: WOW, 400,000 ops/s? That's amazing.

RobThree commented 2 years ago

What are you using?

I'm using the --report=aggregate argument.

caendesilva commented 2 years ago

I'm using the --report=aggregate argument.

Thanks! Here is my benchmark:

Win10 Pro / AMD Ryzen 7 @3.6GHz / PHP 8.0.13 / xdebug ✔, opcache ❌

+---------------------+-------------+-----+-------+-----+----------+-----------------+--------+
| benchmark           | subject     | set | revs  | its | mem_peak | mode            | rstdev |
+---------------------+-------------+-----+-------+-----+----------+-----------------+--------+
| HumanoIDCreateBench | benchCreate | 0,0 | 10000 | 5   | 7.075mb  | 13,975.076ops/s | ±0.54% |
| HumanoIDCreateBench | benchCreate | 1,0 | 10000 | 5   | 4.959mb  | 14,038.253ops/s | ±0.23% |
| HumanoIDCreateBench | benchCreate | 2,0 | 10000 | 5   | 4.747mb  | 14,070.597ops/s | ±0.78% |
| HumanoIDCreateBench | benchCreate | 0,1 | 10000 | 5   | 7.075mb  | 14,068.904ops/s | ±1.23% |
| HumanoIDCreateBench | benchCreate | 1,1 | 10000 | 5   | 4.959mb  | 12,081.363ops/s | ±1.10% |
| HumanoIDCreateBench | benchCreate | 2,1 | 10000 | 5   | 4.747mb  | 13,945.829ops/s | ±0.67% |
| HumanoIDCreateBench | benchCreate | 0,2 | 10000 | 5   | 7.075mb  | 8,182.423ops/s  | ±0.94% |
| HumanoIDCreateBench | benchCreate | 1,2 | 10000 | 5   | 4.959mb  | 5,243.521ops/s  | ±0.86% |
| HumanoIDCreateBench | benchCreate | 2,2 | 10000 | 5   | 4.747mb  | 5,200.883ops/s  | ±0.87% |
| HumanoIDCreateBench | benchCreate | 0,3 | 10000 | 5   | 7.075mb  | 5,028.790ops/s  | ±0.78% |
| HumanoIDCreateBench | benchCreate | 1,3 | 10000 | 5   | 4.959mb  | 2,541.164ops/s  | ±0.57% |
| HumanoIDCreateBench | benchCreate | 2,3 | 10000 | 5   | 4.747mb  | 2,593.760ops/s  | ±0.55% |
+---------------------+-------------+-----+-------+-----+----------+-----------------+--------+

So yeah, I need to disable xdebug for this benchmark. It's only using about 10% of the CPU. I don't know if that's normal or not.

WSL2 on AMD Ryzen 7 @3.6GHz / PHP 8.0.14 / xdebug ❌, opcache ❌

+---------------------+-------------+-----+-------+-----+----------+------------------+--------+
| benchmark           | subject     | set | revs  | its | mem_peak | mode             | rstdev |
+---------------------+-------------+-----+-------+-----+----------+------------------+--------+
| HumanoIDCreateBench | benchCreate | 0,0 | 10000 | 5   | 6.746mb  | 122,142.604ops/s | ±1.31% |
| HumanoIDCreateBench | benchCreate | 1,0 | 10000 | 5   | 4.630mb  | 125,663.585ops/s | ±3.29% |
| HumanoIDCreateBench | benchCreate | 2,0 | 10000 | 5   | 4.418mb  | 123,117.026ops/s | ±5.41% |
| HumanoIDCreateBench | benchCreate | 0,1 | 10000 | 5   | 6.746mb  | 124,355.311ops/s | ±1.39% |
| HumanoIDCreateBench | benchCreate | 1,1 | 10000 | 5   | 4.630mb  | 110,128.233ops/s | ±4.17% |
| HumanoIDCreateBench | benchCreate | 2,1 | 10000 | 5   | 4.418mb  | 124,137.741ops/s | ±2.00% |
| HumanoIDCreateBench | benchCreate | 0,2 | 10000 | 5   | 6.746mb  | 90,270.773ops/s  | ±3.79% |
| HumanoIDCreateBench | benchCreate | 1,2 | 10000 | 5   | 4.630mb  | 61,946.688ops/s  | ±1.20% |
| HumanoIDCreateBench | benchCreate | 2,2 | 10000 | 5   | 4.418mb  | 62,297.178ops/s  | ±2.33% |
| HumanoIDCreateBench | benchCreate | 0,3 | 10000 | 5   | 6.746mb  | 59,498.028ops/s  | ±0.78% |
| HumanoIDCreateBench | benchCreate | 1,3 | 10000 | 5   | 4.630mb  | 33,619.979ops/s  | ±2.50% |
| HumanoIDCreateBench | benchCreate | 2,3 | 10000 | 5   | 4.418mb  | 34,053.603ops/s  | ±2.34% |
+---------------------+-------------+-----+-------+-----+----------+------------------+--------+

mallardduck commented 2 years ago

Hot dang, are you gents on a blazing roll today! There's a whole lot to parse and grok before I can reply to everything; I just want to hit on a few areas I've started playing with this morning (still before noon for me).

phpbench formatting

Sadly, this is an area where this tool is lacking IMO, especially when compared to similar tools like PHPUnit, where you find similar features (providing datasets via method + annotation). In fact, with PHPUnit's implementation you directly see the value passed by default, OR the key if you set one on your "data provider method".

In one of my projects I even went so far as to whip up a helper method on my base TestCase that extracts good labels for you. Which is hecking easy compared to the methods I've seen to accomplish something similar with phpbench.

That said, as a means of solving this while also being lazy, I often just opt to make individual Benchmark classes per unique case. I may still create a common base (abstract) class for similar benchmarks to pull from; however, in a case like this, each default generator would have its own benchmark class.

Application persistence

But since PHP doesn't really have an "application scope" (as far as I'm aware) the wordset would have to be loaded every request (be it from disk, memory cache, database, whatever). I don't think you can "keep a HumanoID-generator around" across multiple requests so it can be (re)used every time. But I may be mistaken and would be happy to hear how to get around this "limitation".

This is a really interesting topic. I don't actually know exactly how the PHP lifecycle works and whether we can persist something in a memory cache or similar. @mallardduck might know? Hope it's alright that I pinged you.

(You gents can absolutely ping me any time.)

Rob is correct that with PHP we can't easily keep an instance of the generator around (at least out of the box, at a language level). It's also hard because PHP's many different SAPIs operate with OPcache differently at times too. So any general statement for one may be false for another.

With that in mind, let's assume we're only concerned with the "modern" SAPI models for running PHP - primarily PHP-FPM, plus similar ones like RoadRunner, CLI + Swoole, etc. For any of these, as long as you have OPcache set up, at minimum the opcodes of the class will be saved from request to request.

AFAIK any solution we have that can share an object between requests will generally involve serializing and deserializing the objects, which in itself adds a cost when implemented in an application. This leads me to my current train of thought: wondering which steps of a HumanoID's lifecycle are most expensive in isolation, and secondly, what data realistically needs to be "hot" for its use to scale.

For instance, if processing the WordSets array is very fast once it's read from disk, then caching the disk reads may be most important. However, if the internal processing and mapping is the most expensive part, then maybe serializing the object to cache that state between requests is most important. On the other hand, does re-using the internal lookup structures (across requests) even provide a real benefit at scale? Or can the CPU repopulate those via parsing just as fast?
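For example, the serialize round-trip could be measured in isolation first (a sketch; it assumes the generator object serializes cleanly, which objects holding closures would not):

use RobThree\HumanoID\HumanoIDs;

$gen = HumanoIDs::zooIdGenerator();

// Cost of persisting: object -> string (roughly what any cache backend stores).
$t = hrtime(true);
$blob = serialize($gen);
printf("serialize:   %.3f ms (%d bytes)\n", (hrtime(true) - $t) / 1e6, strlen($blob));

// Cost of rehydrating: string -> object, to compare against a cold construction.
$t = hrtime(true);
$copy = unserialize($blob);
printf("unserialize: %.3f ms\n", (hrtime(true) - $t) / 1e6);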

Long story short, this whole thread has given me a lot of ideas on what to play with and test next. I'm working up a branch with some different benchmarks now. :)

mallardduck commented 2 years ago

So, expanding on some of what I mentioned above, I've created benchmarks that target (or at least isolate the cost of) a few things at a time. Much of this will produce fairly predictable (in terms of scale) results, but it's nice to have data behind it nonetheless IMO.

So, for instance, I've created a subset of new benchmarks called AlwaysNew and, as the name suggests, they always create a new HumanoID instance for every benchmark.

Obviously this is much slower than the existing benchmarks. However, it gives us a baseline control to compare with: obviously not best practice, but similar to the vein of thought @caendesilva had with case 2 at the start of this. These benchmarks go a step further, though, and break the cost down by "cache nothing", "cache file", "cache json", giving us a scale of how much or how little these pre-initialization steps cost (sketched below).
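Roughly, the three levels separate like this (a sketch; $path is illustrative, not the package's actual wordset location):

function wordsReadEveryTime(string $path): array
{
    // "cache nothing": disk read + JSON decode on every call.
    return json_decode(file_get_contents($path), true);
}

function wordsCachedFile(string $path): array
{
    // "cache file": the disk read happens once; the decode still runs on every call.
    static $raw = null;
    $raw ??= file_get_contents($path);
    return json_decode($raw, true);
}

function wordsCachedJson(string $path): array
{
    // "cache json": both the read and the decode happen only once.
    static $decoded = null;
    return $decoded ??= json_decode(file_get_contents($path), true);
}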

Here are those results:

+-----------------------+-------------+-----+-------+-----+----------+----------------+--------+
| benchmark             | subject     | set | revs  | its | mem_peak | mode           | rstdev |
+-----------------------+-------------+-----+-------+-----+----------+----------------+--------+
| ReadsFileCreateBench  | benchCreate | 0   | 10000 | 5   | 1.315mb  | 2,456.828ops/s | ±7.74% |
| ReadsFileCreateBench  | benchCreate | 1   | 10000 | 5   | 1.315mb  | 2,454.637ops/s | ±0.80% |
| ReadsFileCreateBench  | benchCreate | 2   | 10000 | 5   | 1.315mb  | 2,327.230ops/s | ±0.82% |
| ReadsFileCreateBench  | benchCreate | 3   | 10000 | 5   | 1.315mb  | 2,236.784ops/s | ±0.57% |
| ReadsFileCreateBench  | benchCreate | 4   | 10000 | 5   | 1.315mb  | 2,160.614ops/s | ±0.47% |
| CachesFileCreateBench | benchCreate | 0   | 10000 | 5   | 1.315mb  | 3,376.522ops/s | ±0.26% |
| CachesFileCreateBench | benchCreate | 1   | 10000 | 5   | 1.315mb  | 3,336.301ops/s | ±1.17% |
| CachesFileCreateBench | benchCreate | 2   | 10000 | 5   | 1.315mb  | 3,218.834ops/s | ±2.49% |
| CachesFileCreateBench | benchCreate | 3   | 10000 | 5   | 1.315mb  | 2,940.643ops/s | ±1.40% |
| CachesFileCreateBench | benchCreate | 4   | 10000 | 5   | 1.315mb  | 2,844.591ops/s | ±1.13% |
| CachesJsonCreateBench | benchCreate | 0   | 10000 | 5   | 1.315mb  | 3,448.155ops/s | ±1.00% |
| CachesJsonCreateBench | benchCreate | 1   | 10000 | 5   | 1.315mb  | 3,481.036ops/s | ±1.60% |
| CachesJsonCreateBench | benchCreate | 2   | 10000 | 5   | 1.315mb  | 3,331.006ops/s | ±1.09% |
| CachesJsonCreateBench | benchCreate | 3   | 10000 | 5   | 1.315mb  | 3,073.246ops/s | ±0.56% |
| CachesJsonCreateBench | benchCreate | 4   | 10000 | 5   | 1.315mb  | 2,930.555ops/s | ±1.00% |
+-----------------------+-------------+-----+-------+-----+----------+----------------+--------+

Now here's the default SpaceID generator results from the same device:

+------------------+-------------+-----+-------+-----+----------+-----------------+--------+
| benchmark        | subject     | set | revs  | its | mem_peak | mode            | rstdev |
+------------------+-------------+-----+-------+-----+----------+-----------------+--------+
| SpaceCreateBench | benchCreate | 0   | 10000 | 5   | 1.315mb  | 95,245.870ops/s | ±1.96% |
| SpaceCreateBench | benchCreate | 1   | 10000 | 5   | 1.315mb  | 94,713.807ops/s | ±7.78% |
| SpaceCreateBench | benchCreate | 2   | 10000 | 5   | 1.315mb  | 32,538.984ops/s | ±9.75% |
| SpaceCreateBench | benchCreate | 3   | 10000 | 5   | 1.315mb  | 16,737.249ops/s | ±6.31% |
| SpaceCreateBench | benchCreate | 4   | 10000 | 5   | 1.315mb  | 12,963.067ops/s | ±1.26% |
+------------------+-------------+-----+-------+-----+----------+-----------------+--------+

All tests run on an M1 Mini

caendesilva commented 2 years ago

Good morning Dan! Thanks for your input on application persistence as that is an area I'm not familiar with!

I also think it's a good idea to break things down and find out what takes the time to process!

Even if we get into micro-optimization territory where things won't have any real effect in production usage, I got into open source to learn, and let me tell you, I'm learning so much and having real fun! So I just wanted to thank you guys!

mallardduck commented 2 years ago

Overall, from the results I'm seeing - you can view them in the PR at #12 - the "pre-construction" steps that native PHP does aren't that expensive - or rather, caching those alone doesn't improve ops/sec much at all.

For instance, the baseline is the benchmark called ReadsFileCreateBench; then I have one that caches reading the file, and one that caches both the read and the JSON decode. These give us granular results up to the "persist the object itself" area.

+-----------------------+-----------+-----------+
| Test                  | Ops/s Max | Ops/s Min |
+-----------------------+-----------+-----------+
| ReadsFileCreateBench  | 2,495.340 | 2,030.786 |
| CachesFileCreateBench | 3,431.956 | 2,880.577 |
| CachesJsonCreateBench | 3,509.448 | 2,914.060 |
+-----------------------+-----------+-----------+

So each optimization there certainly helps on a step-by-step basis when compared to the others. IMO the results would best be described as: "reading the file is expensive; parsing JSON from an already-read file (an in-memory string) is less expensive".

Again, I did say some of these wouldn't be all too exciting to see, but they are interesting for context IMO. My takeaway from these parts alone is that if you can cache the read from disk, do it - and if you need it as parsed JSON, you may as well cache it in that form instead of as a string.


Now, all that said, we know that regular "sane" workflows would not instantiate a new object every time it's needed. At minimum, a sane workflow would register an instance as a singleton in its service container. So that would give us the same effective results you both have captured above.

I think that answers a lot of the "basic" optimizations for us, which leads us to the more interesting one of "object caching". If a generator object were "warmed up" (just by running a bunch of input through it) and then stored in Redis and loaded when needed, does that speed anything up?

And then finally, my last concern is what kind of workload a real application's typical requests would involve. If it's just used for URL slugs, then that's likely a single lookup per request (and potentially many renders for URLs on the page).

Is there ever a scenario (in the average, cover-80%-of-use-cases sense) where applications would need to look up (i.e. parse string -> int) multiple IDs? My gut says that's not an average use case, but it's potentially still worth benchmarking to understand.

caendesilva commented 2 years ago

Great breakdown @mallardduck! Thanks!

Is there ever a scenario (in the average, cover-80%-of-use-cases sense) where applications would need to look up...

As for this, I can't off the top of my head think of a use case where multiple string IDs need to be parsed to numerical IDs per request.

What I'm currently using it for is to get pretty URLs for projects in a Laravel app.

+----+-------------------------------+--------------------------------------------------+
| ID | Name                          | Project page                                     |
+----+-------------------------------+--------------------------------------------------+
| 1  | Corkery, Cronin and Koss      | http://openevents.test/projects/backward         |
| 2  | Mosciski, Schumm and Wiza     | http://openevents.test/projects/bode             |
| 3  | Muller-Towne                  | http://openevents.test/projects/cigar            |
| 4  | Murphy, Pfeffer and Harvey    | http://openevents.test/projects/hong             |
| 5  | Lakin, Wiza and Corwin        | http://openevents.test/projects/milkyway         |
| 6  | Ebert Inc                     | http://openevents.test/projects/pinwheel         |
| 7  | Ritchie, Sanford and Gislason | http://openevents.test/projects/sombrero         |
| 8  | Daugherty-Littel              | http://openevents.test/projects/tadpole          |
| 9  | O'Keefe-Leuschke              | http://openevents.test/projects/earth-andromeda  |
+----+-------------------------------+--------------------------------------------------+

When using one of those links, a single reverse lookup is made to find the database primary key from the pretty URL.

However, when generating the table, each row uses one integer-to-HumanoID conversion. A single page could easily have hundreds of that type of conversion, but not the other way around. I still think it could be fun to benchmark, as you said, to understand how it all works together.

mallardduck commented 2 years ago

Awesome, thanks for your input on that - it helps confirm my assumption about the "80% use-case" as well.

One other observation I made looking over my PR's benchmarks is that speed correlates, at least somewhat, with ID size.

A graph instead: [screenshot: Screenshot 2022-04-23 151958]

Sources

benchCreate
+-----------------------------+-----+-------+-----+----------+------------------+--------+
| benchmark                   | set | revs  | its | mem_peak | mode             | rstdev |
+-----------------------------+-----+-------+-----+----------+------------------+--------+
| SpaceBench                  | 0   | 10000 | 5   | 1.303mb  | 250,444.048ops/s | ±0.59% |
| SpaceBench                  | 1   | 10000 | 5   | 1.303mb  | 249,263.478ops/s | ±0.63% |
| SpaceBench                  | 2   | 10000 | 5   | 1.303mb  | 252,738.053ops/s | ±0.63% |
| SpaceBench                  | 3   | 10000 | 5   | 1.303mb  | 170,944.906ops/s | ±0.73% |
| SpaceBench                  | 4   | 10000 | 5   | 1.303mb  | 170,732.398ops/s | ±1.09% |
| SpaceBench                  | 5   | 10000 | 5   | 1.303mb  | 87,901.966ops/s  | ±0.95% |
| SpaceBench                  | 6   | 10000 | 5   | 1.303mb  | 86,940.200ops/s  | ±1.85% |
| SpaceBench                  | 7   | 10000 | 5   | 1.303mb  | 65,796.456ops/s  | ±1.09% |
| CustomSmallerGeneratorBench | 0   | 10000 | 5   | 1.303mb  | 484,181.570ops/s | ±1.79% |
| CustomSmallerGeneratorBench | 1   | 10000 | 5   | 1.303mb  | 251,111.488ops/s | ±0.96% |
| CustomSmallerGeneratorBench | 2   | 10000 | 5   | 1.303mb  | 252,060.135ops/s | ±0.60% |
| CustomSmallerGeneratorBench | 3   | 10000 | 5   | 1.303mb  | 170,539.669ops/s | ±2.76% |
| CustomSmallerGeneratorBench | 4   | 10000 | 5   | 1.303mb  | 171,025.828ops/s | ±0.68% |
| CustomSmallerGeneratorBench | 5   | 10000 | 5   | 1.303mb  | 103,315.341ops/s | ±0.76% |
| CustomSmallerGeneratorBench | 6   | 10000 | 5   | 1.303mb  | 88,018.233ops/s  | ±0.20% |
| CustomSmallerGeneratorBench | 7   | 10000 | 5   | 1.303mb  | 59,290.129ops/s  | ±0.57% |
| ZooBench                    | 0   | 10000 | 5   | 3.361mb  | 480,600.934ops/s | ±1.17% |
| ZooBench                    | 1   | 10000 | 5   | 3.361mb  | 479,483.906ops/s | ±1.12% |
| ZooBench                    | 2   | 10000 | 5   | 3.361mb  | 477,642.008ops/s | ±0.79% |
| ZooBench                    | 3   | 10000 | 5   | 3.361mb  | 246,663.740ops/s | ±0.92% |
| ZooBench                    | 4   | 10000 | 5   | 3.361mb  | 250,284.497ops/s | ±1.32% |
| ZooBench                    | 5   | 10000 | 5   | 3.361mb  | 169,102.140ops/s | ±0.49% |
| ZooBench                    | 6   | 10000 | 5   | 3.361mb  | 170,171.602ops/s | ±0.68% |
| ZooBench                    | 7   | 10000 | 5   | 3.361mb  | 130,059.956ops/s | ±0.80% |
+-----------------------------+-----+-------+-----+----------+------------------+--------+

WSL2, Ubuntu / R9 5900x / PHP 8.1.5, xdebug ❌, opcache ✔

Each set lines up with a value from this array:

[ 9, 42, 100, 420, 1_000, 42_069, 1_000_000, 1_000_000_000 ]

For the most part the results here show that larger numbers, particularly in the billions, will be a bit slower to generate. All that seems pretty logical and easy to rationalize, since all generators seem equally affected by it.

However, the curious part of the results was finding that the "custom" generator with the smaller static wordset array is faster than the space generator with its larger number of words.

caendesilva commented 2 years ago

That's interesting!

For the most part the results here show that larger numbers, particularly in the billions, will be a bit slower to generate. All that seems pretty logical and easy to rationalize, since all generators seem equally affected by it.

I do think we can live with numbers in the billion range being a bit slower, as most users won't have billions of records.

However, the curious part of the results was finding that the "custom" generator with the smaller static wordset array is faster than the space generator with its larger number of words.

Hmm, that's odd; I'd be interested in why that is. (Probably) unrelated question: do the lengths of the words matter at all when it comes to performance/speed?

RobThree commented 2 years ago

most users won't have billions of records.

But the ones that do would need the extra performance even more 😉

As I explained here, I do like to tinker with this and see what can be done about performance, but I also think we shouldn't get hyper-focused on squeezing out every last CPU cycle for this. After all, it's still PHP. If we'd need bajillions of HumanoIDs to be generated or parsed in bulk for whatever reason, then we could trivially write this in faster languages (or ones that have application scopes) or just buy more hardware. I think there's barely ever a use case where we'd generate more than maybe a few hundred HumanoIDs in a single pageview. We should also keep in mind that not every use of HumanoIDs will be in web projects at all, which may skew requirements in a totally different direction.

However the curious part in the results was finding that the "custom" generator with the smaller static array of wordsets is faster than the space generator with a larger number of words.

I haven't seen the actual code (yet) so I can't comment on that (yet). But I'm gonna leave that for another day; I'm having fun and all (I really am), but it's 21:26 over here on a Saturday and I think it's time I unwind, have some quality time and a 🍺, or open a bottle of 🍾 🍷. And I suggest you gents do the same 😉 Cheers!

caendesilva commented 2 years ago

Take care of yourself Rob! There are no deadlines or time-pressures in open-source and we have gotten a ton of fun stuff done. Have a great evening! Cheers! 🥂

caendesilva commented 2 years ago

@RobThree and @mallardduck Just wanted to let you know I discovered something cool. The Laravel Cache facade actually supports caching of objects!

Not sure how much it helps in practice, since it may take some time to put and fetch data from the cache depending on the driver, but I thought it was fun.

public function constructHydeInstance(): void
{
    $this->hyde = Cache::get(Hyde::class) ?? new Hyde();
}

public function persist(): void
{
    Cache::store('file')->put(Hyde::class, $this->hyde);
}

I just tried it out, and so far it works surprisingly well for persisting a shared object between requests, since PHP otherwise appears to be stateless.

mallardduck commented 2 years ago

@caendesilva - yeah, that's potentially an option we could take in the Laravel library I've spun up. Ultimately it'd be worth testing more with benchmarks in that repo to see what provides the best results.

AFAIK this will just use serialize and then save the result to whatever cache store Laravel has configured. So with that in mind, it will be worth running benchmarks against multiple cache store backends. Better performance with Redis is hopefully to be expected; however, I might not expect better performance when using the File cache option.

mallardduck commented 6 months ago

So I finally had some free time and a neat idea to expand on the DX of this package. I had some ideas around the word dictionary system and how it could be more like its inspiration, which allows for reuse of the dictionary components so users can define custom ones.

The API I was inspired by is here - interestingly, the project has a new name, andreasonny83/unique-names-generator; I think the last time I looked at it for inspiration it was a-type/adjective-adjective-animal.

WIP PR here: https://github.com/RobThree/HumanoID/pull/16

RobThree commented 6 months ago

Hey @mallardduck, long time no see! I'll take a look tomorrow, but it sounds cool!