perf: make model creation faster

coolsoftwaretyler commented 8 months ago

What does this PR do and why?

In the reduce function inside toPropertiesObject, we've been creating a new object and merging it with props for every iteration. This might be a bottleneck if props has a lot of keys. Instead, we can directly assign the new property to props and avoid the object creation and merging.

We've also been using an object to check for duplicate keys. This requires converting the keys to strings. Using a Set should avoid that issue, and IMO is a little nicer to read.

I believe this improves the speed of model creation above the margin of error of my benchmarks. I used this performance testing tool to check the performance of MST 5.3.0 and this changeset running in node. I haven't yet checked in web or bun. Here's a few relevant test results:

5.3.0

scenario,ops_per_sec,margin_of_error,runs,max_memory_used_kb
Create 1 model,16255.43556299085,2.0149614245219034,88,280432
Create 10 models,1726.2679949268702,2.1457973152575627,88,326160
Create 100 models,180.8861705629388,1.5603275496951987,84,3003376
Create 1,000 models,15.839075385738038,2.633378708118982,44,12550480
Create 10,000 models,1.5986554744732953,3.055960171640972,8,12361216
Create 100,000 models,0.16010535102193169,7.319537180777344,5,7737616

This branch

scenario,ops_per_sec,margin_of_error,runs,max_memory_used_kb
Create 1 model,16129.00654121993,2.5076336085120166,84,290096
Create 10 models,1859.7293521665006,1.8005317075623681,93,345688
Create 100 models,188.2793706606448,1.057293962845895,88,2921632
Create 1,000 models,18.71691771032261,1.1157484919305993,51,11970936
Create 10,000 models,1.8402872820948037,2.0327187905280706,9,7199456
Create 100,000 models,0.18820209480049743,0.6963479413567704,5,2630904

Comparison

One model at a time: there's no real difference here, the operations per second seem to be within the margin of error, but this makes sense to me. benchmark.js does something under the hood to prevent the engine from making optimizations, so individual model creation never gets optimized test-to-test.
10 models at a time: sees a 7.7% improvement in ops/sec, which is above the margin of error in both cases
100 models at a time: sees a 4% improvement in ops/sec, also above the margin of error in both cases.
1,000 models at a time: 18% improvement. I think this may be inflated because the raw number is smaller than the 1/10/100 case. There are fewer runs. As a counterpoint: it may actually be the case that this change shows up in a bigger way as you create more models. I'm not certain either way, and not sure how to check one way or another yet.
10,000 models at a time: 15% improvement. Since this is similar to the 1k scenario, I'm hopeful that my hypothesis in 1k is true. At the very least, some kind of consistent behavior happens at this scale (either a consistent false positive, or a consistent improvement).
100,000 models at a time: 17.5% improvement. Again, this is either a consistent fluke or a promising trend that holds in these higher numbers.

Steps to validate locally

If you don't already have it, go clone mst-performance-testing on your machine and run the tests. You'll get some local CSVs in the results/ directory.
Pull this branch to your machine in a separate repository. Run npm run build, then npm link in mobx-state-tree.
Go to the mst-performance-testing directory and run npm link mobx-state-tree.
Run the tests again. They'll be timestamped, but will probably label themselves as 5.3.0 as well because of the way npm linking happens.
Compare the results. The performance library runs on your own hardware, so any number of environmental factors could influence these results, I think. I'd love to have additional verification.
If possible, link this branch into any projects you have that run MST and see if other metrics improve as well.

Minor note

I changed around some comments with regard to personal preference. I do not feel strongly about those changes.

coolsoftwaretyler commented 8 months ago

FWIW, I don't really trust the memory usage measurement, so I wouldn't read too much into that, but it's nice that they are overall trending downwards. I would be curious to know if the trendline looks the same for other folks.

coolsoftwaretyler commented 8 months ago

I also published a release candidate tag for mobx-state-tree@5.3.1-alpha.1 if anyone out there wants to drop this into their current codebase and see how it goes: https://www.npmjs.com/package/mobx-state-tree/v/5.3.1-alpha.1

coolsoftwaretyler commented 8 months ago

This is not very rigorously measured, but in my production app (React Native using Hermes) we load about 8,000 MST objects into memory. In the past, this has taken ~3-4 seconds (we do some other operations on it that are expensive during a loading sequence).

With this change, I'm seeing 0.8-1.2 seconds.

We've made some other changes since I last measured it, so I don't think that's all because of this PR, but I'm starting to feel good about these improvements being real.

coolsoftwaretyler commented 8 months ago

Since we've had a little review, talked about it in Slack, and have a release candidate out for this, I'm going to merge this into master now and see if we can get some more real world usage data with the 5.3.1-alpha.1 release

coolsoftwaretyler commented 8 months ago

I was also messing around with VS Code debugger and fixed a path that is no longer correct.

chakravarthi-bb commented 8 months ago

this is nice, sorry missed the PR

mobxjs / mobx-state-tree