TurnerSoftware / MongoFramework

An "Entity Framework"-like interface for MongoDB
MIT License
392 stars, 35 forks

Improve change tracking performance #369

Open Turnerj opened 1 year ago

Turnerj commented 1 year ago

Fixes #362

When adding a very large number of entities to the change tracker, it performed an extreme number of calls to GetValue on the PropertyInfo. A single call isn't typically slow, but it becomes very slow when called as often as it was.
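One common way to cut the cost of repeated reflection calls is to compile a getter delegate once per property and reuse it. The sketch below illustrates that general technique only; it is not necessarily what this PR does. The names CachedGetters and SampleEntity are hypothetical and not part of MongoFramework, and the sketch assumes instance (non-static) properties.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical helper for illustration; not part of MongoFramework's API.
public static class CachedGetters
{
    private static readonly ConcurrentDictionary<PropertyInfo, Func<object, object>> Cache =
        new ConcurrentDictionary<PropertyInfo, Func<object, object>>();

    // Returns a compiled delegate that reads the given instance property.
    // The delegate is built once per PropertyInfo and reused afterwards.
    public static Func<object, object> For(PropertyInfo property)
    {
        return Cache.GetOrAdd(property, p =>
        {
            var instance = Expression.Parameter(typeof(object), "instance");
            // Builds: (object)((TDeclaring)instance).Property
            var typedInstance = Expression.Convert(instance, p.DeclaringType);
            var propertyAccess = Expression.Property(typedInstance, p);
            var boxedResult = Expression.Convert(propertyAccess, typeof(object));
            return Expression.Lambda<Func<object, object>>(boxedResult, instance).Compile();
        });
    }
}

// Hypothetical entity type used only to exercise the helper.
public class SampleEntity
{
    public string Id { get; set; }
}
```

Calling CachedGetters.For(property) and invoking the returned delegate yields the same value as property.GetValue(entity), but the reflection work happens only on the first call for each property.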

This PR contains two primary changes to improve the performance of the existing code


Up to 98% faster for setting the entity state.

Before

| Method | EntryCount | Mean | Error | StdDev | Allocated |
|---|---|---|---|---|---|
| SetEntityState | 100 | 352.7 us | 6.66 us | 6.84 us | 2 B |
| SetEntityState | 1000 | 32,571.1 us | 161.22 us | 134.62 us | 30 B |
| SetEntityState | 10000 | 3,402,924.9 us | 45,554.20 us | 40,382.61 us | 3656 B |

After

| Method | EntryCount | Mean | Error | StdDev | Allocated |
|---|---|---|---|---|---|
| SetEntityState | 100 | 22.62 us | 0.450 us | 0.645 us | - |
| SetEntityState | 1000 | 911.66 us | 17.737 us | 18.979 us | 1 B |
| SetEntityState | 10000 | 82,002.73 us | 1,611.556 us | 3,104.921 us | 69 B |

Turnerj commented 1 year ago

Further improvements are possible based on the scenario in #362. A big problem with the implementation when setting multiple entities is that the number of checks via GetEntry grows with every extra entity.

That is to say, you add 10 items to an empty entity container:

We can avoid a bunch of checks by not checking the items we're adding against themselves while we're adding them. So if someone were to add 500,000 items, we don't actually need to check them against each other.
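A rough sketch of that idea, using a simplified stand-in rather than MongoFramework's real container: SketchEntryContainer, its Comparisons counter, and the linear scan are all assumptions made for illustration. The batch method snapshots the entry count before the loop and only checks new items against entries that existed before the call.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for an entity entry container.
public class SketchEntryContainer
{
    private readonly List<object> entries = new List<object>();

    // Counts reference comparisons so the two strategies can be compared.
    public long Comparisons { get; private set; }

    public int Count => entries.Count;

    // Linear scan over the first `limit` entries (assumed cost model).
    private bool Contains(object entity, int limit)
    {
        for (var i = 0; i < limit; i++)
        {
            Comparisons++;
            if (ReferenceEquals(entries[i], entity)) return true;
        }
        return false;
    }

    // One at a time: each new entity is checked against everything added so far.
    public void Add(object entity)
    {
        if (!Contains(entity, entries.Count)) entries.Add(entity);
    }

    // Batch: new entities are checked only against entries that existed
    // before the call, never against each other.
    public void AddRange(IEnumerable<object> entities)
    {
        var existingCount = entries.Count;
        foreach (var entity in entities)
        {
            if (!Contains(entity, existingCount)) entries.Add(entity);
        }
    }
}
```

Under this cost model, adding ten distinct items to an empty container one at a time performs 0 + 1 + ... + 9 = 45 comparisons, while the batch path performs none. The trade-off is that the batch path no longer detects the same reference appearing twice within one call.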

That said, without doing the checks, it is possible for someone to intentionally double up the exact same reference entity in the database (e.g. the enumerable has the same entity twice). This could perhaps be avoided by chucking the whole thing in a HashSet, but that will cause a large number of allocations, especially for the case with 500,000 entities being added.
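For reference, the HashSet approach could look roughly like the sketch below. BatchDedup and DistinctByReference are hypothetical names, and the sketch assumes .NET 5 or later for ReferenceEqualityComparer. It filters repeated references out of a batch before they reach the container.

```csharp
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of MongoFramework's API.
public static class BatchDedup
{
    // Yields each distinct reference once, in first-seen order.
    // Uses reference equality, so two different objects with equal
    // property values are both kept.
    public static IEnumerable<T> DistinctByReference<T>(IEnumerable<T> entities) where T : class
    {
        // Assumes .NET 5+ where ReferenceEqualityComparer is available.
        var seen = new HashSet<object>(ReferenceEqualityComparer.Instance);
        foreach (var entity in entities)
        {
            // HashSet.Add returns false when the reference was already seen.
            if (seen.Add(entity)) yield return entity;
        }
    }
}
```

The set ends up holding one slot per unique entity, which is exactly the allocation cost raised above: for 500,000 entities the set grows to 500,000 entries even when no duplicates exist.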

Turnerj commented 1 year ago

I would like to work out a good/better solution for adding many entries at once without a large number of allocations while addressing the underlying performance problem before merging this fix.