DataObjects-NET / dataobjects-net

https://dataobjects.net
MIT License
60 stars 23 forks source link

Optimize IndexInfo.Columns to be Array instead of List<> #300

Closed SergeiPavlov closed 11 months ago

SergeiPavlov commented 1 year ago

If is immutable collection. No need to be List

This can save Megabytes of RAM: image

alex-kulakov commented 11 months ago

If we save some memory by not having lists for read-only collections than maybe we should also optimize IndexInfo.UpdateState()

First what I would do is check for length of lazy list. Having fields declared lazy is rare, majority of entities will have no such fields. Based on our test models. System and regular lists will contain items for sure. System columns will contain key columns + system columns, for now I see only one case when column is system - when column is TypeId. I assume system columns capacity in real life is from 2 to maybe 4 (1 TypeId column + 1-3 key fields), there could be some extreme cases with composite PK put chances are low.

Regular columns in worst scenario will have (columns.Count - 1) items, -1 is for Key column because it is obligatory, TypeId column is not always represented. To be safe we could declare capacity of regular columns list equal columns.Count.

I did four versions of ColumnIndexMap creation algorithm and benchmarked them on various combinations of 100 columns collection

Version 1 - with .ToArray() -

      var system = new List<int>();
      var lazy = new List<int>();
      var regular = new List<int>(columns.Count);

      for (int i = 0; i < columns.Count; i++) {
        var item = columns[i];
        if (item.IsPrimaryKey || item.IsSystem)
          system.Add(i);
        else {
          if (item.IsLazyLoad)
            lazy.Add(i);
          else
            regular.Add(i);
        }
      }

      ColumnIndexMap = new ColumnIndexMap(system.ToArray(), regular.ToArray(), lazy.ToArray());

Version 2 - previous version with empty array usage

      var system = new List<int>();
      var lazy = new List<int>();
      var regular = new List<int>(columns.Count);

      for (int i = 0; i < columns.Count; i++) {
        var item = columns[i];
        if (item.IsPrimaryKey || item.IsSystem)
          system.Add(i);
        else {
          if (item.IsLazyLoad)
            lazy.Add(i);
          else
            regular.Add(i);
        }
      }

      ColumnIndexMap = new ColumnIndexMap(
        (system.Count == 0) ? Array.Empty<int>() : system.ToArray(),
        (regular.Count == 0) ? Array.Empty<int>() : regular.ToArray(),
        (lazy.Count == 0) ? Array.Empty<int>() : lazy.ToArray());

Version 3 - a bit tricky, it uses array for regular columns and segment of array to cut off unfilled part of original array :)

      var system = new List<int>();
      var lazy = new List<int>();
      var regular = new int[columns.Count];

      int regularIndex = 0;

      for (int i = 0; i < columns.Count; i++) {
        var item = columns[i];
        if (item.IsPrimaryKey || item.IsSystem)
          system.Add(i);
        else {
          if (item.IsLazyLoad)
            lazy.Add(i);
          else {
            regular[regularIndex] = i;
            regularIndex++;
          }
        }
      }

      return new ColumnIndexMap(
        (system.Count == 0) ? Array.Empty<int>() : system.ToArray(),
        (regularIndex == 0) ? Array.Empty<int>() : new ArraySegment<int>(regular, 0, regularIndex - 1),
        (lazy.Count == 0) ? Array.Empty<int>() : lazy.ToArray());

Most realistic scenarios show some benefits for all options above. What do you think?

Method Categories Mean Error StdDev Ratio RatioSD Gen 0 Allocated
Current05 OneKeyNoSystemOrLazy 1,039.9 ns 4.13 ns 3.87 ns baseline 0.2728 1,288 B
ArraysBasedOnLists05 OneKeyNoSystemOrLazy 958.7 ns 2.99 ns 2.65 ns -8% 0.4% 0.2155 1,016 B
EmptyArrayOptimization05 OneKeyNoSystemOrLazy 876.1 ns 3.39 ns 3.17 ns -16% 0.5% 0.2155 1,016 B
EmptyArrayPlusSegment05 OneKeyNoSystemOrLazy 803.6 ns 1.28 ns 1.14 ns -23% 0.4% 0.1249 592 B
Current06 TwoKeyNoSystemOrLazy 1,002.1 ns 5.84 ns 5.46 ns baseline 0.2728 1,288 B
ArraysBasedOnLists06 TwoKeyNoSystemOrLazy 915.5 ns 2.41 ns 2.25 ns -9% 0.4% 0.2136 1,008 B
EmptyArrayOptimization06 TwoKeyNoSystemOrLazy 954.0 ns 11.67 ns 10.91 ns -5% 1.3% 0.2136 1,008 B
EmptyArrayPlusSegment06 TwoKeyNoSystemOrLazy 780.1 ns 1.97 ns 1.74 ns -22% 0.6% 0.1249 592 B
Current07 ThreeKeyNoSystemOrLazy 998.7 ns 10.81 ns 10.11 ns baseline 0.2728 1,288 B
ArraysBasedOnLists07 ThreeKeyNoSystemOrLazy 940.3 ns 6.94 ns 6.49 ns -6% 1.3% 0.2155 1,016 B
EmptyArrayOptimization07 ThreeKeyNoSystemOrLazy 884.7 ns 11.01 ns 10.30 ns -11% 1.7% 0.2155 1,016 B
EmptyArrayPlusSegment07 ThreeKeyNoSystemOrLazy 760.3 ns 1.81 ns 1.61 ns -24% 0.9% 0.1268 600 B
Current08 FourKeyNoSystemOrLazy 1,010.9 ns 6.32 ns 5.28 ns baseline 0.2728 1,288 B
ArraysBasedOnLists08 FourKeyNoSystemOrLazy 909.9 ns 2.58 ns 2.42 ns -10% 0.4% 0.2136 1,008 B
EmptyArrayOptimization08 FourKeyNoSystemOrLazy 929.8 ns 10.29 ns 9.62 ns -8% 0.8% 0.2136 1,008 B
EmptyArrayPlusSegment08 FourKeyNoSystemOrLazy 779.8 ns 1.24 ns 1.03 ns -23% 0.5% 0.1268 600 B
Current09 OneKeyOneSystemNoLazy 978.1 ns 3.76 ns 3.33 ns baseline 0.2728 1,288 B
ArraysBasedOnLists09 OneKeyOneSystemNoLazy 914.4 ns 3.40 ns 3.18 ns -6% 0.4% 0.2136 1,008 B
EmptyArrayOptimization09 OneKeyOneSystemNoLazy 951.0 ns 4.93 ns 4.37 ns -3% 0.6% 0.2136 1,008 B
EmptyArrayPlusSegment09 OneKeyOneSystemNoLazy 801.0 ns 2.17 ns 1.93 ns -18% 0.4% 0.1249 592 B
Current10 TwoKeyOneSystemNoLazy 979.8 ns 1.88 ns 1.67 ns baseline 0.2728 1,288 B
ArraysBasedOnLists10 TwoKeyOneSystemNoLazy 904.4 ns 3.37 ns 3.15 ns -8% 0.4% 0.2155 1,016 B
EmptyArrayOptimization10 TwoKeyOneSystemNoLazy 943.4 ns 1.49 ns 1.40 ns -4% 0.2% 0.2155 1,016 B
EmptyArrayPlusSegment10 TwoKeyOneSystemNoLazy 786.0 ns 6.58 ns 5.83 ns -20% 0.8% 0.1268 600 B
Current11 ThreeKeyOneSystemNoLazy 977.1 ns 2.97 ns 2.63 ns baseline 0.2728 1,288 B
ArraysBasedOnLists11 ThreeKeyOneSystemNoLazy 922.3 ns 2.20 ns 2.05 ns -6% 0.4% 0.2136 1,008 B
EmptyArrayOptimization11 ThreeKeyOneSystemNoLazy 883.8 ns 3.91 ns 3.26 ns -10% 0.5% 0.2136 1,008 B
EmptyArrayPlusSegment11 ThreeKeyOneSystemNoLazy 781.7 ns 3.28 ns 3.06 ns -20% 0.5% 0.1268 600 B
Current12 FourKeyOneSystemNoLazy 1,012.2 ns 6.85 ns 6.07 ns baseline 0.2842 1,344 B
ArraysBasedOnLists12 FourKeyOneSystemNoLazy 922.3 ns 3.33 ns 3.11 ns -9% 0.6% 0.2270 1,072 B
EmptyArrayOptimization12 FourKeyOneSystemNoLazy 899.8 ns 3.03 ns 2.83 ns -11% 0.7% 0.2270 1,072 B
EmptyArrayPlusSegment12 FourKeyOneSystemNoLazy 800.4 ns 1.48 ns 1.38 ns -21% 0.6% 0.1402 664 B
Current13 OneKeyOneSystemOneLazy 978.9 ns 2.47 ns 2.19 ns baseline 0.2804 1,328 B
ArraysBasedOnLists13 OneKeyOneSystemOneLazy 954.3 ns 1.41 ns 1.25 ns -3% 0.3% 0.2289 1,080 B
EmptyArrayOptimization13 OneKeyOneSystemOneLazy 980.5 ns 3.79 ns 3.36 ns +0% 0.4% 0.2289 1,080 B
EmptyArrayPlusSegment13 OneKeyOneSystemOneLazy 808.2 ns 1.82 ns 1.61 ns -17% 0.3% 0.1402 664 B
Current14 OneKeyOneSystemTwoLazy 1,009.4 ns 4.36 ns 4.08 ns baseline 0.2804 1,328 B
ArraysBasedOnLists14 OneKeyOneSystemTwoLazy 970.2 ns 2.62 ns 2.32 ns -4% 0.5% 0.2270 1,072 B
EmptyArrayOptimization14 OneKeyOneSystemTwoLazy 977.4 ns 10.60 ns 9.92 ns -3% 1.1% 0.2270 1,072 B
EmptyArrayPlusSegment14 OneKeyOneSystemTwoLazy 846.6 ns 1.92 ns 1.70 ns -16% 0.4% 0.1402 664 B
Current15 OneKeyOneSystemThreeLazy 1,001.0 ns 2.63 ns 2.33 ns baseline 0.2804 1,328 B
ArraysBasedOnLists15 OneKeyOneSystemThreeLazy 910.5 ns 3.34 ns 3.12 ns -9% 0.4% 0.2289 1,080 B
EmptyArrayOptimization15 OneKeyOneSystemThreeLazy 981.6 ns 3.09 ns 2.74 ns -2% 0.4% 0.2289 1,080 B
EmptyArrayPlusSegment15 OneKeyOneSystemThreeLazy 844.8 ns 2.90 ns 2.72 ns -16% 0.4% 0.1421 672 B
Current16 OneKeyOneSystemFourLazy 1,015.0 ns 2.92 ns 2.59 ns baseline 0.2804 1,328 B
ArraysBasedOnLists16 OneKeyOneSystemFourLazy 956.8 ns 2.79 ns 2.47 ns -6% 0.3% 0.2270 1,072 B
EmptyArrayOptimization16 OneKeyOneSystemFourLazy 990.2 ns 7.24 ns 6.42 ns -2% 0.7% 0.2270 1,072 B
EmptyArrayPlusSegment16 OneKeyOneSystemFourLazy 832.0 ns 1.68 ns 1.49 ns -18% 0.4% 0.1421 672 B
alex-kulakov commented 11 months ago

I can do it by myself, I need only your opinion.

SergeiPavlov commented 11 months ago

The performance looks nearly the same As for me, it is better to choose method with minimal heap overhead