dubiousconst282 / DistIL

Post-build IL optimizer and intermediate representation for .NET programs
MIT License
116 stars 1 forks source link

Pre-resize lists with Add() calls inside loops with known bound #31

Open dubiousconst282 opened 9 months ago

dubiousconst282 commented 9 months ago

Lists are often instantiated and immediately filled with items from a source having a known length, and it isn't always convenient to explicitly initialize the list with a capacity:

var list = new List<TypeDesc>();

foreach (var itfHandle in typeInfo.GetInterfaceImplementations()) { // `C : IReadOnlyCollection<T>`
    // ...
    list.Add(...);
}
return list;

This could be rewritten to a constructor call with the known capacity, or to a later call to EnsureCapacity(). The first is only possible if there aren't any side effects in between the ctor and source def.

This can be naturally generalized to lists that aren't freshly created. must make sure that calls to EnsureCapacity() always double the list's capacity to avoid +1 resizes that lead to O(n^2) complexity. (This is guaranteed by list.EnsureCapacity(list.Count + addCount))


There's another case where the items are added inline, and an upper-bound could be determined by counting the number of Add() calls instead of a loop bound.

var list = new List<Item>();

if (condA) list.Add(itemA);
if (condB) list.Add(itemB);
if (condC) list.Add(itemC);
// ...

return list;

Note that Add() calls behind if conditions are problematic because we don't know anything about them, so I'm not sure it would be a good idea to presize based on a static probability factor as that could end up pessimizing the code by allocating extra memory (but could still be worth for non-escaping lists with an ToArray() call at the end?).

Worklist

neon-sunset commented 9 months ago

If the bound is known, the list can be pre-sized and then instead of .Add() it could be written as

var span = CollectionsMarshal.AsSpan(list);
/* for ... */
CollectionsMarshal.SetCount(list, count);

(in .NET 8)