dotnet / linker

388 stars 127 forks source link

Tracking values in MarkStep for processing later leads to poor performance #3069

Closed jtschuster closed 1 year ago

jtschuster commented 1 year ago

https://github.com/dotnet/linker/issues/3067 lead to me discovering that we spend too much time in the linker repeating checks when the conditions may not have changed. For example, we track virtual methods in _virtual_methods and each iteration of the process pipeline, we check every override of every virtual method to see if something changed such that it should be marked.

We could modify the behavior to check the override method only when one of its conditions has changed. For example, when a type is marked, we check all virtual methods on the type for the other criteria for keeping a virtual method (e.g., the base method is marked). Then if a virtual method is marked, we'd also check all the overriding methods to see if their types are instantiated. If so, we mark them right away, if not, they will get marked when the declaring type is marked as instantiated.

It seems like this may have been the original way MarkStep worked but got modified over time by people who didn't know about the convention (me).

_typesWithInterfaces and_interfaceOverrides also can follow this logic to improve performance.

vitek-karas commented 1 year ago

Another option would be to do a somewhat bigger change (but ultimately very beneficial):

There would need to be some investigation done if we can use the dependency analysis in this "partial mode" and if it can be effective in terms of memory (for example we would probably want to avoid having a node for each marked method, but maybe it's not such a big deal).

The advantages:

vitek-karas commented 1 year ago

That said if all we care about is some perf improvements, I agree that breaking the delayed storage in mark step into the right "groups" should fix most of the problem.