dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

JIT: possibly revisit how we set assertion prop table size #10591

Open AndyAyersMS opened 6 years ago

AndyAyersMS commented 6 years ago

Some data on how often we drop assertions (from investigations into dotnet/runtime#10461, via pmi over fx):

| Table Size | Num Methods | Avg Assertion Count | Methods with Dropped Assertions |
| ---: | ---: | ---: | ---: |
| 64 | 159493 | 4.6 | 222 |
| 128 | 1801 | 42.3 | 28 |
| 256 | 237 | 81.2 | 4 |

Overall this doesn't seem too bad -- only about 0.16% of methods lost assertions from table size limits. I unfortunately didn't track how many of the size 64 cases were ones where we deliberately limited the size for throughput reasons.

Table size is determined by looking at compILCodeSize, which does not take code added by inlining into account (among other things). So it seems like we could perhaps find a better size predictor.
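For illustration, a bin-based sizing heuristic keyed on IL size might look like the sketch below. The function name, thresholds, and bin sizes are hypothetical, chosen only to mirror the 64/128/256 bins from the table above; they are not the JIT's actual values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: pick an assertion table size from IL code size.
// The bins (64/128/256) match the existing table sizes; the IL-size
// thresholds are illustrative guesses, not the JIT's actual cutoffs.
static uint32_t ChooseAssertionTableSize(uint32_t ilCodeSize)
{
    if (ilCodeSize <= 128)
    {
        return 64; // small methods: average assertion count is ~4.6
    }
    if (ilCodeSize <= 1024)
    {
        return 128;
    }
    return 256; // current cap; larger methods may still drop assertions
}
```

The weakness this issue points out is that `ilCodeSize` here is the pre-inlining size, so a small method that inlines aggressively can land in the 64 bin and drop assertions.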

Or perhaps just allow the table to grow?

Here's a breakdown of how many methods would fit into our existing size bins:

| Assertion Count | Methods |
| ---: | ---: |
| 0 - 64 | 161835 |
| 65 - 128 | 585 |
| 129 - 256 | 104 |
| more than 256 | 21 |

Max number of assertions seen was 615 (in a method that used a size 64 table), in Microsoft.CodeAnalysis.VisualBasic.VisualBasicCommandLineParser:Parse(ref,ref,ref,ref):ref:this. So 551 of these were dropped.

There is obviously some consideration of the overall cost of the subsequent dataflow analysis too, which is roughly proportional to max assertion count * number of blocks. So if we allow the max assertion count to scale up with the number of blocks we potentially see some quadratic behavior. But this seems perhaps solvable too -- we generate all (or more) assertions initially, and then limit the ones we use only if we see bad scaling.

This might let us also try and prioritize which assertions should survive, though I don't have any particular idea yet how to formulate that (something like the value number equivalent of weighted ref count usage ...?).

Note that today we actually generate all assertions. Assertion gen doesn't early-out once the table is full (though perhaps it should); it just stops recording assertions once the table fills up. So we are already paying the cost of finding all these assertions, and the table size itself is not a burden.

So a rough proposal would be: find and record all assertions, allowing the table to grow larger (perhaps with some size limit); have a budget based on blocks * assertions; and, if over budget, start trimming assertions somehow to get under budget.
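The trimming step of this proposal could be sketched as below. Everything here is hypothetical: the `Assertion` struct, the `priority` field (standing in for some usefulness estimate, e.g. the weighted-ref-count idea mentioned later), and the budget formula are assumptions for illustration, not existing JIT code.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical assertion record; "priority" stands in for some estimate
// of how useful the assertion is likely to be during propagation.
struct Assertion
{
    uint32_t id;
    uint32_t priority;
};

// Sketch of the proposal: keep all generated assertions unless the
// dataflow cost (roughly blocks * assertions) exceeds a budget, then
// drop the lowest-priority assertions until back under budget.
static void TrimToBudget(std::vector<Assertion>& table, uint32_t numBlocks, uint32_t budget)
{
    // Sort so the highest-priority assertions are kept first.
    std::sort(table.begin(), table.end(),
              [](const Assertion& a, const Assertion& b) { return a.priority > b.priority; });

    while (!table.empty() && numBlocks * table.size() > budget)
    {
        table.pop_back(); // drop the current lowest-priority assertion
    }
}
```

With, say, 10 blocks and a budget of 50, a table of 10 assertions would be trimmed to the 5 highest-priority entries; a method under budget keeps everything, which preserves today's behavior for the ~99.8% of methods that never overflow.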

category:implementation theme:assertion-prop skill-level:intermediate cost:medium impact:small

mikedn commented 6 years ago

Should the number of assertions that are actually used be looked at as well? I suppose it's quite possible that we generate more assertions than needed.

FWIW I started a RangeCheck prototype that does not use existing assertions but rather extracts them on demand. I need to go back to it to see what it would take to bring it to a reasonable functionality level to allow some measurements.

AndyAyersMS commented 6 years ago

Started to look at how many assertions are used with some ad-hoc instrumentation [raw data]. Here's a summary plot. The X axis is the number of assertions created; Y is the number used. Box-and-whisker marks show the distribution of Y for various X's. Red lines give the 0.50, 0.90, and 0.95 linear quantile estimates.

Upshot is that typically about 1 in 10 assertions gets used; 95th-percentile usage is about 1 in 5. There are some outliers, but overall across corelib the maximum number used was 22 (out of 161).

1551 of the 28649 methods used all their assertions, but none of those cases generated more than 5 assertions.

So it would appear most assertions are not useful (at least to assertion prop).

[plot: assertions created (X) vs. assertions used (Y)]

erozenfeld commented 6 years ago

@dotnet/jit-contrib

kunalspathak commented 1 year ago

Some related work done in https://github.com/dotnet/runtime/pull/73035 and https://github.com/dotnet/runtime/pull/72700