dotnet / msbuild

The Microsoft Build Engine (MSBuild) is the build platform for .NET and Visual Studio.
https://docs.microsoft.com/visualstudio/msbuild/msbuild
MIT License
5.2k stars 1.35k forks source link

consider removing verifythrow(..) strings #8991

Open danmoseley opened 1 year ago

danmoseley commented 1 year ago

These are un-localized strings that appear in the InternalErrorException message if we hit a bug in ourselves. Roughly across all the assemblies, they come to about 50-60KB even deduplicated:

https://gist.github.com/danmoseley/d29a91c87fa8617de4f2963c5160b236

The message is rarely if ever useful to anyone that isn't a developer in this repo.

Suggestion: replace these messages with a unique integer. Then instead of say

This is an unhandled exception in MSBuild -- PLEASE UPVOTE AN EXISTING ISSUE OR FILE A NEW ONE AT https://aka.ms/msbuild/unhandled.
MSB0001: Internal MSBuild Error: Assuming 1-to-1 mapping between configs and results. Otherwise it means the caches are either not minimal or incomplete

which means nothing to customers, they would get eg

This is an unhandled exception in MSBuild -- PLEASE UPVOTE AN EXISTING ISSUE OR FILE A NEW ONE AT https://aka.ms/msbuild/unhandled.
MSB0001: Internal MSBuild error with code MSB0001-1234

... which is equally stable and if anything is MORE searchable.

Assuming the deprecated engine bits aren't touched, this would save about 30-40KB on disk. Equally valuable, it would make it clearer which strings are resource strings. Right now half of the VerifyThrow's are passed resource names and half are passed these unlocalized messages and we've repeatedly mixed them up. If we remove the latter, things become clearer: everything is a resource string.

KalleOlaviNiemitalo commented 1 year ago

How can the uniqueness of the integers be tested? A Roslyn analyzer would not be able to detect conflicts between separate assemblies. An enum type would consume space for the names although perhaps less than the strings. Would it be possible to postprocess the assemblies to eliminate the fields of the enum and keep just the type?

danmoseley commented 1 year ago

A simple Python script run over the sources. Plus a baseline file that includes any "burned" ones (already shipped but later removed)

KalleOlaviNiemitalo commented 1 year ago

ConstantExpectedAttribute from https://github.com/dotnet/runtime/issues/33771 might be useful for flagging calls in which the integer is not in a format that the script would recognize.

rokonec commented 1 year ago

MSBuild team triage: We are not sure about impact here. In theory opt-prof should optimize it, and usually do not load these pages into memory. We do not consider 40KB disk space big enough comparing to cost of implementation and maintenance.

danmoseley commented 1 year ago

fair enough. some more ideas