Open michaelwoerister opened 8 years ago
For `-msvc`, just make each constant a COMDAT, and `link`'s `/OPT:REF,ICF` will remove unused constants and deduplicate them as well.
This would impact compile times as well, I believe; so perhaps I-compiletime?
In fact, if I'm interpreting correctly, this means that all `x[foo]` and `x + 10` expressions increase the final executable/library size and also give LLVM more code to chew on.
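To make the cost concrete, here is a minimal sketch showing that each such expression carries its own source location. The helper `panic_location` and the static `LAST_LOCATION` are names made up for this illustration; only the `std::panic` calls are real APIs. Note that the overflow check on something like `x + 10` is only emitted when overflow checks are enabled (the default in debug builds), so the sketch uses indexing and division, which are always checked.

```rust
use std::panic;
use std::sync::Mutex;

// Last panic location seen by the hook, as (file, line, column).
// Hypothetical helper state, only for this illustration.
static LAST_LOCATION: Mutex<Option<(String, u32, u32)>> = Mutex::new(None);

// Indexing compiles to a bounds check whose failure path carries a
// compiler-generated constant with this expression's file/line/column.
fn index(xs: &[i32], i: usize) -> i32 {
    xs[i]
}

// Division gets a divide-by-zero check with its own location constant.
fn divide(a: i32, b: i32) -> i32 {
    a / b
}

// Runs `f` and returns the source location recorded if it panicked.
fn panic_location<F: FnOnce() + panic::UnwindSafe>(f: F) -> Option<(String, u32, u32)> {
    panic::set_hook(Box::new(|info| {
        let loc = info
            .location()
            .map(|l| (l.file().to_string(), l.line(), l.column()));
        *LAST_LOCATION.lock().unwrap() = loc;
    }));
    *LAST_LOCATION.lock().unwrap() = None;
    let _ = panic::catch_unwind(f);
    // Drop our hook again so later panics use the default one.
    let _ = panic::take_hook();
    LAST_LOCATION.lock().unwrap().take()
}
```

Calling `panic_location(|| { index(&[1, 2, 3], 7); })` and `panic_location(|| { divide(1, 0); })` yields two different locations, one per checked expression, which is the per-expression data being discussed here.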
It seems that we already check whether a string constant with the given contents exists, so file names and error message text are not duplicated. Only the constant struct containing the pointers to the file name and error message, plus the line number, is duplicated. This could be optimized by generating (error message, file name) pairs, so that the duplicated struct contains one pointer instead of two. We could also store everything in one big array and use 32-bit indices instead of full pointers to reduce size further.
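A rough sketch of that layout, as a plain Rust data structure rather than anything the compiler actually does: strings are interned once, each (error message, file name) pair is interned once, and the per-expression record shrinks to a 32-bit pair index plus the line number. `PanicTable`, `PanicSite`, and their methods are hypothetical names for this illustration.

```rust
use std::collections::HashMap;

// Per-expression record: one 32-bit index into the pair table plus the line.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct PanicSite {
    pair: u32,
    line: u32,
}

// Hypothetical table holding each string and each (message, file) pair once.
#[derive(Default)]
struct PanicTable {
    strings: Vec<String>,
    string_ids: HashMap<String, u32>,
    pairs: Vec<(u32, u32)>, // (message id, file id)
    pair_ids: HashMap<(u32, u32), u32>,
}

impl PanicTable {
    // Returns the index of `s`, adding it only if it is not present yet.
    fn intern_str(&mut self, s: &str) -> u32 {
        if let Some(id) = self.string_ids.get(s).copied() {
            return id;
        }
        let id = self.strings.len() as u32;
        self.strings.push(s.to_string());
        self.string_ids.insert(s.to_string(), id);
        id
    }

    // Builds the small per-expression record for one panic site.
    fn site(&mut self, message: &str, file: &str, line: u32) -> PanicSite {
        let key = (self.intern_str(message), self.intern_str(file));
        let pair = match self.pair_ids.get(&key).copied() {
            Some(id) => id,
            None => {
                let id = self.pairs.len() as u32;
                self.pairs.push(key);
                self.pair_ids.insert(key, id);
                id
            }
        };
        PanicSite { pair, line }
    }

    // Looks up (message, file, line) for a record.
    fn resolve(&self, site: PanicSite) -> (&str, &str, u32) {
        let (m, f) = self.pairs[site.pair as usize];
        (&self.strings[m as usize], &self.strings[f as usize], site.line)
    }
}
```

With this shape, two overflow checks in the same file share one pair entry and differ only in `line`, and each `PanicSite` is 8 bytes regardless of pointer width.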
This may be worth investigating for code size improvements.
Cc @rust-lang/wg-codegen
Some kinds of expressions can generate a runtime panic with a compiler-generated error message containing the source location of that expression. Examples are arithmetic expressions that can cause integer overflow or division by zero, and expressions that result in array bounds checks.
So far we are allocating string constants in the same codegen unit as the expression, which has two disadvantages:
- Because we do not check whether there is already a constant for that source location, we will have copies of the same data for each monomorphized instance of a function. (Or does LLVM merge equal constants if their address is never taken?)
- Since the source location is contained in machine code, the whole object file often has to be recompiled during incremental compilation even if just formatting has changed or comments have been added.
This could be solved by interning all those constants into a separate object file with a (semi-)stable symbol name (e.g. symbol_name = function_symbol_name + index within function). That way, only the error-message-object-file has to be regenerated when nothing but formatting has changed.
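The naming scheme can be sketched as follows. This is only an illustration of the idea, not compiler code: the `.panic_loc.` infix, the `Location` struct, and all function names are assumptions made up for the example. The point is that the function's code refers to symbols only, so a formatting change alters the location object but not the code that references it.

```rust
use std::collections::BTreeMap;

#[derive(Clone, Debug, PartialEq, Eq)]
struct Location {
    file: String,
    line: u32,
    column: u32,
}

// symbol_name = function_symbol_name + index within function.
// The ".panic_loc." separator is a made-up convention for this sketch.
fn location_symbol(function_symbol: &str, index_in_function: u32) -> String {
    format!("{}.panic_loc.{}", function_symbol, index_in_function)
}

// Contents of the separate "error message" object: symbol -> location data.
fn build_location_object(
    function_symbol: &str,
    sites: &[Location],
) -> BTreeMap<String, Location> {
    sites
        .iter()
        .enumerate()
        .map(|(i, loc)| (location_symbol(function_symbol, i as u32), loc.clone()))
        .collect()
}

// What the function's machine code would reference: symbol names only,
// independent of the actual line and column numbers.
fn code_references(function_symbol: &str, site_count: u32) -> Vec<String> {
    (0..site_count)
        .map(|i| location_symbol(function_symbol, i))
        .collect()
}
```

If a comment is added above a function and its panic sites move down a line, `build_location_object` produces different contents, while `code_references` returns the same list, so under this scheme only the location object would need regenerating.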