Manishearth opened this issue 6 years ago
I have two theories/ideas for heuristics (without investigating anything first): one relates to how long a `memcpy` would take to copy one chunk, of whatever best-case-scenario size it can support, and that best case might be related to platform SIMD sizes.

cc @rkruppe
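As a very rough illustration of the kind of heuristic this points at, here is a sketch in plain Rust. The function name, the `4 *` chunk factor, and the `* 8` size-ratio threshold are all invented for the example, not taken from the issue or from any measurement:

```rust
/// Hypothetical heuristic (not from the issue): decide whether a
/// branch-per-variant copy is likely to beat one full-size memcpy.
/// `simd_chunk_bytes` stands in for the widest block the platform's
/// memcpy can move at once (e.g. 16 or 32 bytes on x86_64).
fn branchy_copy_looks_profitable(
    variant_sizes: &[usize],
    enum_size: usize,
    simd_chunk_bytes: usize,
) -> bool {
    // Not worth branching if a handful of wide copies covers the whole
    // enum anyway; the factor of 4 is an arbitrary placeholder.
    if enum_size <= 4 * simd_chunk_bytes {
        return false;
    }
    // ...and only worth it if some variant is dramatically smaller than
    // the whole enum, so the branch actually saves work when that
    // variant is active; the 8x ratio is likewise a placeholder.
    let smallest = variant_sizes.iter().copied().min().unwrap_or(enum_size);
    smallest * 8 <= enum_size
}

fn main() {
    // Roughly the SmallVec<[T; 1000]> shape: a tiny heap variant next to
    // a huge inline one (sizes are illustrative).
    let variant_sizes = [24, 8024];
    println!("{}", branchy_copy_looks_profitable(&variant_sizes, 8032, 32));
}
```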
One way of tackling this might be: `memcpy`s of the result of a lookup in the relevant array.

Since the lookup is deterministic, and the indices (and thus arrays) are likely to be small for hugely heterogeneous enums, the arrays are both (1) easily prefetched by the CPU and (2) likely to fit in a cache line. As a result, most cases may avoid any observable latency penalty unless the memory bus is saturated.
EDIT: Also, this pollutes the D$ but not the I$ or the branch predictor, and the D$ is going to be prodded by the memcpy anyway.
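To make the lookup idea concrete, here is a small self-contained sketch. The `Payload` enum, the `COPY_LEN` numbers, and `copy_by_lookup` are illustrative stand-ins for what the compiler would emit, not rustc code; `#[repr(u8)]` is used only so the hard-coded byte counts are actually guaranteed:

```rust
use std::mem::size_of;
use std::ptr;

// Illustrative enum (names invented here): one tiny and one huge variant.
// #[repr(u8)] pins the layout (tag byte at offset 0, RFC 2195), so the
// byte counts below hold; the default repr gives no such guarantee.
#[repr(u8)]
enum Payload {
    Small(u64),       // tag 0: 1-byte tag, padding, 8-byte payload => 16 bytes used
    Huge([u8; 4096]), // tag 1: needs the full enum size
}

// The "relevant array" from the comment: bytes a copy actually has to
// move, indexed by the discriminant.
const COPY_LEN: [usize; 2] = [16, size_of::<Payload>()];

// Sketch of the proposed copy: read the tag, look the length up, and
// issue one memcpy of that length instead of always moving the full
// size_of::<Payload>() bytes. `dst` must already hold an initialized value.
fn copy_by_lookup(src: &Payload, dst: &mut Payload) {
    let src_bytes = src as *const Payload as *const u8;
    let dst_bytes = dst as *mut Payload as *mut u8;
    unsafe {
        let tag = *src_bytes as usize; // discriminant lookup
        ptr::copy_nonoverlapping(src_bytes, dst_bytes, COPY_LEN[tag]);
    }
}

fn main() {
    let src = Payload::Small(7);
    let mut dst = Payload::Huge([0; 4096]);
    copy_by_lookup(&src, &mut dst); // moves 16 bytes instead of ~4 KiB
    match dst {
        Payload::Small(v) => println!("copied Small({v}) via a {}-byte memcpy", COPY_LEN[0]),
        Payload::Huge(_) => unreachable!(),
    }
}
```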
Triage: I'm not aware of any changes here.
This should probably be written as a MIR optimization. The mentioned array can be created as a const directly in the MIR (referenced via `Rvalue::Use(Operand::Const(...))`). If we add a query that takes a `Ty<'tcx>` and emits said constant, we'll even get deduplication for free.
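The query itself would live inside rustc, so it can only be gestured at from the outside. The sketch below uses a `TypeId`-keyed `HashMap` as a loose stand-in for a query keyed on `Ty<'tcx>`, just to show why keying the constant on the type deduplicates it across copy sites; every name here is invented:

```rust
use std::any::TypeId;
use std::collections::HashMap;

// Loose out-of-tree analogy for the query idea, not rustc code: the
// per-variant copy-length table is computed at most once per type, and
// every copy site for that type reuses the same constant. In rustc the
// key would be Ty<'tcx> and the result a MIR constant.
trait VariantCopyLens: 'static {
    /// One entry per variant: bytes a copy of that variant must move.
    const LENS: &'static [usize];
}

#[derive(Default)]
struct CopyLenQuery {
    cache: HashMap<TypeId, &'static [usize]>,
    misses: usize,
}

impl CopyLenQuery {
    fn get<T: VariantCopyLens>(&mut self) -> &'static [usize] {
        let key = TypeId::of::<T>();
        if !self.cache.contains_key(&key) {
            self.misses += 1; // the table is built only on the first request
            self.cache.insert(key, T::LENS);
        }
        self.cache[&key]
    }
}

// Made-up enum standing in for the SmallVec<[T; 1000]> shape.
#[allow(dead_code)]
enum Demo {
    Inline([u8; 8000]),
    Heap(Vec<u8>),
}

impl VariantCopyLens for Demo {
    const LENS: &'static [usize] = &[8008, 32]; // illustrative numbers only
}

fn main() {
    let mut query = CopyLenQuery::default();
    let first = query.get::<Demo>();
    let second = query.get::<Demo>(); // a second copy site hits the cache
    assert!(std::ptr::eq(first, second));
    println!("misses = {}, lens = {:?}", query.misses, second);
}
```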
For types like `SmallVec<[T; 1000]>`, or in general any enum whose variants have a huge difference in size, we should probably try to optimize the copies better. Basically, for enums with a large enough difference between variant sizes, we should use a branch when codegenning copies/moves.
I'm not sure how common this pattern is, but it's worth looking into!
cc @rust-lang/wg-codegen
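For a concrete feel of the motivating case, here is a tiny stand-in for the `SmallVec<[T; 1000]>` shape; the `Buf` type and its field sizes are made up for illustration:

```rust
use std::mem::size_of;

// Made-up stand-in for the SmallVec<[T; 1000]> shape: the inline variant
// dominates the size of the whole enum.
#[allow(dead_code)]
enum Buf {
    Inline { len: usize, data: [u64; 1000] },
    Heap(Vec<u64>),
}

fn main() {
    println!("size_of::<Buf>()      = {} bytes", size_of::<Buf>());
    println!("size_of::<Vec<u64>>() = {} bytes", size_of::<Vec<u64>>());

    // Semantically this move copies the full enum size (roughly 8 KB),
    // even though the active Heap variant only needs a few dozen bytes;
    // a branch on the discriminant at the copy site could skip the rest.
    let small = Buf::Heap(Vec::new());
    let moved = small;
    drop(moved);
}
```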