Tracking Issue for `large_assignments` lint

oli-obk commented 3 years ago

This is a tracking issue for the implementation of the MCP https://github.com/rust-lang/compiler-team/issues/420 The feature gate for the issue is #![feature(large_assignments)].

About tracking issues

Tracking issues are used to record the overall progress of implementation. They are also used as hubs connecting to other relevant issues, e.g., bugs or open design questions. A tracking issue is however not meant for large scale discussion, questions, or bug reports about a feature. Instead, open a dedicated issue for the specific matter and add the relevant feature gate label.

Steps

[x] Implement the RFC (https://github.com/rust-lang/rust/pull/83519)
[x] make sure we inform the user properly about the fact that they can change the limit https://github.com/rust-lang/rust/labels/E-easy
- this just needs an additional note on the diagnostic informing the user about the current limit and how to change it.
[x] reduce the noisiness (it's reported multiple times on different parts of an expression, but should only trigger on the innermost/first one) https://github.com/rust-lang/rust/labels/E-easy
- this can be done by adding the current span to a Vec on self in https://github.com/oli-obk/rust/blob/85b1c67910dbeede8aabb16bc0c1c09a2fefe8ab/compiler/rustc_mir/src/monomorphize/collector.rs#L784
- and then, right before https://github.com/oli-obk/rust/blob/85b1c67910dbeede8aabb16bc0c1c09a2fefe8ab/compiler/rustc_mir/src/monomorphize/collector.rs#L760 checking if the current span overlaps with any span in the vec.
[x] stop linting in box/arc initialization https://github.com/rust-lang/rust/labels/E-medium
[x] make span point to specific arg when moving large args into functions (https://github.com/rust-lang/rust/pull/116520)
[x] Differentiate between Operand::Move and Operand::Copy, see https://github.com/rust-lang/rust/pull/116310#discussion_r1349506531 and https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/arg.20Spans.20for.20TerminatorKind.3A.3ACall.3F/near/393865180
- https://github.com/rust-lang/rust/pull/120773
[ ] Make sure diagnostics point to correct span when -Zinline-mir is involed. See https://github.com/rust-lang/rust/issues/121672
[ ] fix lint duplication with multiple generic instantiations
[ ] Suggest borrow instead of #[allow(...)] if possible
[ ] enable the lint by default with a 4k threshold (https://github.com/rust-lang/rust/pull/115684)
[ ] disable the lint in tests (by default, users can still turn it on) https://github.com/rust-lang/rust/labels/E-medium
[ ] move lint logic out of compiler/rustc_monomorphize/src/collector.rs (we should do this after we enable the lint by default since doing this has a high risk of regressing performance)
[ ] Adjust documentation (see instructions on rustc-dev-guide)
[ ] Stabilization PR (see instructions on rustc-dev-guide)

Unresolved Questions

Should this threshold be configurable via the command line?
Should we split lint into multiple size limits lints?

Implementation history

https://github.com/rust-lang/rust/pull/83519

leonardo-m commented 3 years ago

This seems an useful lint. But is this lint firing when there's a copy larger than X bytes in the source code, or when the compiler optimization passes actually leave a X bytes copy in the binary? In the second is much more useful because modern compilers are supposed to optimize away many copies (see RVO, etc).

osa1 commented 3 years ago

Has anyone started working on the implementation yet?

oli-obk commented 3 years ago

The implementation is in #83519

oli-obk commented 3 years ago

or when the compiler optimization passes actually leave a X bytes copy in the binary? In the second is much more useful because modern compilers are supposed to optimize away many copies (see RVO, etc).

It is indeed the second part. We do this doing collection, so after monomorphization, but before LLVM has a chance to do anything. Though I don't believe LLVM does anything useful here.

joshtriplett commented 2 years ago

This came up in today's @rust-lang/lang meeting. One notable consideration: should we enable this by default, and with what threshold?

oli-obk commented 2 years ago

We don't really have a precedent for heuristic lints. The closest equivalent are the other raisable limits. I don't think we can use crater to find good limits, as most large types will be occuring in binary crates or tests on edge cases.

So... I think we should pick an arbitrary limit (e.g. 4kb) and only enable it by default if cfg(test) is not set and just wait for reports to come in. It's easily circumvented (just change the limit)

So: things to do before enabling:

we should make sure we inform the user properly about the fact that they can change the limit
we should disable the lint in tests (by default, users can still turn it on)
should look into reducing the noisiness (it's reported multiple times on different parts of an expression, but should only trigger on the innermost/first one)

joshtriplett commented 2 years ago

Sounds reasonable to me. And yes, I think the most obvious thresholds would be something like 1k or 4k.

nikomatsakis commented 2 years ago

One thing I considered: multiple lints for different size thresholds? Probably not so nice.

scottmcm commented 2 years ago

Multiple lints could be nice for having a couple of different thresholds, with correspondingly-different default lint levels.

As a strawman, thinking only about x64,

huge_assignments, deny-by-default, for (64 KiB, ∞)
large_assignments, warn-by-default, for (2 KiB, 64 KiB]
medium_assignments, allow-by-default, for (64 B, 2 KiB]

But absolutely my biggest uncertainty is how to specify the threshold, and how to scope it.

Perhaps the defaults should be something that the target definition gets to pick? I assume msp430-none-elf wants something very different from x86_64_v3-unknown-linux-gnu.

tmiasko commented 2 years ago

My general experience from testing the lint with limit starting from around a few kilobytes, is that it is very likely to lint about copies of large structs. This is actionable, because one can wrap the struct in a box or an arc, but leads to a situation where the initialization of a box or an arc will still be linted about, which is unfortunately generally unactionable currently

joshtriplett commented 2 years ago

I think, for a first pass, we should just have a single lint, and set the threshold high enough that people won't typically encounter it (e.g. 1k). If we find that people want the threshold lower, and want to have different behavior at different thresholds, we could talk about having multiple lints with different thresholds. But in practice I'd expect people to lower or raise the threshold, but not need multiple thresholds.

nikomatsakis commented 2 years ago

My general experience from testing the lint with limit starting from around a few kilobytes, is that it is very likely to lint about copies of large structs. This is actionable, because one can wrap the struct in a box or an arc, but leads to a situation where the initialization of a box or an arc will still be linted about, which is unfortunately generally unactionable currently

This is interesting. We may want to have certain "exemptions" from the lint.

b-ncMN commented 2 years ago

Is the second step currently being worked on ? Who is the mentor for this issue ? (I feel like trying a medium step)

oli-obk commented 2 years ago

I am mentoring this issue. Don't hesitate to ping me on zulip about any questions or requests that you may have

b-ncMN commented 2 years ago

The second step might need to get ticked since #95478 has been merged

pnkfelix commented 2 years ago

A couple thoughts that came up during RustConf today:

One particularly insidious case of these large moves is when its some local variable that is carried over a .await. Its an anti-pattern to have such local variables that end up stored in the generated Future, and we could do more to guide developers to avoid them. (Whether that guidance should be incorporated into the large_assignments lint, potentially as a special case that detects the assignment into a future, or if it should instead be part of a different lint entirely, I am not certain. For now it seems natural to me to correlated it with this lint.)
It might be nice to provide heuristic advice as to simple API changes that could be made to address the problem. For example, replacing a [T; N] array with a Vec<T> (and likewise convert the initialization expression from [<expr>; N] to vec![<expr>; N]) is a pretty simple way to address the problem, assuming that the cost of the copying far exceeds the cost of the boxing performed by the Vec<T>.

joshtriplett commented 2 years ago

I think the await case should be a separate lint, but I do think we should have it.

estebank commented 2 years ago

As implemented, #![deny(large_assignments)] relies on a flag being passed to rustc or on #![move_size_limit = N] being set at the crate level. Would it make sense to instead make move_size_limit an attribute accepted on any scope, and the lint having to walk up from every fn item upwards to see if the limit is set? (Impl details: You'd likely instead want to have some preprocessed DefId map to avoid O(n^2) search behavior, but the user experience is that of searching up.)

Is there any use-case where a different granularity would be desired? This would already be much more precise than the different lints with different thresholds proposal (which I also endorse as a potential shorter term solution).

One particularly insidious case of these large moves is when its some local variable that is carried over a .await. Its an anti-pattern to have such local variables that end up stored in the generated Future, and we could do more to guide developers to avoid them. (Whether that guidance should be incorporated into the large_assignments lint, potentially as a special case that detects the assignment into a future, or if it should instead be part of a different lint entirely, I am not certain. For now it seems natural to me to correlated it with this lint.)

It might be nice to provide heuristic advice as to simple API changes that could be made to address the problem. For example, replacing a [T; N] array with a Vec (and likewise convert the initialization expression from [; N] to vec![; N]) is a pretty simple way to address the problem, assuming that the cost of the copying far exceeds the cost of the boxing performed by the Vec.

I believe these cases can definitely be handled with targeted structured suggestions. It wouldn't help for user types (if they provide Huge on the stack and HugeBuf on the heap), but handling the std and some tokio types can be done.

I think the await case should be a separate lint, but I do think we should have it.

If so, should it also rely on move_size_limit?

Enselic commented 1 year ago

This seems like a good issue for me to work on a bit more long term, so I'll go ahead and claim this one.

@rustbot claim

RalfJung commented 1 year ago

Regarding the implementation, I am slightly concerned that this lint has logic that is entangled with compiler/rustc_monomorphize/src/collector.rs. That file does important and subtle things; is there any way we can avoid mixing that up with the lint logic?

Enselic commented 7 months ago

I managed to make some progress on this. However, other things have arisen that I would like to focus my attention on, so I will now unassign myself to unblock others from taking over the implementation of this feature. I want to thank my mentor oli-obk who was of great help during my work on this 🙏

The next natural step for this feature would be to fix https://github.com/rust-lang/rust/issues/121672. See the top issue description for a somewhat updated list of things to do after that.

@rustbot release-assignment

oli-obk commented 7 months ago

Thanks for the work you've done!

rust-lang / rust