Closed jyn514 closed 2 years ago
This isn't really a false positive. Could be
let mut arg_strings: Vec<Box<str>> = vec![name.to_owned().into_boxed_str()];
for arg in args {
arg_strings.push(
arg.as_ref()
.to_os_string()
.into_string()
.unwrap()
.into_boxed_str(),
);
}
What about if you have code of the form:
let mut foo = Vec::new();
foo.push("bar".to_string());
for x in 0..9 {
foo.push(format!("baz {}", x));
}
foo.push("wibble".to_string());
.....
The pattern here is clear as pushes, but if you move the first push into a vec![]
then the pattern is less clean. Also vec![]
essentially just does the new+push pair clippy is wanting to replace, with no performance benefit to be drawn.
The example in the lint shows the removal of mut
as a result of doing this work, and realistically I think the lint should only fire if it'd be possible to remove the mut
as a result of the refactor, or at least if there are zero push calls as a result.
Really Vec::with_capacity
should be used in this case from a performance perspective. It would also stop this lint from triggering. That's a whole different discussion though.
In this case I guess if there's only one or two items being pushed before a loop pushing items (or extend, and maybe others) it might be better to leave it as is.
I think the original request is reasonable. The lint is really only valuable if you can put all the elements in vec!
.
From a performance perspective it's always better to use the macro with the current Vec
implementation. A default Vec
starts with no capacity and increases by powers of two, starting at 1. So even the macro with a single item avoids that first capacity check.
That's not true (that it increases from a starting capacity of 1). I checked the source. It starts with zero, and then jumps to a capacity of 8 for u8
, or 4 for anything else under 1KB. I have a situation where I create a Vec, add one item and then add several more in a loop. The clippy suggestion to switch to vec!
will produce less-efficient code, since (checking the source) the vec!
creates a box (of size one) which is then converted into a Vec of capacity one.
So my original code creates an empty Vec, then allocates space for 4 items, and so on. Switching to vec!
, it would allocate space for 1 item, then jump to 4 and so on. So that's one extra unnecessary malloc/free with the clippy suggestion.
I agree with the original request: Using vec!
should only be suggested when nothing else will be added to the Vec later on. Or if you see more than 4 pushes immediately after the Vec creation (with current std library implementation), it would also be worthwhile switching to vec!
even if there are pushes later on. Otherwise it becomes an annoying clippy lint that has to be disabled. Thanks
[Here's a playground link](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&code=fn%20main()%20%7B%0A%20%20%20%20let%20mut%20v1%3A%20Vec%3Cu64%3E%20%3D%20Vec%3A%3Anew()%3B%0A%20%20%20%20v1.push(1)%3B%0A%20%20%20%20println!(%22v1%20%7B%7D%22%2C%20v1.capacity())%3B%0A%20%20%20%20v1.push(2)%3B%0A%20%20%20%20println!(%22v1%20%7B%7D%22%2C%20v1.capacity())%3B%0A%0A%20%20%20%20let%20mut%20v2%20%3D%20vec!%5B1%5D%3B%0A%20%20%20%20println!(%22v2%20%7B%7D%22%2C%20v2.capacity())%3B%0A%20%20%20%20v2.push(2)%3B%0A%20%20%20%20println!(%22v2%20%7B%7D%22%2C%20v2.capacity())%3B%0A%7D%0A) that demonstrates the problem. Assuming that the push of 2 occurs later, the clippy suggestion costs an extra malloc/free.
@uazu Good point, that adds some nuance.
Or if you see more than 4 pushes immediately after the Vec creation (with current std library implementation)
I don't think we should be dependent on the std implementation to that level of detail.
Looks like I was looking at old sources.
Here's an idea. We can split into two lints.
Vec::new
is followed by a constant number of push
es - should use vec!
Vec::new
is followed by a dynamic number of push
es - should use Vec::with_capacity
Yes, case 1 seems like a clear win. Also using vec!
in this case won't over-allocate (i.e. it will allocate exactly the right number, unlike pushing to a Vec).
Case 2 depends. Sometimes it's not possible to determine the required capacity in advance. Can clippy detect when the capacity could have been determined by the coder? That seems like quite an involved analysis. Sometimes Vec's behaviour is exactly what you want, i.e. optimised for short lists, but can still handle big lists if required, i.e. it scales smoothly.
Can clippy detect when the capacity could have been determined by the coder? That seems like quite an involved analysis.
I think it is theoretically always possible to calculate the number of elements pushed within the function before push
ing. But oftentimes it is too complex and not worth it. If the Vec is returned or passed mutably to another function, all bets are off. The really simple cases may be better written using collect
.
Sometimes Vec's behaviour is exactly what you want
Very true. That means a lot of false positives for case 2. It would definitely not be warn-by-default. But I'm starting to think we should not lint those cases at all - which just goes back to the original request...
I think it is theoretically always possible to calculate the number of elements pushed within the function before pushing.
@camsteffen only if the content is coming from a ExactSizeIterator or similarly capable source. For instance, an iterator over stdin lines() stopping at the first blank line isn't able to estimate the size at all without buffering (presumably in another vector, giving this issue again :P).
Lint name: vec_init_then_push
I tried this code:
I expected to see this happen: The lint doesn't trigger, because using
vec![]
and then immediately pushing doesn't improve the code any, it uses two different styles of initializing the vector.Instead, this happened:
cc https://github.com/rust-lang/rustup/pull/2718, @kinnison
Meta
cargo clippy -V
: clippy 0.1.53 (dae9d6a 2021-04-09)rustc -v
: rustc 1.53.0-nightly (dae9d6ac3 2021-04-09)