apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0
6k stars 1.14k forks source link

Consider using `Arc::clone` to clone Arcs to make it clear they aren't deep copies #11143

Open alamb opened 3 months ago

alamb commented 3 months ago

Is your feature request related to a problem or challenge?

in https://github.com/apache/datafusion/pull/11103#discussion_r1653071246 @comphead noted that it is sometimes hard to tell if .clone() is a deep clone (and thus expensive) or a clone of an Arc and thus much less so

when I see .clone() Im now thinking what if we comment it the similar way as .unwrap() back in the day. Like say clone is cheap here because it is Arc::clone so only reference gets cloned.... thats just an idea, its too cumbersome to make it happen

Describe the solution you'd like

It would be nice to make it clearer in the code when we had deep clones and when we were just cloning arcs

Describe alternatives you've considered

One thing we do in InfluxDB is use this pattern to make it exlicit

            Arc::clone(ctx.state().catalog_list())

There is a clippy lint https://rust-lang.github.io/rust-clippy/v0.0.212/#clone_on_ref_ptr we could turn on in datafusion to enable this

Here is how it is done in influxdb:

https://github.com/influxdata/influxdb3_core/blob/0f5ecbd6b17f83f7ad4ba55699fc2cd3e151cf94/Cargo.toml#L117-L118

Additional context

No response

jayzhan211 commented 3 months ago

Wow, this clippy lint is so helpful