orium / rpds

Rust persistent data structures
Mozilla Public License 2.0
1.28k stars 58 forks source link

Reuse Arc allocation if the ref count is one #9

Closed Marwes closed 6 years ago

Marwes commented 6 years ago

POC of using Arc::make_mut to avoid allocating new Arcs unnecesarily. Can speedup mutating operations quite significantly on certain workloads.

The same approach could be used on most or all of the other datastructures as well though the difficulty of rewriting may vary.

Before

running 4 tests
test vector_drop_last ... bench: 130,201,140 ns/iter (+/- 36,244,874)
test vector_get       ... bench:   1,184,598 ns/iter (+/- 1,617,973)
test vector_iterate   ... bench:     867,704 ns/iter (+/- 1,049,963)
test vector_push_back ... bench: 144,737,691 ns/iter (+/- 37,627,413)

test result: ok. 0 passed; 0 failed; 0 ignored; 4 measured

After

running 6 tests
test vector_drop_last     ... bench: 121,274,205 ns/iter (+/- 42,464,999)
test vector_drop_last_mut ... bench:   6,236,952 ns/iter (+/- 5,259,908)
test vector_get           ... bench:   1,346,970 ns/iter (+/- 1,673,853)
test vector_iterate       ... bench:     941,329 ns/iter (+/- 1,300,796)
test vector_push_back     ... bench: 166,950,667 ns/iter (+/- 50,590,077)
test vector_push_back_mut ... bench:  12,237,360 ns/iter (+/- 8,773,744)

test result: ok. 0 passed; 0 failed; 0 ignored; 6 measured
$ cargo benchcmp old new
 name              old ns/iter  new ns/iter  diff ns/iter   diff %  speedup
 vector_drop_last  130,201,140  110,503,419   -19,697,721  -15.13%   x 1.18
 vector_get        1,184,598    1,143,228         -41,370   -3.49%   x 1.04
 vector_iterate    867,704      888,491            20,787    2.40%   x 0.98
 vector_push_back  144,737,691  128,403,711   -16,333,980  -11.29%   x 1.13
Marwes commented 6 years ago

Sorry about the accidental rustfmting. I can revert it but just wanted to post this as-is for now.

Before and after for the non-mutating methods

$ cargo benchcmp old new
 name              old ns/iter  new ns/iter  diff ns/iter  diff %  speedup
 vector_drop_last  108,379,948  108,880,821       500,873   0.46%   x 1.00
 vector_get        1,097,490    1,129,746          32,256   2.94%   x 0.97
 vector_iterate    832,342      864,973            32,631   3.92%   x 0.96
 vector_push_back  112,191,792  130,402,698    18,210,906  16.23%   x 0.86
codecov-io commented 6 years ago

Codecov Report

Merging #9 into master will increase coverage by 0.2%. The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master       #9     +/-   ##
=========================================
+ Coverage   95.24%   95.45%   +0.2%     
=========================================
  Files          12       12             
  Lines        2084     2090      +6     
=========================================
+ Hits         1985     1995     +10     
+ Misses         99       95      -4
Impacted Files Coverage Δ
src/sequence/vector/mod.rs 97.77% <100%> (+1.16%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update f125604...b9d2234. Read the comment docs.

orium commented 6 years ago

Yeah, the speedup can be quite drastic if you have no structural sharing (or not a lot of it). However I would like to keep the API nice and clean.

My idea is to have a VectorMut with the mutable methods and Vector would just be a small wrapper around it, where all its methods would be immutable except for a pub fn mutable(&mut self) -> &mut VectorMut<T>. This way you can choose to:

  1. Use only Vector<T> and have a nice functional way of using the data structure. Performance will not be as good.
  2. Use only VectorMut<T> for the best performance. Cloning is cheap if you need to keep previous versions of the data structure.
  3. Use Vector<T> but when you don't need to keep previous versions you can call vector.mutable().push_back() to get the best performance.
Marwes commented 6 years ago

Is it really worth it to create a separate type just to keep the interface of the "immutable only" types small? The "immutable only" type has nothing preventing them from supporting the methods and there is not that many methods that needs a mutable variant either. For instance, Vectorneedspush_back,set, ``drop_last which is only 2 methods more than adding a mutable method.

Anyway, I made this mostly out of curiosity of how low the overhead would be with this strategy :) I can clean this Vector implementation up and finish the other methods if you want it but otherwise feel free to close or pick apart whatever parts are useful.

orium commented 6 years ago

Is it really worth it to create a separate type just to keep the interface of the "immutable only" types small?

It is more about having the interface consistent, but I haven't made my mind yet. For now I'm happy with merging this after a cleanup.

Marwes commented 6 years ago

Implemented drop_last_mut as well as a few assorted that fell out of mutability (or I just noticed them as I went along).

orium commented 6 years ago

The changes made by rustfmt are still present, but it gave me the oportunity to rustfmt the code in master (I just landed that). So, before you rebase create a .rustfmt.toml in the project root with

write_mode = "Diff"

comment_width = 100
struct_field_align_threshold = 4
use_field_init_shorthand = true
use_try_shorthand = true
match_arm_blocks = false

then do

cargo +nightly fmt -- --write-mode=overwrite

After this commit the changes and you should be able to rebase from master cleanly.

Marwes commented 6 years ago

A bit late but I finally remembered to do the rebase.

orium commented 6 years ago

Looks good! I have only a few minor changes, but since I will splitting this in Vector/VectorMut (see https://github.com/orium/rpds/tree/mutable) those changes will get ironed out then.

For now I'm happy to land this after a new rebase on master and getting those commits squashed into a single one.

Marwes commented 6 years ago

Done

orium commented 6 years ago

Awesome! Thanks!