sharksforarms / deku

Declarative binary reading and writing: bit-level, symmetric, serialization/deserialization
Apache License 2.0
1.15k stars 55 forks source link

Upgrade to bitvec 1.0.0 #246

Closed JuanPotato closed 2 years ago

JuanPotato commented 2 years ago

What changes I made

  1. BitSlice<Msb0, u8> -> BitSlice<u8, Msb0>
  2. get_raw_slice is gone, use domain
  3. offset_from is unsafe and needs to be called on BitPtr and is reversed

Not related to the upgrade but it made my tests pass. DekuData.vis was never read and that emitted a warning. Cargo test didn't like this extra warning so I renamed it to _vis for now.

test tests/test_compile/cases/attribute_token_stream.rs ... mismatch

EXPECTED:
┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈
error[E0308]: mismatched types
 --> $DIR/attribute_token_stream.rs:5:19
  |
5 |     #[deku(cond = "0 == true")]
  |                   ^^^^^^^^^^^ expected integer, found `bool`

error[E0277]: can't compare `{integer}` with `bool`
 --> $DIR/attribute_token_stream.rs:5:19
  |
5 |     #[deku(cond = "0 == true")]
  |                   ^^^^^^^^^^^ no implementation for `{integer} == bool`
  |
  = help: the trait `PartialEq<bool>` is not implemented for `{integer}`
┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈

ACTUAL OUTPUT:
┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈
warning: field is never read: `vis`
   --> deku-derive/src/lib.rs
    |
    |     vis: syn::Visibility,
    |     ^^^^^^^^^^^^^^^^^^^^
    |
    = note: `#[warn(dead_code)]` on by default

error[E0308]: mismatched types
 --> tests/test_compile/cases/attribute_token_stream.rs:5:19
  |
5 |     #[deku(cond = "0 == true")]
  |                   ^^^^^^^^^^^ expected integer, found `bool`

error[E0277]: can't compare `{integer}` with `bool`
 --> tests/test_compile/cases/attribute_token_stream.rs:5:19
  |
5 |     #[deku(cond = "0 == true")]
  |                   ^^^^^^^^^^^ no implementation for `{integer} == bool`
  |
  = help: the trait `PartialEq<bool>` is not implemented for `{integer}`
┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈
note: If the actual output is the correct output you can bless it by rerunning
      your test with the environment variable TRYBUILD=overwrite

closes #243

wcampbell0x2a commented 2 years ago

Running the benchmarks (which aren't a full holistic picture of this library), gives me some worrying numbers. :grimacing: These are ran on my cheap laptop:

origin/master..JuanPotato/master

deku_read_byte          time:   [35.971 ns 36.308 ns 36.661 ns]
                        change: [+197.36% +202.61% +208.23%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe

deku_write_byte         time:   [84.430 ns 85.312 ns 86.242 ns]
                        change: [+23.911% +26.359% +28.970%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe

deku_read_enum          time:   [58.737 ns 59.303 ns 59.910 ns]
                        change: [+172.07% +178.52% +184.68%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe

deku_write_enum         time:   [143.86 ns 145.50 ns 147.24 ns]
                        change: [+28.792% +31.542% +34.688%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
  5 (5.00%) high mild
  2 (2.00%) high severe

deku_read_vec           time:   [2.4821 us 2.5039 us 2.5270 us]
                        change: [+130.90% +134.30% +137.80%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
  5 (5.00%) high mild
  4 (4.00%) high severe

deku_write_vec          time:   [6.2461 us 6.3132 us 6.3845 us]
                        change: [+28.856% +31.085% +33.302%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
  5 (5.00%) high mild
  4 (4.00%) high severe

origin/master for reference:

deku_read_byte          time:   [11.810 ns 11.942 ns 12.085 ns]
Found 10 outliers among 100 measurements (10.00%)
  7 (7.00%) high mild
  3 (3.00%) high severe

deku_write_byte         time:   [67.383 ns 68.076 ns 68.816 ns]
Found 9 outliers among 100 measurements (9.00%)
  6 (6.00%) high mild
  3 (3.00%) high severe

deku_read_enum          time:   [20.725 ns 20.979 ns 21.251 ns]
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe

deku_write_enum         time:   [110.09 ns 111.23 ns 112.46 ns]
Found 13 outliers among 100 measurements (13.00%)
  8 (8.00%) high mild
  5 (5.00%) high severe

deku_read_vec           time:   [1.0543 us 1.0685 us 1.0835 us]
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe

deku_write_vec          time:   [4.7405 us 4.7984 us 4.8617 us]
Found 12 outliers among 100 measurements (12.00%)
  7 (7.00%) high mild
  5 (5.00%) high severe

@myrrlyn @sharksforarms

wcampbell0x2a commented 2 years ago

Running benchmarks on my adsb library after switching deku to this branch:

lax_messsages           time:   [1.5287 s 1.5305 s 1.5330 s]
                        change: [+56.845% +57.120% +57.449%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
  1 (1.00%) low severe
  5 (5.00%) low mild
  5 (5.00%) high mild
  1 (1.00%) high severe
JuanPotato commented 2 years ago

Thank you for the feedback! Yeah that's not great performance, and I'm also thinking my code might have some bugs in it in very specific cases, Will add some new tests and see.

Also not sure if bitvec itself regressed or if our usage of it is not ideal.

wcampbell0x2a commented 2 years ago

Thank you for the feedback! Yeah that's not great performance, and I'm also thinking my code might have some bugs in it in very specific cases, Will add some new tests and see.

Also not sure if bitvec itself regressed or if our usage of it is not ideal.

There is this issue, it might not be related to our performance regression however. https://github.com/bitvecto-rs/bitvec/issues/158

sharksforarms commented 2 years ago

Thanks for this work! Let's see if we can figure out the performance issue before merging.

DerFetzer commented 2 years ago

I ran the benchmark executables with perf. bitvec::Domain for sure is part of the problem.

master: grafik

JuanPotato:master: grafik

Just by refactoring https://github.com/JuanPotato/deku/blob/09427b8eb5e39402c4dec49070e9ae0a5ecbe20d/src/impls/primitive.rs#L35 with a match and thus reusing bit_slice.domain().region().unwrap().1 the performance improved again up to 50%. grafik

But even this is of course still much slower than the master branch.

JuanPotato commented 2 years ago

At some point I began working on a replacement to bitvec that would be much more performant for deku's usage. I got it looking pretty good but it required a lot of refactoring and is definitely a breaking change. Can't remember how the performance compares to master though

wcampbell0x2a commented 2 years ago

@myrrlyn I see you updated bitvec w.r.t. performance with the https://github.com/bitvecto-rs/bitvec/commit/4de627722c6e5b31107e2765fc2ebd7d8c69a84c commit.

When updating the Cargo.toml of this update branch

diff --git a/Cargo.toml b/Cargo.toml
index 14d1a0a..ebfb0c0 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -26,7 +26,7 @@ const_generics = []

 [dependencies]
 deku_derive = { version = "^0.12.0", path = "deku-derive", default-features = false}
-bitvec = { version = "1.0.0", default-features = false }
+bitvec = { git = "https://github.com/bitvecto-rs/bitvec", default-features = false }

I get performance boosts vs the v1 version of bitvec:

deku_read_byte          time:   [15.993 ns 16.075 ns 16.167 ns]
                        change: [-21.903% -21.507% -20.952%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

deku_write_byte         time:   [53.931 ns 54.056 ns 54.216 ns]
                        change: [-2.3317% -1.1832% +0.5395%] (p = 0.10 > 0.05)
                        No change in performance detected.
Found 16 outliers among 100 measurements (16.00%)
  3 (3.00%) high mild
  13 (13.00%) high severe

deku_read_enum          time:   [27.867 ns 28.048 ns 28.236 ns]
                        change: [-22.115% -21.262% -20.134%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 26 outliers among 100 measurements (26.00%)
  15 (15.00%) low mild
  6 (6.00%) high mild
  5 (5.00%) high severe

deku_write_enum         time:   [91.762 ns 91.857 ns 91.960 ns]
                        change: [-2.7815% -0.2585% +3.3560%] (p = 0.88 > 0.05)
                        No change in performance detected.
Found 10 outliers among 100 measurements (10.00%)
  3 (3.00%) low mild
  3 (3.00%) high mild
  4 (4.00%) high severe

deku_read_vec           time:   [1.0901 us 1.0914 us 1.0927 us]
                        change: [-36.372% -35.693% -35.277%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) high mild
  4 (4.00%) high severe

deku_write_vec          time:   [4.4868 us 4.4926 us 4.4985 us]
                        change: [+4.6790% +5.9749% +7.6922%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
  2 (2.00%) low severe
  1 (1.00%) low mild
  2 (2.00%) high mild
  5 (5.00%) high severe

However, still not the kind of performance that current deku provides:

deku_read_byte          time:   [6.4593 ns 6.4657 ns 6.4736 ns]
                        change: [-60.561% -60.293% -60.044%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  8 (8.00%) high severe

deku_write_byte         time:   [40.263 ns 40.318 ns 40.370 ns]
                        change: [-27.729% -26.536% -25.703%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  1 (1.00%) low severe
  2 (2.00%) low mild
  3 (3.00%) high mild
  2 (2.00%) high severe

deku_read_enum          time:   [11.972 ns 11.984 ns 11.996 ns]
                        change: [-58.424% -57.816% -57.379%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  1 (1.00%) low mild
  5 (5.00%) high mild
  2 (2.00%) high severe

deku_write_enum         time:   [66.074 ns 66.155 ns 66.233 ns]
                        change: [-31.005% -27.837% -24.929%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) high mild
  4 (4.00%) high severe

deku_read_vec           time:   [638.18 ns 638.77 ns 639.49 ns]
                        change: [-41.653% -41.444% -41.180%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) low severe
  1 (1.00%) high mild
  4 (4.00%) high severe

deku_write_vec          time:   [2.8597 us 2.8650 us 2.8723 us]
                        change: [-37.915% -36.792% -35.948%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  3 (3.00%) high mild
  9 (9.00%) high severe
myrrlyn commented 2 years ago

By any chance, does the faster old version have #[inline] on non-public functions

wcampbell0x2a commented 2 years ago

By any chance, does the faster old version have #[inline] on non-public functions

I'm not at my computer, but I think this branch is the code that's used for our code. Looks like everything possible is inlined?

https://github.com/bitvecto-rs/bitvec/tree/v0/22/3/src

wcampbell0x2a commented 2 years ago

as of bitvec v1.0.1, compared to old bitvec:

deku_read_byte          time:   [15.651 ns 15.732 ns 15.806 ns]
                        change: [+93.359% +94.661% +95.905%] (p = 0.00 < 0.05)
                        Performance has regressed.

deku_write_byte         time:   [54.336 ns 54.673 ns 55.005 ns]
                        change: [+24.738% +25.686% +26.695%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

deku_read_enum          time:   [27.266 ns 27.461 ns 27.667 ns]
                        change: [+80.433% +83.093% +85.122%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

deku_write_enum         time:   [89.895 ns 90.558 ns 91.282 ns]
                        change: [+26.135% +27.569% +28.872%] (p = 0.00 < 0.05)
                        Performance has regressed.

deku_read_vec           time:   [1.3311 us 1.3361 us 1.3412 us]
                        change: [+95.736% +100.29% +105.39%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) high mild
  4 (4.00%) high severe

deku_write_vec          time:   [3.9468 us 3.9727 us 4.0026 us]
                        change: [+29.954% +31.057% +32.167%] (p = 0.00 < 0.05)
                        Performance has regressed.
wcampbell0x2a commented 2 years ago

Better https://github.com/wcampbell0x2a/deku/tree/update-bitvec-1.x.x:

deku_read_byte          time:   [11.777 ns 11.799 ns 11.822 ns]
deku_write_byte         time:   [50.278 ns 50.330 ns 50.383 ns]
deku_read_enum          time:   [20.374 ns 20.398 ns 20.422 ns]
deku_write_enum         time:   [84.699 ns 84.795 ns 84.896 ns]
deku_read_vec           time:   [991.05 ns 992.97 ns 995.64 ns]
deku_write_vec          time:   [3.7768 µs 3.7814 µs 3.7862 µs]
sharksforarms commented 2 years ago

Superseded by https://github.com/sharksforarms/deku/pull/281