Closed KonradHoeffner closed 3 months ago
Removing the "simd" feature doesn't hurt performance that much, so disable it for now:
$ cargo bench --bench iai
Finished bench [optimized] target(s) in 0.03s
Running benches/iai.rs (target/release/deps/iai-14f5bcf861e0aa04)
load
Instructions: 22425353 (-0.003193%)
L1 Accesses: 28365288 (-0.364770%)
L2 Accesses: 49159 (-0.366842%)
RAM Accesses: 14752 (-0.027108%)
Estimated Cycles: 29127403 (-0.358822%)
query_all
Instructions: 100324003 (+0.032336%)
L1 Accesses: 126508543 (-0.050547%)
L2 Accesses: 75061 (+0.399936%)
RAM Accesses: 14703 (-0.392927%)
Estimated Cycles: 127398453 (-0.050614%)
query_all_sophia
Instructions: 112690609 (+0.001438%)
L1 Accesses: 144199404 (-0.079763%)
L2 Accesses: 109364 (+14.81423%)
RAM Accesses: 16972 (No change)
Estimated Cycles: 145340244 (-0.030646%)
query_po
Instructions: 26805066 (-0.000530%)
L1 Accesses: 33790638 (-0.343344%)
L2 Accesses: 50843 (+0.917012%)
RAM Accesses: 14690 (-0.608931%)
Estimated Cycles: 34559003 (-0.338151%)
query_po_sophia
Instructions: 26961723 (-0.002533%)
L1 Accesses: 34034329 (-0.342070%)
L2 Accesses: 50251 (-1.889924%)
RAM Accesses: 14821 (+0.850572%)
Estimated Cycles: 34804319 (-0.335855%)
query_o
Instructions: 27469497 (+0.007493%)
L1 Accesses: 34687868 (-0.327179%)
L2 Accesses: 50256 (-0.039780%)
RAM Accesses: 14775 (-0.027065%)
Estimated Cycles: 35456273 (-0.320784%)
query_o_sophia
Instructions: 30578845 (+0.005282%)
L1 Accesses: 38782526 (-0.294905%)
L2 Accesses: 51323 (+2.247236%)
RAM Accesses: 14684 (-0.864164%)
Estimated Cycles: 39553081 (-0.286261%)
$ cargo bench -q --bench criterion "triple IDs" -- --quick
??? (all)/0.1 all triple IDs
time: [1.4357 s 1.4364 s 1.4366 s]
change: [-1.5371% -0.7365% +0.0767%] (p = 0.65 > 0.05)
No change in performance detected.
S??/1.1 (vincent, ?, ?) triple IDs
time: [1.4638 µs 1.4642 µs 1.4643 µs]
change: [-5.4005% -3.4320% -1.3807%] (p = 0.08 > 0.05)
No change in performance detected.
?P? 1429618 triples/2.1 (?, type, ?) triple IDs
time: [335.56 ms 335.58 ms 335.67 ms]
change: [-0.2186% -0.1766% -0.1345%] (p = 0.07 > 0.05)
No change in performance detected.
??O 1414214 triples/3.1 (?, ?, person) triple IDs
time: [143.02 ms 143.21 ms 144.00 ms]
change: [-3.9287% -2.9968% -2.0532%] (p = 0.07 > 0.05)
No change in performance detected.
?PO 1414214 triples/4.1 (?, type, person) triple IDs
time: [116.37 ms 116.39 ms 116.48 ms]
change: [-4.1821% -2.4701% -0.6975%] (p = 0.08 > 0.05)
No change in performance detected.
hdt$ cargo bench -q --bench criterion "triple IDs" -- --quick
??? (all)/0.1 all triple IDs
time: [1.4435 s 1.4457 s 1.4463 s]
change: [+0.4779% +0.6080% +0.7381%] (p = 0.08 > 0.05)
No change in performance detected.
S??/1.1 (vincent, ?, ?) triple IDs
time: [1.4537 µs 1.4547 µs 1.4584 µs]
change: [-0.7223% -0.5441% -0.3659%] (p = 0.05 < 0.05)
Change within noise threshold.
?P? 1429618 triples/2.1 (?, type, ?) triple IDs
time: [336.54 ms 336.79 ms 336.86 ms]
change: [+0.2587% +0.3235% +0.3883%] (p = 0.10 > 0.05)
No change in performance detected.
??O 1414214 triples/3.1 (?, ?, person) triple IDs
time: [144.21 ms 144.31 ms 144.71 ms]
change: [+0.1414% +0.6598% +1.1818%] (p = 0.15 > 0.05)
No change in performance detected.
?PO 1414214 triples/4.1 (?, type, person) triple IDs
time: [118.51 ms 118.57 ms 118.81 ms]
change: [+1.7408% +1.9189% +2.0971%] (p = 0.10 > 0.05)
No change in performance detected.
Thanks for the quick analysis and fix! It looks like Option 1, "Disable the SIMD feature for now, but that would decrease the speed," is the way to go to get things compiling again. That would close this issue.
Would adding a separate new feature issue around SIMD to be addressed later be reasonable? Would some future performance improvement be expected at a larger scale of data, use of GPUs, or other SIMD accelerators? Or is the sense that SIMD does not do much algorithmically for HDT?
@donpellegrino thanks for the input! I disabled the SIMD feature in commit 225ff272506c1283c30d37f825bf2e043326dfb0 for now but I will keep this issue open because I suspect that rsdict will become updated in the near future as I already got it to compile with the new standard library SIMD feature and I expect that someone more knowledgeable about SIMD than me will be able to fix the errors with reasonable effort.
As for the potential impact of a general SIMD feature, that is hard for me to say. The benchmarks above lead me to believe that it does not do much and is best handled by using optimized external libraries like rsdict for rank and select, but I'm open to be positively surprised by a pull request by someone with SIMD expertise :-)
Update: @Cydhra was kind enough to fix the issues in the refactoring, so once @sujayakar merges https://github.com/sujayakar/rsdict/pull/10, we could reenable the SIMD feature of rsdict. If someone else had a pressing need for this they could use the rsdict fork of the pull request as a Git dependency in their Cargo.toml but at least for HDT the performance difference is so minuscule that it is not worth it. Also it would be great to have the rsdict maintainer check this PR. While the tests are successful, I don't know if there are edge cases that aren't covered, so I don't want to be responsible for any bugs that may cause.
I mean, since performance is a non-issue here, this likely won't matter, but there are several alternatives to rsdict which beat its performance even with simd
enabled. The best options are likely sucds, succinct if you only require rank
, and vers.
So if it ends up abandoned, those are the options.
I think sucds
also offers a compressed bitmap (though I haven't tested its space requirements), so if you rely on the compression of rsdict
that might work. If your vectors aren't sparse (so compression is ineffective anyway), vers
probably offers the smallest memory footprint overall.
For concrete micro-benchmarks, refer to https://github.com/Cydhra/vers
We already use sucds so if we could refactor to that this would reduce dependency count by one and if it faster then even better. However last time I checked, sucds didn't seem to have an equivalent to https://docs.rs/rsdict/latest/rsdict/struct.RsDict.html#method.from_blocks, I will check again.
I got rid of the rsdict dependency by switching to sucds following one of the options suggested by @Cydhra. However it is in a temporary "sucds" branch until I've made more performance measurements.
With the iai benchmark (10k triples very small dataset):
load
Instructions: 23788468 (+8.065720%)
L1 Accesses: 32290918 (+15.70495%)
L2 Accesses: 47687 (-3.586664%)
RAM Accesses: 14803 (-0.074254%)
Estimated Cycles: 33047458 (+15.25324%)
query_all
Instructions: 85571508 (-14.12827%)
L1 Accesses: 113483823 (-9.751437%)
L2 Accesses: 69219 (-6.529019%)
RAM Accesses: 14812 (-0.537201%)
Estimated Cycles: 114348338 (-9.704091%)
query_all_sophia
Instructions: 98162338 (-12.57198%)
L1 Accesses: 131583903 (-8.550159%)
L2 Accesses: 73267 (-13.50334%)
RAM Accesses: 17086 (-0.227737%)
Estimated Cycles: 132548248 (-8.530212%)
query_po
Instructions: 27698979 (+5.002010%)
L1 Accesses: 37528315 (+12.65199%)
L2 Accesses: 48375 (-3.654650%)
RAM Accesses: 14841 (-0.127860%)
Estimated Cycles: 38289625 (+12.33688%)
query_po_sophia
Instructions: 27869964 (+4.985088%)
L1 Accesses: 37797603 (+12.56838%)
L2 Accesses: 48595 (-3.289683%)
RAM Accesses: 14887 (-0.414743%)
Estimated Cycles: 38561623 (+12.25466%)
query_o
Instructions: 28356069 (+4.880660%)
L1 Accesses: 38418762 (+12.32590%)
L2 Accesses: 48525 (-3.282707%)
RAM Accesses: 14788 (-0.844844%)
Estimated Cycles: 39178967 (+12.01738%)
query_o_sophia
Instructions: 31452544 (+4.356829%)
L1 Accesses: 42495065 (+10.99454%)
L2 Accesses: 48468 (-3.667044%)
RAM Accesses: 14879 (-0.073875%)
Estimated Cycles: 43258170 (+10.75242%)
My suspicion is that the performance decrease in some of the tests is due to the cumbersome loading process like this:
pub fn new(data: Vec<u64>) -> Self {
let mut v = BitVector::new();
for d in data {let _ = v.push_bits(d as usize,64);}
let dict = Rank9Sel::new(v).select1_hints();
Bitmap { dict }
}
The loop calling push bits, which further applies some masking because it supports different numbers of bits, is probably much less efficient then just copying the usize blocks as was possible with rsdict.
The benchmarks show drastic performance gains, I will recheck it with https://github.com/KonradHoeffner/hdt_benchmark to make sure.
cargo bench --bench criterion
Queries persondata_en.hdt: 86MB, ~10M triples.
hdt$ cargo bench --bench criterion
Updating crates.io index
Compiling hdt v0.1.8 (/home/konrad/projekte/rust/hdt)
Finished `bench` profile [optimized] target(s) in 3.56s
Running benches/criterion.rs (target/release/deps/criterion-2ca20a72aa74cccc)
Benchmarking ??? (all)/0.1 all triple IDs: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 14.9s.
??? (all)/0.1 all triple IDs
time: [1.4515 s 1.4601 s 1.4691 s]
change: [+441.80% +451.86% +462.47%] (p = 0.01 < 0.05)
Performance has regressed.
Benchmarking ??? (all)/0.2 all str triples: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 44.8s.
??? (all)/0.2 all str triples
time: [4.3857 s 4.4032 s 4.4248 s]
change: [+31.960% +33.091% +34.267%] (p = 0.01 < 0.05)
Performance has regressed.
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high mild
Benchmarking ??? (all)/0.3 all Sophia triples: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 51.2s.
??? (all)/0.3 all Sophia triples
time: [5.1053 s 5.1108 s 5.1159 s]
change: [+26.964% +27.099% +27.222%] (p = 0.01 < 0.05)
Performance has regressed.
S??/1.1 (vincent, ?, ?) triple IDs
time: [1.4701 µs 1.4711 µs 1.4722 µs]
change: [+673.36% +674.25% +675.14%] (p = 0.00 < 0.05)
Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
4 (4.00%) high mild
2 (2.00%) high severe
S??/1.2 (vincent, ?, ?) str triples
time: [3.3508 µs 3.3527 µs 3.3548 µs]
change: [+56.702% +56.925% +57.148%] (p = 0.00 < 0.05)
Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) low mild
1 (1.00%) high severe
S??/1.3 (vincent, ?, ?) Sophia triples
time: [3.7053 µs 3.7077 µs 3.7101 µs]
change: [+50.744% +50.985% +51.248%] (p = 0.00 < 0.05)
Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
1 (1.00%) low mild
2 (2.00%) high mild
1 (1.00%) high severe
?P? 1429618 triples/2.1 (?, type, ?) triple IDs
time: [334.24 ms 334.37 ms 334.52 ms]
change: [+135.63% +135.73% +135.86%] (p = 0.01 < 0.05)
Performance has regressed.
Benchmarking ?P? 1429618 triples/2.2 (?, type, ?) str triples: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 5.7s.
?P? 1429618 triples/2.2 (?, type, ?) str triples
time: [573.43 ms 573.77 ms 574.10 ms]
change: [+54.162% +54.340% +54.517%] (p = 0.01 < 0.05)
Performance has regressed.
Benchmarking ?P? 1429618 triples/2.3 (?, type, ?) Sophia triples: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 6.4s.
?P? 1429618 triples/2.3 (?, type, ?) Sophia triples
time: [644.39 ms 644.61 ms 644.82 ms]
change: [+40.394% +40.750% +41.109%] (p = 0.01 < 0.05)
Performance has regressed.
Benchmarking ??O 1414214 triples/3.1 (?, ?, person) triple IDs: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 14.3s, or reduce sample count to 30.
??O 1414214 triples/3.1 (?, ?, person) triple IDs
time: [143.18 ms 143.22 ms 143.26 ms]
change: [+332.33% +332.85% +333.39%] (p = 0.00 < 0.05)
Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
2 (2.00%) low mild
3 (3.00%) high mild
1 (1.00%) high severe
Benchmarking ??O 1414214 triples/3.2 (?, ?, person) str triples: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 37.8s, or reduce sample count to 10.
??O 1414214 triples/3.2 (?, ?, person) str triples
time: [381.55 ms 383.15 ms 384.86 ms]
change: [+48.805% +49.427% +50.223%] (p = 0.00 < 0.05)
Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
9 (9.00%) high mild
Benchmarking ??O 1414214 triples/3.3 (?, ?, person) Sophia triples: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 54.4s, or reduce sample count to 10.
??O 1414214 triples/3.3 (?, ?, person) Sophia triples
time: [538.96 ms 540.77 ms 542.66 ms]
change: [+31.487% +32.126% +32.751%] (p = 0.00 < 0.05)
Performance has regressed.
Benchmarking ?PO 1414214 triples/4.1 (?, type, person) triple IDs: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 6.7s or enable flat sampling.
?PO 1414214 triples/4.1 (?, type, person) triple IDs
time: [117.15 ms 117.34 ms 117.69 ms]
change: [+1613.6% +1640.1% +1668.3%] (p = 0.01 < 0.05)
Performance has regressed.
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high mild
?PO 1414214 triples/4.2 (?, type, person) str subjects
time: [321.38 ms 324.29 ms 327.42 ms]
change: [+67.818% +71.746% +75.719%] (p = 0.01 < 0.05)
Performance has regressed.
?PO 1414214 triples/4.3 (?, type, person) str triples
time: [341.45 ms 344.35 ms 347.60 ms]
change: [+63.109% +64.445% +65.892%] (p = 0.00 < 0.05)
Performance has regressed.
?PO 1414214 triples/4.4 (?, type, person) Sophia triples
time: [341.47 ms 344.28 ms 347.53 ms]
change: [+52.392% +53.589% +54.991%] (p = 0.01 < 0.05)
Performance has regressed.
hdt$ cargo bench --bench criterion sucds
Compiling hdt v0.1.8 (/home/konrad/projekte/rust/hdt)
Finished `bench` profile [optimized] target(s) in 1.48s
Running benches/criterion.rs (target/release/deps/criterion-fd17cbde06dbe57c)
??? (all)/0.1 all triple IDs
time: [260.05 ms 260.55 ms 261.19 ms]
change: [-82.274% -82.156% -82.043%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high mild
Benchmarking ??? (all)/0.2 all str triples: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 32.6s.
??? (all)/0.2 all str triples
time: [3.2486 s 3.2712 s 3.3039 s]
change: [-26.349% -25.709% -24.941%] (p = 0.00 < 0.05)
Performance has improved.
Found 2 outliers among 10 measurements (20.00%)
1 (10.00%) high mild
1 (10.00%) high severe
Benchmarking ??? (all)/0.3 all Sophia triples: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 40.6s.
??? (all)/0.3 all Sophia triples
time: [4.0177 s 4.0434 s 4.0729 s]
change: [-21.383% -20.885% -20.285%] (p = 0.00 < 0.05)
Performance has improved.
S??/1.1 (vincent, ?, ?) triple IDs
time: [188.85 ns 189.34 ns 189.87 ns]
change: [-87.181% -87.156% -87.131%] (p = 0.00 < 0.05)
Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
S??/1.2 (vincent, ?, ?) str triples
time: [2.1806 µs 2.1861 µs 2.1914 µs]
change: [-35.051% -34.935% -34.809%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
S??/1.3 (vincent, ?, ?) Sophia triples
time: [2.4561 µs 2.4595 µs 2.4631 µs]
change: [-33.836% -33.719% -33.604%] (p = 0.00 < 0.05)
Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe
Benchmarking ?P? 1429618 triples/2.1 (?, type, ?) triple IDs: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 7.8s or enable flat sampling.
?P? 1429618 triples/2.1 (?, type, ?) triple IDs
time: [141.50 ms 142.14 ms 143.02 ms]
change: [-57.648% -57.514% -57.313%] (p = 0.00 < 0.05)
Performance has improved.
Found 2 outliers among 10 measurements (20.00%)
2 (20.00%) high mild
?P? 1429618 triples/2.2 (?, type, ?) str triples
time: [371.99 ms 372.23 ms 372.48 ms]
change: [-35.182% -35.126% -35.066%] (p = 0.00 < 0.05)
Performance has improved.
?P? 1429618 triples/2.3 (?, type, ?) Sophia triples
time: [437.91 ms 438.16 ms 438.53 ms]
change: [-32.076% -32.026% -31.962%] (p = 0.00 < 0.05)
Performance has improved.
Found 2 outliers among 10 measurements (20.00%)
1 (10.00%) high mild
1 (10.00%) high severe
??O 1414214 triples/3.1 (?, ?, person) triple IDs
time: [33.168 ms 33.187 ms 33.207 ms]
change: [-76.843% -76.828% -76.812%] (p = 0.00 < 0.05)
Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
2 (2.00%) high mild
2 (2.00%) high severe
Benchmarking ??O 1414214 triples/3.2 (?, ?, person) str triples: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 25.6s, or reduce sample count to 10.
??O 1414214 triples/3.2 (?, ?, person) str triples
time: [254.84 ms 254.94 ms 255.04 ms]
change: [-33.761% -33.463% -33.181%] (p = 0.00 < 0.05)
Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
4 (4.00%) high mild
1 (1.00%) high severe
Benchmarking ??O 1414214 triples/3.3 (?, ?, person) Sophia triples: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 40.5s, or reduce sample count to 10.
??O 1414214 triples/3.3 (?, ?, person) Sophia triples
time: [403.83 ms 404.04 ms 404.26 ms]
change: [-25.550% -25.285% -25.028%] (p = 0.00 < 0.05)
Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
?PO 1414214 triples/4.1 (?, type, person) triple IDs
time: [6.7001 ms 6.7055 ms 6.7112 ms]
change: [-94.355% -94.304% -94.261%] (p = 0.00 < 0.05)
Performance has improved.
Found 2 outliers among 10 measurements (20.00%)
1 (10.00%) high mild
1 (10.00%) high severe
?PO 1414214 triples/4.2 (?, type, person) str subjects
time: [187.13 ms 187.77 ms 188.38 ms]
change: [-42.686% -42.100% -41.537%] (p = 0.00 < 0.05)
Performance has improved.
?PO 1414214 triples/4.3 (?, type, person) str triples
time: [211.85 ms 212.37 ms 212.98 ms]
change: [-38.921% -38.327% -37.777%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high mild
?PO 1414214 triples/4.4 (?, type, person) Sophia triples
time: [229.08 ms 229.35 ms 229.65 ms]
change: [-34.005% -33.385% -32.825%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high mild
Oh wow, how is that even possible?
Rank9Sel
has about the same overhead as RsDict
, so this can't be cache effects. But removing the simd
feature had almost no effect, so why does the increased performance suddenly make such a drastic change?
Would you mind pushing that branch, I am now personally interested how Vers would perform :)
Edit: The benchmarks don't come with their resource :(
Yeah I'm always skeptical with such large changes, so I will try to run it again using the hdt_benchmark suite. Feel free to test it with vers :-) The benchmark test resources can be generated either with this Makefile or manually:
wget http://downloads.dbpedia.org/2016-10/core-i18n/en/persondata_en.ttl.bz2
bunzip2 persondata_en.ttl.bz2
docker run -v ${PWD}:/data rdfhdt/hdt-cpp rdf2hdt /data/persondata_en.ttl /data/persondata_en.hdt
The reason they are not in the repo is that I didn't want to waste disk space for the people behind crates.io, maybe I find a better solution in the future.
P.S.: Tried to attach it here but GitHub allows 25 MB max. P.P.S.: You can also download it directly at https://github.com/KonradHoeffner/hdt/releases/download/benchmarkdata/persondata_en.hdt.bz2, see the readme.
I have similar results with vers like with sucds. A few percent gained in some benchmarks, a few lost in others. I also tried a method that just moves the data into the bitvec instead of copying the bits in a loop, but that didn't seem to make a difference, so I wouldn't worry about it.
@Cydhra: Thanks for investigating this! Did you try the iai or the criterion benchmark? I'm not sure why they show so much difference, maybe I should change the IAI benchmark to use a larger file as well, as this is what I'm optimizing for.
@Cydhra I created some comparative benchmark curves at https://github.com/KonradHoeffner/rdf_benchmark/blob/master/hdt_rs_current_vs_previous_versions.ipynb. I am now convinced that it is indeed measurably faster both in loading and querying, thank you for all the help! Oh and on top you can now use hdt_rs on stable Rust :-)
Did you try the iai or the criterion benchmark?
Criterion.
sorry for not responding earlier -- I've been busy with work and haven't had much time for side projects.
nice work benchmarking btw, and sucds
is indeed a fantastic library! we're actually using it at my startup for its compressed integer vectors and elias fano implementation: https://github.com/get-convex/convex-backend/blob/addaff23baa77b886d3736252615b149895d2ff4/crates/search/src/memory_index/term_list.rs#L121
Thank you! And no problem, I fully understand :-)
The rsdict dependency "simd" feature depends on packed_simd, which is no longer available in nightly 1.78. See https://github.com/sujayakar/rsdict/issues/9 and https://github.com/rust-lang/packed_simd/issues/359. I tried to refactor rsdict but wasn't successful yet, see https://github.com/sujayakar/rsdict/pull/10.
Options
Update: Option 1 was chosen but this issue is kept open in case of a future rsdict update that allows reenabling it.