messense / jieba-rs

The Jieba Chinese Word Segmentation Implemented in Rust
MIT License

Switch DAG from BTreeMap to Vec #34

Closed. messense closed this 5 years ago.

messense commented 5 years ago

Closes #26
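For readers unfamiliar with the change named in the title, here is a minimal sketch of the two DAG representations. It is not the jieba-rs source: `word_ends`, `dag_btreemap`, `dag_vec`, and the `HashSet<String>` dictionary are hypothetical names chosen for illustration. It assumes the DAG maps every character index of the sentence to the end indices of the dictionary words starting there. Under that assumption the keys are exactly 0..n, so a `Vec` indexed by start position holds the same information as the `BTreeMap` while avoiding tree nodes and O(log n) lookups.

```rust
use std::collections::{BTreeMap, HashSet};

/// Inclusive end indices of every dictionary word that starts at `start`.
/// Falls back to the single character at `start` when no word matches.
/// (Hypothetical helper; a real segmenter would use a prefix dictionary.)
fn word_ends(chars: &[char], dict: &HashSet<String>, start: usize) -> Vec<usize> {
    let mut ends = Vec::new();
    let mut word = String::new();
    for (offset, ch) in chars[start..].iter().enumerate() {
        word.push(*ch);
        if dict.contains(&word) {
            ends.push(start + offset);
        }
    }
    if ends.is_empty() {
        ends.push(start);
    }
    ends
}

/// DAG keyed by start index in an ordered map: each lookup walks the tree.
fn dag_btreemap(chars: &[char], dict: &HashSet<String>) -> BTreeMap<usize, Vec<usize>> {
    (0..chars.len())
        .map(|start| (start, word_ends(chars, dict, start)))
        .collect()
}

/// Same DAG stored densely: the start index is the position in the outer Vec,
/// so a lookup is a plain slice index.
fn dag_vec(chars: &[char], dict: &HashSet<String>) -> Vec<Vec<usize>> {
    (0..chars.len())
        .map(|start| word_ends(chars, dict, start))
        .collect()
}
```

With either builder, `dag[start]` yields the same list of end indices; only the container differs. That is consistent with the benchmark improvements reported below, although the sketch itself makes no performance claim.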


     Running target/release/deps/jieba_benchmark-1e4bf3354c7bae5e
jieba cut no hmm        time:   [7.3711 us 7.4326 us 7.5078 us]
                        change: [-22.921% -21.465% -20.074%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  3 (3.00%) high mild
  6 (6.00%) high severe

jieba cut with hmm      time:   [12.202 us 12.583 us 13.033 us]
                        change: [-11.663% -10.405% -8.6735%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) high mild
  8 (8.00%) high severe

jieba cut_all           time:   [5.4665 us 5.5778 us 5.6974 us]
                        change: [-15.967% -14.747% -13.371%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe

jieba cut_for_search    time:   [14.771 us 14.817 us 14.862 us]
                        change: [-8.6548% -8.1527% -7.6992%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  2 (2.00%) low mild
  2 (2.00%) high mild
  4 (4.00%) high severe

jieba tokenize default mode
                        time:   [12.267 us 12.350 us 12.483 us]
                        change: [-11.255% -10.689% -10.053%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 24 outliers among 100 measurements (24.00%)
  7 (7.00%) low severe
  6 (6.00%) low mild
  8 (8.00%) high mild
  3 (3.00%) high severe

jieba tokenize search mode
                        time:   [12.896 us 12.932 us 12.968 us]
                        change: [-11.976% -11.490% -10.967%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  2 (2.00%) low mild
  4 (4.00%) high mild
  4 (4.00%) high severe

jieba tag               time:   [12.411 us 12.475 us 12.548 us]
                        change: [-11.785% -11.228% -10.711%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  2 (2.00%) low mild
  4 (4.00%) high mild
  3 (3.00%) high severe

jieba tfidf             time:   [13.938 us 13.995 us 14.056 us]
                        change: [-13.222% -12.748% -12.299%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

jieba textrank          time:   [35.468 us 35.580 us 35.692 us]
                        change: [-5.1413% -4.6924% -4.2600%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

The run time of the weicheng example is reduced to 5s.

codecov[bot] commented 5 years ago

Codecov Report

Merging #34 into master will not change coverage. The diff coverage is n/a.


@@           Coverage Diff           @@
##           master      #34   +/-   ##
=======================================
  Coverage   96.73%   96.73%           
=======================================
  Files           3        3           
  Lines         184      184           
=======================================
  Hits          178      178           
  Misses          6        6


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update e77480d...0e97315.