Performance testing against ICU4C

unicode-org / icu4x

Solving i18n for client-side and resource-constrained environments.

https://icu4x.unicode.org

Performance testing against ICU4C #66

Open echeran opened 4 years ago

echeran commented 4 years ago

The discussion in #40 regarding implementation touches on performance of existing ICU4C code vs. the pre-existing Rust module unicode-normalization.

Performance results enable comparison between implementations, which in turn informs the decision on future implementation strategy. Performance results are also useful in their own right as a measure of the Rust code across changes, independent of Rust vs. C comparisons.

Some of the aspects of performance testing:

Test input data set
- Where to gather info?
- Should the test data be representative of typical use cases or a stress test (worst-case)?
- How much is enough?
Testing framework
- Rust-only: cargo bench provides a way of running benchmarks alongside test code
- Rust vs. C comparisons: how is a fair comparison achieved? Is calling ICU4C from Rust via FFI acceptable?
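One way to keep a Rust vs. C comparison fair is to push both implementations through the same timing loop on the same input. The following is a minimal sketch using only the standard library, not the project's actual harness: `bench_mean` and `count_non_ascii` are hypothetical names, and the workload is a stand-in for a real normalization call (a Rust implementation or an FFI wrapper around ICU4C would be passed as the closure instead).

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Runs `f` on `input` `iters` times and returns the mean wall-clock time per call.
/// Hypothetical helper for illustration; not part of cargo bench or ICU4X.
pub fn bench_mean<F, R>(input: &str, iters: u32, mut f: F) -> Duration
where
    F: FnMut(&str) -> R,
{
    assert!(iters > 0, "need at least one iteration");
    // One untimed warm-up call so one-time setup cost is not counted.
    black_box(f(black_box(input)));
    let start = Instant::now();
    for _ in 0..iters {
        // black_box discourages the optimizer from removing the call or its result.
        black_box(f(black_box(input)));
    }
    start.elapsed() / iters
}

/// Stand-in workload (an assumption): counts the non-ASCII characters in `s`.
pub fn count_non_ascii(s: &str) -> usize {
    s.chars().filter(|c| !c.is_ascii()).count()
}
```

Calling `bench_mean(text, 1_000, count_non_ascii)` and then the same call with a second closure gives two means measured under identical conditions. A plain mean hides variance and outliers, so this only illustrates the shape of the comparison.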

zbraniecki commented 4 years ago

Note that the "default" cargo bench infra is considered outdated and obsolete. https://github.com/bheisler/criterion.rs is the go-to for microbenchmarks today, until Rust adopts it by default :)

echeran commented 4 years ago

For the case of locale normalization test data, the suggestions from the group:

choose data from Wikipedia for:
- Czech
- Norwegian
- Korean
- Vietnamese (difference between IME and non-IME text)

filmil commented 4 years ago

I would recommend adding samples from each major group of scripts, so that we're tracking performance metrics that are of interest for real users.

For example, including Cyrillic, CJK, and Indic scripts in the list above would be useful.

hsivonen commented 4 years ago

Vietnamese (difference between IME and non-IME text)

To synthesize text as if it came from a Vietnamese IME, normalize the Vietnamese Wikipedia text to NFC (so a perf test of normalizing to NFC then measures normalizing what's already normalized). To synthesize text as if it came from a Vietnamese keyboard layout, normalize to NFC and then run it through detone with orthographic set to true.
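To make the difference between the two forms concrete, here is a toy sketch of the keyboard-layout shape: the tone mark is detached as a combining character while the letter itself (for example ê or ơ) stays precomposed. This is not the detone crate and does not reproduce its API. `split_tone` and `keyboard_style` are hypothetical names, the table covers only a handful of characters for illustration, and the assumption is that the input is already in NFC.

```rust
/// Splits a precomposed Vietnamese vowel into (letter, combining tone mark).
/// Deliberately tiny, illustrative table; real data covers every vowel/tone pair.
fn split_tone(c: char) -> Option<(char, char)> {
    match c {
        '\u{00E1}' => Some(('a', '\u{0301}')),        // a with acute -> a + combining acute
        '\u{00E0}' => Some(('a', '\u{0300}')),        // a with grave -> a + combining grave
        '\u{1EBF}' => Some(('\u{00EA}', '\u{0301}')), // e circumflex acute -> e circumflex + acute
        '\u{1EC7}' => Some(('\u{00EA}', '\u{0323}')), // e circumflex dot below -> e circumflex + dot below
        '\u{1EDD}' => Some(('\u{01A1}', '\u{0300}')), // o horn grave -> o horn + grave
        _ => None,
    }
}

/// Rewrites NFC text so that tone marks in the table become combining marks.
/// Characters not in the table are copied through unchanged.
pub fn keyboard_style(nfc: &str) -> String {
    let mut out = String::with_capacity(nfc.len());
    for c in nfc.chars() {
        match split_tone(c) {
            Some((letter, tone)) => {
                out.push(letter);
                out.push(tone);
            }
            None => out.push(c),
        }
    }
    out
}
```

With this table, `keyboard_style("Vi\u{1EC7}t")` yields `"Vi\u{00EA}\u{0323}t"`, one code point longer than the IME-style input. That input is no longer in NFC, so a normalizer has real work to do on it.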

zbraniecki commented 4 years ago

@echeran - do you think we should close this now?

The one thing we don't have in icu4x is comparisons to ICU4C, but I'm maintaining the ICU4C vs ICU4X comparisons in https://github.com/zbraniecki/intl-measurements/

Is that enough for now?

sffc commented 3 years ago

Deliverable: Document the results in a markdown file and iterate on it for every ICU4X release.

sffc commented 3 years ago

2021-04-30: @mihnita - Would be good to compare to ICU4C compiled on Linux but also on MSVC, which has different optimizations.

sffc commented 2 years ago

The deliverable here is to make a polished tool to compare ICU4X and ICU4C performance, integrated with CI, etc. We have scrappy benchmarks from @zbraniecki. @gregtatum points out that some of this testing can be done at a higher integration testing level. @sffc points out that this is related to #51, behavior testing between C and X. It could also be implemented via ecma-402 traits and rust_icu.