dart-lang / core

This repository is home to core Dart packages.
https://pub.dev/publishers/dart.dev
BSD 3-Clause "New" or "Revised" License
9 stars 2 forks source link

Abysmal performance for hash algorithms in dart:crypto #180

Open DartBot opened 9 years ago

DartBot commented 9 years ago

Originally opened as dart-lang/sdk#4611

This issue was originally filed by dha...@google.com


Test code: new File('recordroyale.ogg').readAsBytes().then((buf) {   var md5 = new MD5();   md5.update(buf);   print(md5.digest()); });

where 'recordroyale.ogg' is about six megabytes. This takes 22 seconds on my machine. Shelling out to md5sum(1) takes under 0.02 seconds. I understand that Dart won't ever be as fast as C, but with this disparity, the hash algorithms should be implemented in C[++] rather than Dart.

I had a vague suspicion that this was slow due to unnecessary allocations, but the allocations are mostly small and consumed quickly. Refactoring to eliminate unnecessary allocations resulted in no change.

My use case involved checksumming a byte array that existed only in memory at the relevant location; it is far less convenient to write data to a temporary file, shell out to md5sum(1), then delete that file.

DartBot commented 9 years ago

Comment by larsbak


As a starting point we should add a benchmark to trace performance of MD5:

import("dart:io");

import("dart:crypto");

main() {   new File('data').readAsBytes().then((buf) {     var md5 = new MD5();     md5.update(buf);     print(md5.digest());   }); }


Set owner to @madsager. Added Area-Library label.

DartBot commented 9 years ago

Comment by madsager


There is a lot that can be done to improve performance here. The data is being copied around way too much. Also, the code overflows smi range and (at least it used to be the case that) a lot of the medium-size integer operations ended up in runtime calls.


cc @iposva-google. cc @sgmitrovic.

DartBot commented 9 years ago

Comment by madsager


Added Triaged label.

DartBot commented 9 years ago

Comment by iposva-google


Mads, let's work together at creating a repeatable test case. I don't think file reading is really needed here.

DartBot commented 9 years ago

Comment by lrhn


Removed Area-Library label. Added Area-Pkg, Library-Crypto labels.

DartBot commented 9 years ago

Comment by iposva-google


I wrote a little test just now and it looks like we can close this bug. Abysmal is certainly not what we see anymore:

import "dart:io"; import "package:crypto/crypto.dart";

readFileFully(var path) {   return new File(path).readAsBytesSync(); }

main() {   var binary = new Options().executable;   print("Using $binary");   for (int i = 0; i < 3; i++) {     var stopwatch = new Stopwatch()..start();     var bytes = readFileFully(binary);     stopwatch.stop();     print("Read ${bytes.length} bytes in ${stopwatch.elapsedMilliseconds} ms.");

    // Now compute the MD5 hash on the read bytes.     stopwatch = new Stopwatch()..start();     var md5 = new MD5();     md5.add(bytes);     var hash = md5.close();     stopwatch.stop();     print("Hashed ${bytes.length} bytes into ${hash.length} bytes in "           "${stopwatch.elapsedMilliseconds} ms.");     var sb = new StringBuffer();     for (int j = 0; j < hash.length; j++) {       var val = hash[j].toRadixString(16);       sb.write(val.length == 1 ? "0$val" : val);     }     print("Hash: $sb");   } }

Gives this set of results on my MacBook:

dalbe[runtime] ./xcodebuild/ReleaseX64/dart --package-root=./xcodebuild/ReleaseIA32/packages/ ~/dart/Bug4611.dart Using ./xcodebuild/ReleaseX64/dart Read 9875196 bytes in 66 ms. Hashed 9875196 bytes into 16 bytes in 810 ms. Hash: e5f175b851495b54b393327ac934583a Read 9875196 bytes in 18 ms. Hashed 9875196 bytes into 16 bytes in 802 ms. Hash: e5f175b851495b54b393327ac934583a Read 9875196 bytes in 33 ms. Hashed 9875196 bytes into 16 bytes in 816 ms. Hash: e5f175b851495b54b393327ac934583a

dalbe[runtime] time md5 ./xcodebuild/ReleaseX64/dart MD5 (./xcodebuild/ReleaseX64/dart) = e5f175b851495b54b393327ac934583a 0.026u 0.007s 0:00.02 100.0% 0+0k 0+1io 0pf+0w

This means we calculate the MD5 hash for a 9.41MB file in 810ms, which is significantly better than the reported 6MB in 22 seconds.


Added AssumedStale label.

DartBot commented 9 years ago

Comment by kevmoo


Removed Library-Crypto label. Added Pkg-Crypto label.

andreleblanc-wf commented 8 years ago

This may be a slightly contrived example, but I was recently comparing dart performance to python performance on a set of 'programming challenges' and am still seeing VERY weak performance from dart's MD5 hashing. the following code computes about 10 million hashes, and takes roughly 18-20 seconds on my macbook, whereas the equivalent python code (using hashlib) runs in under 5 second.

import "dart:io";
import "package:crypto/crypto.dart";

main(List<String> args) {
  MD5 md5;
  String text;
  String privKey = "yzbqklnj";
  String digest = "";
  List<int> bytes;
  int counter = 0;
  while (!digest.startsWith("000000")) {
    counter ++;
    text = privKey + counter.toString();
    md5 = new MD5();
    md5.add(text.codeUnits);
    bytes = md5.close();
    digest = CryptoUtils.bytesToHex(bytes);
  }
  print(counter);
}

of course hashlib is implemented in C, so the speed is to be expected, but it also clearly demonstrates that dart is nowhere near the same ballpark without implementing this stuff in C.

sethladd commented 8 years ago

Thanks for the report. Would you mind providing the version of Dart that you used? Did you have checked mode turned on? And, what version of crypto?

andreleblanc-wf commented 8 years ago

I actually just updated my comment, seems there was an issue with my initial python run, and the real number is closer to 5 seconds vs. dart at 20 seconds. this is using dart 1.13.0. The numbers are a little more reasonable now, but it's still a significant performance hit.

sethladd commented 8 years ago

Thanks for the additional comments. I'm not sure this issue should be closed. Reopening.

cc @sgmitrovic if he's curious.

scheglov commented 8 years ago

I see that *_MD5Sink.updateHash allocates millions on _Mint objects, which causes GC, and probably slow performance. I compute MD5 of a fairly small list of bytes, and MD5 computation takes almost 50% of total time in my app. Why isn't it inlined to use unboxed values?!

cachapa commented 4 years ago

Is this issue still being worked on? I'm still experiencing ~10x slower performance on MD5 compared to the native shell command line for large files (in my example, 121MB).

On my laptop (Ubuntu 19.10):

$ dart2native dart_md5.dart -o dart_md5
$ time dart_md5 video.mpg
54cd56fa1fcf144b21d48357e4e694a3

real    0m1,498s
user    0m1,584s
sys 0m0,038s

$ time md5sum video.mpg
54cd56fa1fcf144b21d48357e4e694a3  video.mpg

real    0m0,191s
user    0m0,187s
sys 0m0,004s

On my Raspberry Pi (raspbian):

$ dart2native dart_md5.dart -o dart_md5
$ time dart_md5 video.mpg
54cd56fa1fcf144b21d48357e4e694a3

real    0m7.923s
user    0m7.972s
sys 0m0.181s

$ time md5sum video.mpg
54cd56fa1fcf144b21d48357e4e694a3  video.mpg

real    0m0.646s
user    0m0.514s
sys 0m0.133s

Here's the relevant part of my code:

  static Future<String> _hash(File file) async =>
      (await md5.bind(file.openRead()).first).toString();

Dart VM version: 2.7.0

scheglov commented 2 days ago

It is still very slow in Dart.

https://pub.dev/packages/hashlib#benchmarks

MD5 | 168.39 MB/s
XXH128 | 119.32 MB/s

For comparison on Rust

MD5 104857600 bytes * 1 iterations: 179 ms, Speed: 557.3 MiB/s
SHA1 104857600 bytes * 1 iterations: 108 ms, Speed: 922.0 MiB/s
SHA256 104857600 bytes * 1 iterations: 290 ms, Speed: 344.0 MiB/s
xxHash64 104857600 bytes * 1 iterations: 3 ms, Speed: 29021.6 MiB/s
xxHash128 104857600 bytes * 1 iterations: 3 ms, Speed: 29704.1 MiB/s
MurmurHash3 104857600 bytes * 1 iterations: 27 ms, Speed: 3653.0 MiB/s
MD5 5242880 bytes * 10 iterations: 86 ms, Speed: 575.7 MiB/s
SHA1 5242880 bytes * 10 iterations: 52 ms, Speed: 947.0 MiB/s
SHA256 5242880 bytes * 10 iterations: 145 ms, Speed: 344.5 MiB/s
xxHash64 5242880 bytes * 10 iterations: 1 ms, Speed: 32948.0 MiB/s
xxHash128 5242880 bytes * 10 iterations: 1 ms, Speed: 32645.1 MiB/s
MurmurHash3 5242880 bytes * 10 iterations: 13 ms, Speed: 3734.6 MiB/s
MD5 1024 bytes * 5000 iterations: 9 ms, Speed: 530.7 MiB/s
SHA1 1024 bytes * 5000 iterations: 5 ms, Speed: 852.7 MiB/s
SHA256 1024 bytes * 5000 iterations: 15 ms, Speed: 315.6 MiB/s
xxHash64 1024 bytes * 5000 iterations: 0 ms, Speed: 30717.6 MiB/s
xxHash128 1024 bytes * 5000 iterations: 0 ms, Speed: 29533.2 MiB/s
MurmurHash3 1024 bytes * 5000 iterations: 1 ms, Speed: 3319.9 MiB/s
MD5 10240 bytes * 500 iterations: 8 ms, Speed: 565.2 MiB/s
SHA1 10240 bytes * 500 iterations: 5 ms, Speed: 944.3 MiB/s
SHA256 10240 bytes * 500 iterations: 14 ms, Speed: 336.6 MiB/s
xxHash64 10240 bytes * 500 iterations: 0 ms, Speed: 33178.9 MiB/s
xxHash128 10240 bytes * 500 iterations: 0 ms, Speed: 32247.5 MiB/s
MurmurHash3 10240 bytes * 500 iterations: 1 ms, Speed: 3714.1 MiB/s
MD5 10 bytes * 100000 iterations: 11 ms, Speed: 85.6 MiB/s
SHA1 10 bytes * 100000 iterations: 7 ms, Speed: 127.7 MiB/s
SHA256 10 bytes * 100000 iterations: 18 ms, Speed: 52.2 MiB/s
xxHash64 10 bytes * 100000 iterations: 0 ms, Speed: 4630.4 MiB/s
xxHash128 10 bytes * 100000 iterations: 0 ms, Speed: 2886.6 MiB/s
MurmurHash3 10 bytes * 100000 iterations: 1 ms, Speed: 730.6 MiB/s

It looks that theoretically it could be 29704.1 / 168.39 = 176 times faster! One tiny problem... It cannot be on Dart.

Rust benchmark ```rust use md5; use sha1::{Sha1, Digest as Sha1Digest}; use sha2::{Sha256, Digest as Sha2Digest}; use xxhash_rust::xxh3::{xxh3_64, xxh3_128}; use murmur3::murmur3_x64_128; use rand::Rng; use std::time::{Duration, Instant}; use hex::encode as hex_encode; use std::io::Cursor; fn main() { // Buffer sizes and iterations let data_sizes = [ (100 * 1024 * 1024, 1), // 100 MiB * 1 (5 * 1024 * 1024, 10), // 5 MiB * 10 (1 * 1024, 5000), // 1 KiB * 5000 (1 * 10240, 500), // 10 KiB * 500 (10, 100000), // 10 bytes * 100000 ]; for (data_size, iterations) in &data_sizes { let mut rng = rand::thread_rng(); let byte_list: Vec = (0..*data_size).map(|_| rng.gen()).collect(); // Benchmark MD5 let md5_duration = benchmark_md5(&byte_list, *iterations); print_benchmark_results("MD5", *data_size, *iterations, md5_duration); // Benchmark SHA1 let sha1_duration = benchmark_sha1(&byte_list, *iterations); print_benchmark_results("SHA1", *data_size, *iterations, sha1_duration); // Benchmark SHA256 let sha256_duration = benchmark_sha256(&byte_list, *iterations); print_benchmark_results("SHA256", *data_size, *iterations, sha256_duration); // Benchmark xxHash 64-bit let xxhash64_duration = benchmark_xxhash64(&byte_list, *iterations); print_benchmark_results("xxHash64", *data_size, *iterations, xxhash64_duration); // Benchmark xxHash 128-bit let xxhash128_duration = benchmark_xxhash128(&byte_list, *iterations); print_benchmark_results("xxHash128", *data_size, *iterations, xxhash128_duration); // Benchmark MurmurHash3 let murmur_duration = benchmark_murmur3(&byte_list, *iterations); print_benchmark_results("MurmurHash3", *data_size, *iterations, murmur_duration); } } fn print_benchmark_results(hash_name: &str, data_size: usize, iterations: usize, duration: Duration) { let total_size = data_size * iterations; let speed = total_size as f64 / (duration.as_secs_f64() * 1024.0 * 1024.0); println!( "{} {} bytes * {} iterations: {} ms, Speed: {:.1} MiB/s", hash_name, data_size, iterations, duration.as_millis(), speed ); } fn benchmark_md5(bytes: &[u8], iterations: usize) -> Duration { let start = Instant::now(); for _ in 0..iterations { let _result = md5::compute(bytes); } start.elapsed() } fn benchmark_sha1(bytes: &[u8], iterations: usize) -> Duration { let start = Instant::now(); for _ in 0..iterations { let mut hasher = Sha1::new(); hasher.update(bytes); let _result = hasher.finalize(); } start.elapsed() } fn benchmark_sha256(bytes: &[u8], iterations: usize) -> Duration { let start = Instant::now(); for _ in 0..iterations { let mut hasher = Sha256::new(); hasher.update(bytes); let _result = hasher.finalize(); } start.elapsed() } fn benchmark_xxhash64(bytes: &[u8], iterations: usize) -> Duration { let start = Instant::now(); for _ in 0..iterations { let _result = xxh3_64(bytes); } start.elapsed() } fn benchmark_xxhash128(bytes: &[u8], iterations: usize) -> Duration { let start = Instant::now(); for _ in 0..iterations { let _result = xxh3_128(bytes); } start.elapsed() } fn benchmark_murmur3(bytes: &[u8], iterations: usize) -> Duration { let start = Instant::now(); for _ in 0..iterations { let _result = murmur3_x64_128(&mut Cursor::new(bytes), 0).unwrap(); } start.elapsed() } ```