Use the base64::BCRYPT alphabet

Zireael-N commented 4 years ago

Since this code actually works with ASCII characters, using an array of 256 bytes seems like an obvious optimization.

While accessing a random element of a hashmap is O(1) (best case), it needs to calculate a hash. That's not the case for an array, so it's more performant. It also occupies less memory.

Keats commented 4 years ago

Is there a significant perf difference? This is much harder to understand than a hashmap.

Zireael-N commented 4 years ago

It's 2.5x-3x faster on my dual-core 1.3 Ghz notebook:

test decode_bcrypt_base64_array      ... bench:         339 ns/iter (+/- 97)
test decode_bcrypt_base64_hashmap    ... bench:       1,050 ns/iter (+/- 223)
test encode_to_bcrypt_base64_array   ... bench:         362 ns/iter (+/- 71)
test encode_to_bcrypt_base64_hashmap ... bench:         829 ns/iter (+/- 182)

But on the scale of the entire algorithm, the difference is pretty miniscule.

Generating those array LUTs from hashmaps using build.rs should make this easier to understand, what do you think?

Benchmarking code:

#[bench]
fn decode_bcrypt_base64_hashmap(b: &mut test::Bencher) {
    b.iter(|| b64_original::decode("YETqZE6eb07wZEO"));
}

#[bench]
fn decode_bcrypt_base64_array(b: &mut test::Bencher) {
    b.iter(|| b64_array::decode("YETqZE6eb07wZEO"));
}

#[bench]
fn encode_to_bcrypt_base64_hashmap(b: &mut test::Bencher) {
    b.iter(|| b64_original::encode(b"hello world"));
}

#[bench]
fn encode_to_bcrypt_base64_array(b: &mut test::Bencher) {
    b.iter(|| b64_array::encode(b"hello world"));
}

Keats commented 4 years ago

I think a build.rs approach would be better to keep it readable. I'm wondering if we can define a custom https://docs.rs/base64/0.12.0/base64/ config and use that instead what the speed difference would be like.

Zireael-N commented 4 years ago

I looked into that a while ago, unfortunately they don't provide a way to define your own alphabet: https://docs.rs/base64/0.12.0/base64/enum.CharacterSet.html.

Zireael-N commented 4 years ago

Force-pushed an implementation with build.rs.

Keats commented 4 years ago

I've asked if the base64 crate would be willing to add the bcrypt alphabet

Keats commented 4 years ago

https://github.com/marshallpierce/rust-base64/issues/87#issuecomment-624166133 looks like it would be possible. If we can get rid of all base64 code from that crate, that would be a big +

Zireael-N commented 4 years ago

Are you working on that PR?

Keats commented 4 years ago

I might do it later this week but not now, feel free to do it if you have time.

Zireael-N commented 4 years ago

It's merged in, should be available in 0.12.1.

Zireael-N commented 4 years ago

Repurposed the PR to instead use that functionality.

Keats commented 4 years ago

I like that a lot, thanks @Zireael-N !

Keats / rust-bcrypt

Use the base64::BCRYPT alphabet #44