RustCrypto / stream-ciphers

Collection of stream cipher algorithms
255 stars 49 forks source link

salsa20: revert sse2 #346

Closed baloo closed 6 months ago

baloo commented 6 months ago

This reverts #328.

The changes introduced here generate failure when used in scrypt. The scrypt_block_mix would generate a different value.

I'm not able to figure out why that change breaks scrypt. Reverting until we can figure out why.

baloo commented 6 months ago

cc @oxarbitrage Sorry for the set back. I'd love your help on fixing this. This triggered on https://github.com/RustCrypto/password-hashes/commit/527596328f38fc78ee3194986752a50857bb334c see https://github.com/RustCrypto/password-hashes/pull/489 for some more context.

tarcieri commented 6 months ago

Agreed we should remove this for now.

Perhaps it would be possible to add proptests between the two implementations to ensure they function equivalently.

Turning the inputs/outputs from the linked failures into a regression test would be good as well.

baloo commented 6 months ago

https://github.com/baloo/salsa20-inconsistency I exported the issue in a separate repository, I need to drill down and make a smaller case.

baloo commented 6 months ago

Last commit adds a regression test, the output of it on master:

---- salsa20_regression_2024_03 stdout ----
thread 'salsa20_regression_2024_03' panicked at salsa20/tests/mod.rs:196:5:
assertion `left == right` failed
  left: [67, 18, 156, 164, 195, 118, 115, 157, 115, 48, 192, 60, 107, 174, 210, 224, 204, 177, 66, 148, 67, 31, 250, 186, 106, 40, 72, 195, 139, 250, 227, 103, 56, 53, 99, 150, 30, 74, 89, 28, 86, 189, 143, 207, 201, 216, 201, 199, 229, 123, 19, 0, 18, 57, 218, 21, 184, 43, 70, 154, 245, 13, 48, 192]
 right: [102, 163, 212, 163, 47, 134, 235, 142, 174, 254, 90, 162, 92, 181, 255, 26, 172, 145, 23, 125, 208, 63, 17, 73, 121, 208, 66, 241, 86, 88, 165, 5, 3, 91, 144, 209, 85, 159, 29, 208, 194, 206, 175, 48, 20, 18, 151, 41, 253, 214, 151, 207, 148, 209, 97, 22, 88, 139, 39, 28, 208, 61, 155, 66]
oxarbitrage commented 6 months ago

Thank you for tagging me and for the regression test to expose the issue, I can take a look in about 24 hours and let you know if i can make a fix.

oxarbitrage commented 6 months ago

I see the reverse was already merged, that gives me some more time. I was not able to find the problem in a quick lookup however i will find out. Ill keep you posted here.

baloo commented 6 months ago

We needed to merge the revert because that was blocking the dependencies. But this isn't going to be released as stable quite yet. You have some time :)

oxarbitrage commented 6 months ago

We needed to merge the revert because that was blocking the dependencies. But this isn't going to be released as stable quite yet. You have some time :)

Yea, that is fine and understandable, ill figure out whats going on there. Thanks you again.

oxarbitrage commented 6 months ago

It seems that the SSE2 setup i did will only work for salsa20 with 10 double rounds (salsa20/20). The scrypt uses the salsa20/8 version.

A quick workaround is to call the software backend when we are in the SSE2 mode and using other than salsa20/20. This call can be around here, something like:

    if !is_salsa20(&backend) {
        f.call(&mut crate::backends::soft::Backend(&mut SalsaCore::<R> {
            state: *state,
            rounds: PhantomData,
        }));
    }
    else {
        f.call(&mut backend);
        state[8] = _mm_cvtsi128_si32(backend.v[2]) as u32;
    }

A function to know if we are in salsa20/20 can be coded as:

fn is_salsa20(t: &dyn Any) -> bool {
    core::any::TypeId::of::<Backend<cipher::typenum::U10>>() == t.type_id()
}

This is far from ideal as we loss SSE2 optimization for salsa20/8. I can try to make that work but it will take more time or, i can push a PR adding back SSE2, the quick fix for versions that are not salsa20/20 and documenting that the SSE2 optimization will only be used in salsa20/20.

I am also open to other alternatives, let me know.

tarcieri commented 6 months ago

You should be able to do:

if R::USIZE == 10

...which should allow the compiler to remove the branch as it can be determined at compile-time.

Otherwise I think falling back to the soft implementation for non-Salsa20/20 sounds fine for now.

Of course it would be really nice to get it working for Salsa20/8 for scrypt, since it's bottlenecked on that.

oxarbitrage commented 6 months ago

Otherwise I think falling back to the soft implementation for non-Salsa20/20 sounds fine for now.

Thanks. Ill do this for now and ill work in a possible implementation of the sse2 optimization for salsa20/8 with some more time. Will push a PR with this soon.