Closed easyaspi314 closed 4 years ago
See discussion in #361.
There are a few options.
The first and simplest option which has no changes to the behavior is this:
XXH_PUBLIC_API XXH_errorcode
XXH3_copyState(XXH3_state_t* dst, const XXH3_state_t* src)
{
XXH_memcpy(dst, src, sizeof(XXH3_state_t));
if (src->secret == &src->customSecret[0]) {
dst->secret = &dst->customSecret[0];
}
return XXH_OK;
}
This is a little hacky though.
The second option is to use one of the reserved fields to tell which kind of state it is, and use that instead of comparing raw pointers.
While we are doing this, we could add error checking to prevent mixing 64-bit and 128-bit states. This would change the behavior, but that usage was not intended, anyways and can be considered UB.
The third option is to use a NULL
pointer for seeded mode, and if it is NULL, use customSecret
instead of secret
, but this would add a branch. However, there is already enough overhead from state management that it would likely make an unnoticeable difference.
The last option is to malloc()
a copy of the secret pointer.
This will change the ABI, as it is no longer possible to run it entirely on stack memory. It will also be highly redundant for kSecret
.
However, it will allow custom secrets to be invalidated after reset_withSeed()
, and allow us to shrink the size of the state.
Oops, accidentally hit the wrong button :facepalm:
I think there is a fourth option: Always copy the entropy pool into the structure; this would force the pool to be the size of the default secret though. That may be kind of nice though and it could form the basis for the offline entropy pool generation optimization.
uint64_t XXH3_64bits_withEntropyPool(XXH3_state_t &pool, uint8_t data, size_t size);
The implementation would "extend" the XXH3 state with the given data; immediately returning the hash without modifying the state…
EDIT This would preserve movability; which IMO is a great boon to usability and would be highly appreciated for a rust integration.
uint64_t XXH3_64bits_withEntropyPool(XXH3_state_t &pool, uint8_t data, size_t size);
error: expected ')'
...XXH3_64bits_withEntropyPool(XXH3_state_t &pool, uint8_t data,...
^
note: to match this '('
uint64_t XXH3_64bits_withEntropyPool(XXH3_state_t &pool, uint8_t...
^
1 error generated.
What is this? I don't know what you are trying to do there… :smirk:
Movability would be a nice feature, and would be easy with a fixed size secret.
Actually, everything would be easier (and faster) with a fixed size secret, but we don't want to mess up the API.
I think the best option is the NULL
pointer.
There are already 5 branches at the start of XXH3_update
. It doesn't hurt to do one more (especially since it only is needed when the buffer is filled)
It is also to be noted that x86_64 and i686 can do this without branching:
lea tmp, [state + offsetof (state::customSecret)]
test secret, secret
cmove secret, tmp
This is also branchless on ARM
cmp secret, #0
it eq
addeq secret, state, offsetof(state::customSecret)
and on AArch64:
add tmp, state, offsetof(state::customSecret)
cmp secret, #0
csel secret, tmp, secret, eq
Credit to @koraa.
Currently, this code exhibits a use after free issue:
This is due to
a->secret
pointing toa->customSecret
, andXXH3_copyState
does not update that pointer (it is literally justmemcpy
), sob->secret
points toa->customSecret
.After
a
is freed,b->secret
is now an invalid pointer.