Open daurnimator opened 5 years ago
Several other languages use siphash-2-4 (or siphash-1-3), with randomized key on startup, as an attempt to avoid denial-of-service attacks (which is important for http servers).
refs:
I'd be happy with that. Zig even already has a SipHash implementation, so it should just be a matter of using it in the http.headers module. @cactus do you want to work on this?
@daurnimator Looking into it a bit, SipHash returns a u64
, but std.HashMap
expects a hash function to return a u32
.
Additionally, Headers.init
currently does not allow failure. I think that would need to change, as siphash needs a hash key (Siphash.hash
), which would likely best be generated from crypto.randomBytes
(which is an alias for std.os.getrandom
) or maybe one of the PRNGs (prngs need a seed, so randombytes would need to be called anyway?).
Since Siphash.hash
requires a hash key along with input, this seems to make its use in std.HashMap
a bit awkward, as std.HashMap
wants comptime hash: fn (key: K) u32
but SipHash.has
provides pub fn hash(key: []const u8, input: []const u8) T
. Not exactly sure how to make the two mesh cleanly... I have an idea but it seems a bit awkward -- basically a wrapper around SipHash that converts it to the required api signature.
Finally, std.HashMap
returns a type
, which makes it a bit confusing for me to figure out how to reference it's return type as a param for the headers struct. Would you just specify type
as the type?
I was reading the docs and it said that a fn that retuns a struct (anonymous struct), the struct will be named the same as the function. In that case, would something like this be recommended?
// this would return something with the right /signature/, even though we wouldn't use it...
const HeaderIndex = std.AutoHashMap([]const u8, HeaderIndexList);
pub const Headers = struct {
// the owned header field name is stored in the index as part of the key
allocator: *Allocator,
data: HeaderList,
index: HeaderIndex, // <-- using the signature
// this is a hypothetical wrapper around siphash to deal with the 3rd issue noted above
hasher: HeaderHasher,
const Self = @This();
pub fn init(allocator: *Allocator) !Self {
const hasher = try HeaderHasher.init(allocator);
// same signature as AutoHashMap would return?
const index_hashmap = std.HashMap(
[]const u8,
HeaderIndexList,
hasher.hash,
std.getAutoEqlFn([]const u8),
);
return Self{
.allocator = allocator,
.data = HeaderList.init(allocator),
.index = index_hashmap.init(allocator),
.hasher = hasher,
};
}
Any thoughts?
SipHash returns a u64, but std.HashMap expects a hash function to return a u32.
Just @truncate
the result.
Additionally,
Headers.init
currently does not allow failure. I think that would need to change, as siphash needs a hash key (Siphash.hash
), which would likely best be generated fromcrypto.randomBytes
(which is an alias forstd.os.getrandom
) or maybe one of the PRNGs (prngs need a seed, so randombytes would need to be called anyway?).
Have Headers.init
take a seed argument and then just pass it along to siphash.
You could add a second constructor Headers.initWithRandom
that can fail and calls std.os.getrandom
Finally,
std.HashMap
returns atype
, which makes it a bit confusing for me to figure out how to reference it's return type as a param for the headers struct. Would you just specifytype
as the type?
Have index_hashmap
at the top level. You should be able to get the hasher.hash
function via e.g. HeaderHasher.hash
instead.
Also have a look at open PR ziglang/zig#2797
SipHash returns a u64, but std.HashMap expects a hash function to return a u32.
Just
@truncate
the result.
Ah. I guess that works.
Additionally,
Headers.init
currently does not allow failure. I think that would need to change, as siphash needs a hash key (Siphash.hash
), which would likely best be generated fromcrypto.randomBytes
(which is an alias forstd.os.getrandom
) or maybe one of the PRNGs (prngs need a seed, so randombytes would need to be called anyway?).Have
Headers.init
take a seed argument and then just pass it along to siphash. You could add a second constructorHeaders.initWithRandom
that can fail and callsstd.os.getrandom
Ok.
Finally,
std.HashMap
returns atype
, which makes it a bit confusing for me to figure out how to reference it's return type as a param for the headers struct. Would you just specifytype
as the type?Have
index_hashmap
at the top level. You should be able to get thehasher.hash
function via e.g.HeaderHasher.hash
instead.
Not sure that will work, as the index_hashmap has to exist first or the type signature will be wrong due to the .hash()
method taking a self
argument in my case?
Indeed, I get this:
error: expected 2 arguments, found 3
I'll look at it further though.
Also have a look at open PR ziglang/zig#2797
I'm not very familiar with Wyhash and how well it avoids the DoS issues that SipHash attempts to avoid. Looks like the author there ran into the same issue with the seed, and just hardcoded for now. I'll definitely keep an eye on that PR though, thanks for pointing it out.
I'm not very familiar with Wyhash and how well it avoids the DoS issues that SipHash attempts to avoid. Looks like the author there ran into the same issue with the seed, and just hardcoded for now.
Wyhash source says nothing about DoS so I would not assume it is secure in this regard. My reasonning in this PR was that the usecase hashmaps have to be optimized for, ie. the most common, is small keys. And that applications needing specific hash functions will anyway use their own. In this case, we don't really care about the seed being hardcoded, as it's a best effort without strong guaranties.
However it can be debated wether we would instead want the default hash be a best effort for security.
With respect to hash tables, hash functions, and security, I am aware of Reini Urban's (@rurban) comments at the rurban/smhasher
repository, which may be relevant.
The http.headers type
HeaderIndex
currently usesAutoHashMap
, it should use a better hash function, possibly parametized.Originally posted by @daurnimator in #2263