KillingSpark / zstd-rs

zstd-decoder in pure rust
MIT License
253 stars 34 forks source link

Move twox-hash dependency to dev-dependencies #52

Closed tamird closed 1 year ago

tamird commented 1 year ago

Looks like it is only used for corpus testing. It'd be good to make the hasher optional - either through a type parameter or using a dyn box. Both options will require a good bit of plumbing.

KillingSpark commented 1 year ago

Looks like it is only used for corpus testing

No, the hash is part of the actual frame format as defined by the zstd project. This is not an optional thing, it should be done for any frame that contains a checksum

KillingSpark commented 1 year ago

I mean technically checking the checksum is optional but I would like to do this check if the checksum is included in the frame. You could make that optional but I think first there should be a benchmark showing that the hash is actually a significant cost compared to the decoding of the frame content

tamird commented 1 year ago

I think I must be missing something - the present implementation doesn't check the hash, does it? Some tests do, but do users of this library know they're supposed to do it?

KillingSpark commented 1 year ago

I mean it's part of the public API. Do you think that the docs should point this out more clearly? I thought that the people that cared about that stuff would know that zstd supports this feature and would look for it in the docs

tamird commented 1 year ago

I think I'm missing something - everyone is paying the cost of doing the hashing, even if the result is never read. Shouldn't we make that optional? We could put it behind a feature that's enabled by default.

KillingSpark commented 1 year ago

everyone is paying the cost of doing the hashing

No you're not missing anything this is true. The hashing is very fast though compared to the decoding of the compressed data which is why I never bothered. I don't think it even showed up in the profiling when I was looking for bottlenecks.

Which doesn't mean this isn't still an unnecessary cost for the people that don't care about it. Hiding it behind a feature seems reasonable. No need for a big plumbing refactor, just a few codeblocks that need to get behind a #[cfg]

tamird commented 1 year ago

Sounds good. I'll update #53 to do that. Thanks!

tamird commented 1 year ago

Updated.

tamird commented 1 year ago

I think we're missing some test coverage for the new feature. Something like https://github.com/gimli-rs/gimli/commit/19e48f899c7e7df2afd136ef31fae342f613f912 would be good.

KillingSpark commented 1 year ago

Yep I'm aware. I was trying to figure out the right CI config to automatically test all combinations of features but the github CI yaml is pretty unintuitive for me. I might just add them manually as in the commit you referenced

Edit: nevermind should have taken a closer look. This seems nifty, I'll take closer look at this!

tamird commented 1 year ago

The commit I linked uses cargo-hack to test all combinations, I think that's the way to go.