rust-bitcoin / bitcoin_hashes

Simple library which implements a few hashes and does nothing else
Creative Commons Zero v1.0 Universal
66 stars 52 forks source link

FromHex seems overly restrictive #146

Closed Kixunil closed 1 year ago

Kixunil commented 2 years ago

I just noticed that FromHex requires DoubleEndedIterator and ExactSizeIterator. I think being reversed should be dealt with in the function by reversing the destination, not source and don't see how exact size is required. Same information can be obtained from size_hint and I don't see an advantage of requiring it in type if we can't rely on it to optimize anyway.

apoelstra commented 2 years ago

How can you "reverse the destination" of e.g. a fmt::Write or a streaming socket?

I think ExactSizeIterator is there so that we can check that the length of the input is even.

Kixunil commented 2 years ago

My understanding is that FromHex is only ever used in constructors of types which are backed by arrays. But frankly I didn't do an in-depth review, so maybe I'm wrong. And maybe it'd be useful to have two traits or a parametric trait.

What happens if the value from ExactSizeIterator doesn't match yhe number of actually returned items? There has to be a check of real length anyway (no UB allowed), so it seems to make sense to skip checking "theoretical" length.

apoelstra commented 2 years ago

Interestingly, if ExactSizeIterator lies, we currently do not detect this or do anything useful in response -- if the iterator ends early we will return a hash with zeroes in the missing places, and if it goes long we'll never notice. This seems perfectly fine to me -- if you implement a standard trait incorrectly you should expect broken (but safe) things to happen.

If we did not require ExactSizeIterator, this would be an API change where we no longer check that iterators have exactly the correct length; in the case of iterators with trailing bytes we would need to decide what to do; just ignore them? I'm not sure.

Looking into the code, I do think that, given the way we use FromHex that we could drop the DoubleEndedIterator bound and just reverse the data after it has been parsed into a hash. I guess the cost would be an extra nanosecond or two, maybe :). This is probably useful for users who are parsing binary data from a stream where they don't want to deal with extra caching.

Kixunil commented 2 years ago

will return a hash with zeroes in the missing places

This seems pretty bad to me in cryptographic code. At least debug assertion would be nice.

just ignore them? I'm not sure

Good question. In a crate I'm writing I have two methods for somewhat similar scenarios: one has _min suffix and ignores the remaining data, the one without suffix panics. Maybe this is the way to go. Also it's a bit less invasive change.

reverse the data after it has been parsed into a hash

Where would you need to do that? If you have an array/slice, you can copy in reverse: dst.iter().rev().zip(src).for_each(|(dst, src)| *dst = *src);

This is probably useful for users who are parsing binary data from a stream where they don't want to deal with extra caching.

Yes, that's a good example.

apoelstra commented 2 years ago

If you have an array/slice, you can copy in reverse: dst.iter().rev().zip(src).for_each(|(dst, src)| dst = src);

I'm not sure what you mean. We don't have an array or slice. We have an iterator.

This seems pretty bad to me in cryptographic code. At least debug assertion would be nice.

Agreed for sure.

Kixunil commented 2 years ago

self.0 is usually an array and FromHex is used to construct Self.

apoelstra commented 2 years ago

Neither of the methods in FromHex have a self parameter.

Kixunil commented 2 years ago

Sure, I meant data that will be stored there when the function returns.

apoelstra commented 1 year ago

Closing. This crate is moving to rust-bitcoin and we have a number of hex discussions over there.