rust-bitcoin / rust-bech32

Bech32 format encoding and decoding
MIT License
93 stars 49 forks source link

Use pass-by-copy for Hrp when possible #148

Closed tcharding closed 10 months ago

tcharding commented 11 months ago

We currently have some code that uses pass-by-reference and some that uses pass-by-copy.

Further investigation shows that when messing with iterators we have to have a reference to the Hrp so that the iterator has somewhere in memory to point to while iterating.

The iterator API is intentionally low level, we can make the higher level APIs more ergonomic by using pass-by-copy.

apoelstra commented 11 months ago

I feel like if it implements Copy then we ought to copy it. And if it's too big to pass by reference then we ought to remove Copy from it.

What threshold do you have in mind for copying by reference vs copy?

tcharding commented 11 months ago

I feel like if it implements Copy then we ought to copy it. And if it's too big to pass by reference then we ought to remove Copy from it.

I thought a struct was Copy if all its fields were Copy irrespective of how big it is. Isn't Copy more about how cheap/expensive it is to do than about how big it is? Hrp is just an array so its cheap to copy but we still don't want to do so in every function call because its big, right? (Definition of "big" below.)

What threshold do you have in mind for copying by reference vs copy?

I had 32 bytes in mind, based on discussion a few years ago about passing hashes by copy.

apoelstra commented 11 months ago

I thought a struct was Copy if all its fields were Copy irrespective of how big it is.

You need to explicitly derive Copy, and if you don't want people to copy your structure, you shouldn't implement it. If some fields are non-Copy then you can't derive it, but you're never obilgated to derive it.

32 bytes seems a little small, though I could believe that that's roughly where the performance starts to equal out.

In this case though I don't think the performance benefit (if there is one) outweighs the loss of ergonomics. It's not like anybody would ever be manipulating Hrps in a tight loop.

tcharding commented 11 months ago

I'm convinced. I'll change this to "remove pass-by-reference"

Kixunil commented 10 months ago

Regarding Copy, just forget that terrible name. What it should've been called is WillNotImplDrop. Then the decision is clear: if a type will never implement Drop it should be Copy.

apoelstra commented 10 months ago

@Kixunil it also shouldn't be Copy if it has state that we don't want the user to accidentally duplicate (e.g. iterators) or if it's massive (e.g. contains 20 fields which are all 100-byte arrays). In these cases the name Copy seems to give the right intuition.

Kixunil commented 10 months ago

state that we don't want the user to accidentally duplicate

That to me feels like kind of an abuse of the type system but since there's Clone it's not a big deal. And yeah, we should follow the convention.

or if it's massive

What is the justification for this? The compiler will produce exactly the same amount of memcpy calls with Copy or without it. I don't know of any std type or official guidance saying one should not impl Copy for big types.

apoelstra commented 10 months ago

The compiler will produce exactly the same amount of memcpy calls with Copy or without it. I

If Copy is missing then users will be forced to pass-by-reference in most cases.

Kixunil commented 10 months ago

Which will also produce memcpy calls whenever they need owned value inside the function.

I think this is not something for libraries to decide, it's better to have lints and such if any actual performance benefit can be identified.

apoelstra commented 10 months ago

You can't get an owned value from a borrow of a non-Copy type.

I think this is not something for libraries to decide,

I believe we're only making this decision for our own internal code. And for our public APIs we have to decide whether to take a value by reference or by value.

Kixunil commented 10 months ago

You can if it's Clone, which does memcpy underneath for types with derived Clone that could've been Copy.

I believe we're only making this decision for our own internal code.

I'd be fine with this.