cessen / ropey

A utf8 text rope for manipulating and editing large texts.
MIT License
1.04k stars 46 forks source link

Typo in node size calculations #35

Closed Plisp closed 3 years ago

Plisp commented 3 years ago

Is it perhaps an error that this line is pub(crate) const MAX_BYTES: usize = TARGET_NODE_SIZE - 1 - (PTR_SIZE * 2); rather than pub(crate) const MAX_BYTES: usize = TARGET_NODE_SIZE - size_of::<usize>()*2; or pub(crate) const MAX_BYTES: usize = 1024 - 1 - size_of::<usize>() - (PTR_SIZE * 2); as you have already subtracted the arc counters in TARGET_NODE_SIZE, and the layout is actually [ smallvec enum tag ][ smallvec capacity ][ BackingArray ] except when the "union" feature is enabled.

I found that this allows the correct maximum of 1024-16 = 1008 bytes to be achieved on my 64-bit system.

Plisp commented 3 years ago

Actually for a target size of 1024 the calculation happens to be correct because of Node's enum tag. Up to you whether it's worth changing or documenting. I found it a bit confusing when reading.

cessen commented 3 years ago

Actually, revisiting this, I'm not sure why or if this is correct anymore. Part of what makes this tricky is that we're relying on the internal layout of the structs in other crates (and stdlib) which are not guaranteed to be stable. But also, I may have just gotten things wrong.

Either way, this definitely needs better documentation! I'll figure it out and document it before the next release.

Thanks!

cessen commented 3 years ago

Fixed in 4273df030961bb60473958e206ae2ddc0ec35bfb and 26db388807ccecc157825adf928634741c52ed1a.

Indeed, it turned out the constant calculations weren't correct, for a variety of reasons including not accounting properly for the how the memory alignment of various types interact.

In addition to fixing the issue, I also added a compile-time assert that ensures everything ends up with the right size. I also tested this on both x86-64 and wasm32 to make sure it works on platforms with different pointer/usize sizes.