tafia / quick-protobuf

A rust implementation of protobuf parser
MIT License
452 stars 87 forks source link

Fix UB in read_packed_fixed() #240

Closed snproj closed 1 year ago

snproj commented 1 year ago

Problem: UB in read_packed_fixed()

Closes #215 (Thanks @saethlin for finding this)

Solution

Breaking fix: In packed fixed fields, replace Cow<'a, T> with a similar custom struct PackedFixed<'a, T>, which exposes alignment-safe reading methods.

Other fields will still use Cow<'a, T>.

PackedFixed

Like Cow, PackedFixed is an enum. It has the following variants:

Optimization?

There is also the fact that PackedFixed::Owned will always contain an aligned vector while PackedFixed::Borrowed contains a reference to bytes that might be unaligned, and so every read must go through read_unaligned().

In fact, the original names for the variants were PackedFixed::Aligned and PackedFixed::MaybeUnaligned instead of PackedFixed::Owned and PackedFixed::Borrowed.

From my own testing attempts, I can't really tell whether there are speed/compiler optimizations when reading from PackedFixed::Owned instead of PackedFixed::Borrowed. There does appear to be some rather big slowdowns (about 3x) of reads from PackedFixed::Borrowed depending on where that code is placed (a rather arbitrary factor), but I can't tell if this is an artifact of my own testing or not. However, methods to convert PackedFixed::Borrowed to PackedFixed::Owned are provided.

Iterators

T refers to the data type that is encoded by the raw bytes.

There are two iterators provided:

Returning References

Because read_unaligned() does a bitwise copy and returns a new value, the PackedFixed::Borrowed variant cannot return references to the data that is encoded by its referenced bytes. It might technically be possible to implement methods that do for the enum in general (either panic when called on PackedFixed::Borrowed variant or convert to PackedFixed::Owned first) but I thought this would be rather confusing.