alecthomas / go_serialization_benchmarks

Benchmarks of Go serialization methods
https://alecthomas.github.io/go_serialization_benchmarks/
1.56k stars 158 forks source link

BENC's unsafe code #130

Closed ymz-ncnk closed 8 months ago

ymz-ncnk commented 8 months ago

BENC's unsafe code uses two interesting functions, b2s() and s2b(), whose documentation states "... Note that it may break if the string and/or fragment header is changed in future versions of go." It turns out that using bunsafe code, we won't be able to switch to a new Golang version in a case, for example, when a vulnerability is discovered in the current version.

If so, the usefulness of benchmarks for such code seems questionable to me, considering the fact that Golang developers provide a reliable method for unsafe string conversion.

deneonet commented 8 months ago

The code is from here, that is 4 years old and still the code works, so you are able to switch to a new Golang version. I don't think that the slice/string headers will change in the future.

deneonet commented 8 months ago

Also, as this website states: "The unsafe package allows developers to bypass Go's type safety and memory safety restrictions, enabling operations that are typically disallowed. While this can lead to more efficient code, it also increases the likelihood of bugs and vulnerabilities.". MUS uses the unsafe package, also allowing bugs and vulnerabilities, so is this not an argument against your own serializer?

alecthomas commented 8 months ago

As long as people are aware of the use of unsafe, I think it's fine. The slice/string headers haven't changed since the release of Go, and while they could, that seems like a vanishingly small concern. But again, as long as people are aware of it.

alecthomas commented 8 months ago

FTR Cap'nProto also uses unsafe casts (via this lib).

ymz-ncnk commented 8 months ago

this website gives 404

ymz-ncnk commented 8 months ago

Yes, MUS uses unsafe code, just like other projects, unsafe code can be found even in math.Float64bits(). But this is not the point. What I'm trying to say is that there is a correct way to do unsafe string conversion. And this is not just my opinion. Here you can read:

"...(string)(unsafe.Pointer(&mySlice)), which is never actually officially documented anywhere as something that can be relied upon. Under the hood, the shape of a string is less than a slice, so this seems valid per unsafe rule (1), but this is all relying on undocumented behavior. The second use case is commonly seen as ([]byte)(unsafe.Pointer(&string)), which is by-default broken because the Cap field can be past the end of a page boundary (example here, in widely used code) -- this violates unsafe rule (1)."

Also here you can find recomendation:

"After we deprecated reflect.{SliceHeader, StringHeader}, it is recommended to use unsafe.{Slice, String} to replace its work. ...".

The Golang specification also states:

"A package using unsafe must be vetted manually for type safety and may not be portable."

, i.e. a special care should be taken.

And with all this you say, that everything is ok with (string)(unsafe.Pointer(&mySlice)). Golang developers probably had nothing to do when they implemented unsafe.String(), unsafe.StringData() functions.

alecthomas commented 8 months ago

While I think it would be preferable to use the unsafe package functions, other encoders use the same thing as BENC (as I said, capnproto does exactly the same thing - *(*string)(unsafe.Pointer(&b))), so it's not an issue. If you want to suggest that BENC switch to the unsafe package, file a ticket against it.

alecthomas commented 8 months ago

Basically - please stop having this argument on this repo. If you want to discuss it, discuss it on the BENC repo.

ymz-ncnk commented 8 months ago

Ok, sorry. Not the right place.

ymz-ncnk commented 8 months ago

As long as people are aware of the use of unsafe, I think it's fine.

Maybe #122 should also be closed?