golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.47k stars 17.59k forks source link

encoding/base64: integer overflow in (*Encoding).EncodedLen #20235

Open ekiru opened 7 years ago

ekiru commented 7 years ago

What version of Go are you using (go version)?

1.8.1

What operating system and processor architecture are you using (go env)?

linux/386

What did you do?

I called base64.RawStdEncoding.EncodedLen() with 1<<30 (2 to the 30th power) on a 32-bit platform, as in line 9 of https://play.golang.org/p/mV9110MxJ- , and base64.StdEncoding.EncodedLen() with 1<<31 / 4 * 3 (equal to 1,610,612,736) as in line 10 of the same example.

Line 10 also calls the method with the same argument on base64.StdEncoding for comparison.

What did you expect to see?

I expected EncodedLen() on RawStdEncoding to return a value close to that returned by StdEncoding for 1<<30I think 1,431,655,766 is the correct value, but I may have miscalculated) and EncodedLen() on StdEncoding to not return a negative encoded length for 1<<31 / 4 * 3 (perhaps to panic? I'm not certain what the appropriate behavior here is).

What did you see instead?

Instead the first example returned 0 and the second -2147483648. Neither of these values are sufficient to hold a base64 encoding of a string of the supplied lengths.

aead commented 7 years ago

This is (obviously) caused by an int overflow. Following EncodeLen function would make the behavior more consistent:

if enc.padChar == NoPadding {
      return int((int64(n)*8 + 5) / 6) // minimum # chars at 6 bits per char
}
return int((int64(n) + 2) / 3 * 4) // minimum # 4-char quanta, 3 bytes each

The question is: What does a negative length value mean / Should EncodeLen return a negative value?

bradfitz commented 7 years ago

This has always been like this, right? Moving to Go 1.10 unless this is a regression.

At some point things are going to overflow. It's not obvious what we should do. Panic?

aead commented 7 years ago

Yes, more or less since Go1 - so I think this should probably be moved to Go1.10 ... As you said, it's not obvious how we should handle overflowing properly. If EncodedLen() expected a slice the solution above would solve the overflow problem but a user can pass any values so doc / panic / doc+panic? I'm not sure - furthermore encoding/base32 does basically the same...

iwdgo commented 6 years ago

Both base32 and base64 have the same behavior with too large values for EncodingLen. This behavior is in line with RFC 4648 which states

  1. Security Considerations When base encoding and decoding is implemented, care should be taken not to introduce vulnerabilities to buffer overflow attacks, or other attacks on the implementation. A decoder should not break on invalid input including, e.g., embedded NUL characters (ASCII 0).

The EncodedLen function in discussion is used by the encoder itself directly. It means that panic would occur in the encoder too and would not be in line with the RFC. Returning an error would have a similar effect.

The current behavior allows to test if the encoder will overflow by using len(src) as documentation states.