rs / zerolog

Zero Allocation JSON Logger
MIT License
10.39k stars 566 forks source link

Add base64 helper methods #613

Open tulir opened 10 months ago

tulir commented 10 months ago

I often find myself wanting to output bytes as base64 instead of hex because base64 is more compact, and the base64 cli command is easier to use than xxd. Having to write .Str("foo", base64.StdEncoding.EncodeToString(data)) isn't very nice though and it's not as efficient as encoding directly into the buffer.

This PR adds .Base64 and .Base64Custom helper methods. The former just takes bytes and appends with standard padded base64, while the latter additionally takes a base64.Encoding instance to allow all the other kinds (urlsafe and/or unpadded base64). I'm not completely sure if the Custom method is a good idea, so I can change that if needed. It might make sense to have an url-safe method instead, or just not have any other options than standard.

tulir commented 10 months ago

Just realized there's the whole CBOR part of zerolog and it's behind a build flag. I tried adding support to that side, but not sure if it's correct.

Apparently CBOR only has tags for standard padded base64 and unpadded url-safe base64, so maybe I should also drop the Custom method and just have standard and raw url methods

rs commented 10 months ago

I have an alternative proposal. As base64 does not make sense in cbor and we currently have a Bytes method with a default encoding to hex, what about not exposing any new method but have a global parameter for the logger allowing the selection of the byte array encoding for JSON? The default would remain hex for backward compatibility but could be set to base64 or base64url.

I don't really see a need for multiple types of encoding in structured logs in the same app and this would avoid cluttering the API.

tulir commented 10 months ago

Hmm yeah that makes sense, will add that instead

tulir commented 10 months ago

So I did that (#615), but while writing the test I realized that the default implementation actually checks that the input is valid unicode and replaces invalid runes with \ufffd, rather than just hex-encoding any bytes outside the printable ASCII range. Everything works just fine, it just feels a bit weird to add a config option that changes the behavior so much (it's not just a hex -> base64 switch, because the default does more than just hex). There's also .Hex() which is just plain hex encoding for all bytes, but configuring that would make even less sense because it's called Hex and not EncodedBytes or something

rs commented 10 months ago

I can't remember doing that and it is certainly not expected. We may just go back to a basic hex encoding.

tulir commented 10 months ago

Seems like .Bytes() is meant to mirror .Str() with a []byte parameter, for that purpose it makes sense that it only allows valid unicode. Maybe the best option would be to have a new .Base64() method (like .Hex()), but remove Base64Custom and instead have a global variable for specifying the encoding? Or something like .EncodedBytes() with a fully configurable marshaling function

rs commented 10 months ago

It feels like a bad and confusing choice I did originally. I wonder if anyone is actually using this purposefully as I can't think about a use-case.

tulir commented 10 months ago

I just noticed that I'm using it for stack traces in panic handlers like .Bytes(zerolog.ErrorStackFieldName, debug.Stack()). debug.Stack() returns a byte array containing text, so it's a convenient way to log the stack trace as text without string() duplicating the memory first. (I previously searched for Bytes(", which didn't find anything, so I thought I wasn't using that function at all)

Anyway, I don't have strong opinions on how the API should be designed (and I'm not too attached to using .Bytes() for stack traces, panic handling is not a hot path 😹). The .Bytes() config in #615 is functional, or I can update this PR with a global option and remove .Base64Custom (or do something else entirely), let me know which you prefer.

rs commented 10 months ago

I still prefer the parameter to setup the behavior or Bytes with a default value to HexString and other values like Hex (perhaps not useful) and Base64