Closed gagabu closed 3 years ago
That's a good point. We correctly compute the message length in bytes, but then take a character-based substring.
I think we need to change:
message = message.Substring(0, MaxLogEventSize);
to something like this:
var bytes = Encoding.UTF8.GetBytes(message).Take(MaxLogEventSize).ToArray();
message = Encoding.UTF8.GetString(bytes, 0, bytes.Length);
Would you mind trying this out, and if it works submitting a pull request?
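One caveat with cutting the byte array at a fixed position: the cut can land in the middle of a multi-byte UTF-8 sequence, and the decoder then substitutes U+FFFD for the broken tail. Below is a minimal sketch of one way to avoid that, by backing the cut up to a character boundary. `Utf8ByteTruncation` and `Truncate` are hypothetical names for illustration, not part of the sink.

```csharp
using System;
using System.Text;

public static class Utf8ByteTruncation
{
    // Hypothetical helper: returns `message` cut so that its UTF-8 encoding
    // is at most `maxBytes` bytes, without splitting a character.
    public static string Truncate(string message, int maxBytes)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(message);
        if (bytes.Length <= maxBytes)
        {
            return message; // already fits
        }

        int end = Math.Max(maxBytes, 0);

        // UTF-8 continuation bytes have the form 10xxxxxx. If the first
        // dropped byte is a continuation byte, the cut is inside a sequence,
        // so move back until the cut sits in front of a lead byte.
        while (end > 0 && (bytes[end] & 0xC0) == 0x80)
        {
            end--;
        }

        return Encoding.UTF8.GetString(bytes, 0, end);
    }
}
```

For example, `Utf8ByteTruncation.Truncate("паляниця", 5)` keeps only the first two characters (4 bytes), because the fifth byte is the start of a character that would not fit.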
I've opened a pull request: https://github.com/Cimpress-MCP/serilog-sinks-awscloudwatch/pull/101
Sometimes I get an exception in the console output:
This is the code that throws the exception:
If the message contains non-ASCII characters, Encoding.UTF8.GetByteCount(message) can be greater than message.Length (for example, "паляниця" is 8 chars long but takes 16 bytes in UTF-8). The second parameter of
message.Substring
is a length in chars, not bytes, so when the byte count exceeds MaxLogEventSize but the char count does not, the call throws and we lose the whole batch. Test code to show it:
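A minimal reproduction of the mismatch could look like the sketch below. The class name, the `Throws` method, and the deliberately small `MaxLogEventSize` value are assumptions made for illustration; they are not taken from the sink's source.

```csharp
using System;
using System.Text;

public static class SubstringRepro
{
    // Hypothetical small limit so the effect shows up with a short string.
    public const int MaxLogEventSize = 10;

    // Mirrors the pattern under discussion: the size check is done in bytes,
    // but the truncation is done in chars.
    public static bool Throws(string message)
    {
        if (Encoding.UTF8.GetByteCount(message) > MaxLogEventSize)
        {
            try
            {
                // Throws when message.Length < MaxLogEventSize.
                _ = message.Substring(0, MaxLogEventSize);
            }
            catch (ArgumentOutOfRangeException)
            {
                return true;
            }
        }
        return false;
    }
}
```

With this limit, "паляниця" passes the byte check (16 bytes) but has only 8 chars, so `Substring(0, 10)` is out of range. An ASCII string of the same byte size would be truncated without error.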
I haven't found a cheap, memory-friendly way to truncate a string by byte count rather than char count. Maybe
message.Substring(0, MaxLogEventSize/2);
would work here, or we could split the message into two separate parts if we don't want to lose something important?
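One possible low-allocation approach is to walk the string once, add up the UTF-8 size of each character, and stop before the budget is exceeded. This is a sketch only: `Utf8Budget`, `TruncateToByteLimit`, and `Utf8Size` are hypothetical names. Note that dividing the limit by 2 is not a safe bound in general, because a single char outside the two-byte range (many CJK characters, for instance) takes 3 bytes in UTF-8.

```csharp
using System;

public static class Utf8Budget
{
    // Hypothetical helper: longest prefix of `message` whose UTF-8 encoding
    // fits in `maxBytes`, computed without building a byte array.
    public static string TruncateToByteLimit(string message, int maxBytes)
    {
        int byteCount = 0;
        int i = 0;
        while (i < message.Length)
        {
            // A surrogate pair is two chars that encode to 4 bytes together.
            bool pair = char.IsSurrogatePair(message, i);
            int size = pair ? 4 : Utf8Size(message[i]);
            if (byteCount + size > maxBytes)
            {
                break; // the next character would not fit
            }
            byteCount += size;
            i += pair ? 2 : 1;
        }
        return i == message.Length ? message : message.Substring(0, i);
    }

    // UTF-8 size of a single UTF-16 code unit. A lone surrogate is counted
    // as 3 bytes, the size of the U+FFFD replacement the encoder emits.
    private static int Utf8Size(char c)
    {
        if (c < 0x80) return 1;
        if (c < 0x800) return 2;
        return 3;
    }
}
```

The only allocation is the final `Substring`, and it is skipped when the message already fits. The same loop could be extended to emit the remainder as a second message if dropping the tail is not acceptable.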