Closed mcschroeder closed 5 years ago
This is a good catch, thank you. Did your last name play into its discovery by chance? 😄
Clearing the sticky line in some other way does sound really nice, but I also don't know about portability. I'd rather not add a dependency on utf8-string
, but hand-rolling a function that does the same thing should be fairly easy using bitmasking, but it would introduce some performance overhead.
Are you interested in taking a stab at implementing a solution to this?
I was actually trying to add a fancy progress bar to my sticky log line, but when I discovered the problems I was having were due to character encoding I really decided to dig my teeth in... I guess I am particularly sensitive to issues of mishandling non-ASCII characters… 😉
I’ll take a stab at a solution, sure.
Take the following example:
At the end of the run, you would expect your terminal to look like this:
Yet what actually happens on an 80-width terminal is this:
If you change the umlaut
ö
in thelogSticky
to ano
everything works as expected. Likewise, if you replicate theö
a number of times that is less than or equal to half the width of your terminal.The reason for this weirdness are the functions to clear the sticky line used in
stickyImpl
in RIO.Prelude.Logger:The variable
sticky
here is aByteString
that holds the current sticky line. By usingB.length
to determine how many backspaces are needed to clear it, we overcount if there are non-ASCII characters in the string that are encoded with multiple bytes (likeö
under UTF-8 which encodes to\195\182
).I'm not sure what the best fix is here:
We could calculate the number of characters (as opposed to the number of bytes) using something like
Data.ByteString.UTF8.length
from theutf8-string
package. This is a potentially expensive operation I'm guessing, but if we store the count together with the sticky line, we would only need to do it once when the line is logged (as opposed to every time we need to clear it). Also, the sticky line is going to be maximum as long as the terminal, which is probably <100 characters in most cases.We could clear the sticky line in some other way, e.g. using the functions from
ansi-terminal
. I don't know how portable those are, but this might be the cleanest solution.We could leave it as-is and add a warning to the documentation of
logSticky
to caution against using it with non-ASCII strings.