When using certain Unicode characters, the `spongebobsay` family of functions will pad the speech bubble with the wrong amount of whitespace.

To reproduce:

Created on 2019-02-03 by the reprex package (v0.2.1)

The problem is that we are using `sprintf` under the hood here. For example, the format code `%-35s` says to print a string and then pad it with whitespace on the right to fill a fixed width of 35.

Unfortunately, the `sprintf` documentation says:

> Field widths and precisions of %s conversions are interpreted as bytes, not characters, as described in the C standard.

Which is also what is found in the POSIX standard.
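
To make the effect of a byte-counted field width concrete, here is a minimal sketch in Python (used purely for illustration; it is not the package's code). The helper `byte_counting_pad` is a hypothetical name that emulates a left-aligned `%-<width>s` conversion by padding the UTF-8 bytes rather than the characters.

```python
def byte_counting_pad(text: str, width: int) -> str:
    """Emulate a byte-counting '%-<width>s' (hypothetical helper).

    The string is padded to `width` *bytes* of UTF-8, then decoded again,
    which is how a byte-based formatter would see it.
    """
    return text.encode("utf-8").ljust(width).decode("utf-8")


# "caf\u00e9" is 4 characters but 5 bytes in UTF-8 (U+00E9 takes 2 bytes).
for word in ("cafe", "caf\u00e9"):
    padded = byte_counting_pad(word, 6)
    n_bytes = len(padded.encode("utf-8"))
    n_chars = len(padded)
    print(f"[{padded}] bytes={n_bytes} chars={n_chars}")
```

Both results occupy 6 bytes, but the accented word ends up only 5 characters wide, so the right edge of the bubble would shift by one column for that line.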
This means that any UTF-8 character that is represented by more than 1 byte will have its width incorrectly counted by `sprintf`.

We'll need an alternative way to pad strings with whitespace that counts characters rather than bytes, possibly a custom function.
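
One possible shape for such a padding function is sketched below in Python, for illustration only; the names `pad_right` and `byte_width_for` are hypothetical and not part of any existing package. The first pads by character count directly. The second computes a compensated width that could be handed to a byte-counting formatter so that the result still spans the intended number of characters.

```python
def pad_right(text: str, width: int) -> str:
    """Left-align `text` in a field of `width` characters (code points)."""
    missing = width - len(text)  # len() on a Python str counts code points
    return text + " " * max(missing, 0)


def byte_width_for(text: str, width: int, encoding: str = "utf-8") -> int:
    """Width to give a byte-counting formatter so the output spans
    `width` characters: add one for every byte beyond one per character."""
    extra_bytes = len(text.encode(encoding)) - len(text)
    return width + extra_bytes


label = "caf\u00e9"  # 4 characters, 5 bytes in UTF-8

# Character-based padding gives a field that is 6 characters wide.
assert pad_right(label, 6) == "caf\u00e9  "

# Byte-based padding with the compensated width produces the same text.
compensated = label.encode("utf-8").ljust(byte_width_for(label, 6))
assert compensated.decode("utf-8") == pad_right(label, 6)
```

This sketch assumes one character per terminal column. Wide characters (for example many CJK characters) and combining marks can still misalign the bubble, so counting display width rather than code points may be needed. In R, `nchar()` distinguishes these cases through its `type` argument (`"bytes"`, `"chars"`, `"width"`).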
(Note: for a good primer on character encodings, read Joel Spolsky's seminal article.)