cuonglm closed 9 years ago
Another option I considered for improving performance is using LC_ALL=C. But I'm not sure about the input data, so I decided to leave that for the future.
Also note that on GNU systems with UTF-8 locales, sort -u does not report unique lines but the first from sequence of lines which sort the same:
$ printf '%b\n' '\U2460' '\U2461' | LC_ALL=en_US.utf8 sort -u
①
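For comparison, here is a minimal sketch of the LC_ALL=C variant mentioned above. In the C locale, sort compares raw bytes, so lines that merely collate the same are kept apart. The helper name unique_bytes is hypothetical and not part of this project.

```shell
# Hypothetical helper: de-duplicate stdin by byte value instead of by
# the current locale's collation order.
unique_bytes() {
  LC_ALL=C sort -u
}

# The two circled digits differ in their last UTF-8 byte, so both survive.
printf '%s\n' '①' '②' '①' | unique_bytes
# prints:
# ①
# ②
```

The trade-off is that the output order is byte order, which may not match what users of a non-C locale expect for non-ASCII input.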
Revise _short_url, using bash builtin string substitution
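As a rough illustration of what swapping sed for shell parameter expansion can look like, here is a small sketch. It is not the actual _short_url code. The function name strip_scheme and the example URL are assumptions made up for this example.

```shell
# Hypothetical example, not the project's _short_url implementation.
# Trims a URL using parameter expansion only, without spawning sed.
strip_scheme() {
  local url=$1
  url=${url#*://}   # drop the shortest leading match of "*://", e.g. "https://"
  url=${url%/}      # drop a single trailing slash, if present
  printf '%s\n' "$url"
}

strip_scheme 'https://example.com/path/'   # prints example.com/path
```

Because no external process is forked, this kind of expansion tends to be cheaper than piping through sed when it runs once per input line.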
I often use sed, mostly because it's readable (esp. when using sed -r when possible.)
Also note that on GNU systems with UTF-8 locales, sort -u does not report unique lines but the first from sequence of lines which sort the same:
Yah I know. Let's keep that simple, though ;)
Thanks a lot for your contribution, @Gnouc !
I often use sed, mostly because it's readable (esp. when using sed -r when possible.)
You should use the undocumented sed -E with GNU sed (which is equivalent to sed -r). The -E option works in BSD sed too, and is going to be standard in the next POSIX.
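A short sketch of the -E form, assuming a sed that accepts -E as described above. The function name url_host and the pattern are hypothetical and only show extended regular expression syntax (unescaped + and parentheses).

```shell
# Hypothetical helper: print the host part of a URL using sed -E.
# With -E, "+" and "( )" need no backslashes.
url_host() {
  printf '%s\n' "$1" | sed -E 's#^[a-z]+://([^/]+).*#\1#'
}

url_host 'https://example.com/path'   # prints example.com
```

The same command with -r instead of -E should behave identically on GNU sed, but -r is not accepted by every BSD sed.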
Very valuable information :)