http://code.woong.org/wcwidth.js

wcwidth.js: a javascript porting of C's wcwidth()

Stick to version 1.1.2 if you need to support node.js <16.15.1.

wcwidth.js is a simple JavaScript port of the wcwidth() implementation written in C by Markus Kuhn.

wcwidth() and its string version, wcswidth(), are defined by IEEE Std 1003.1-2001, a.k.a. POSIX.1-2001, and return the number of columns needed to represent a wide character and a wide-character string, respectively, on fixed-width output devices like terminals. Markus's implementation assumes wide characters are encoded in ISO 10646, which is almost true for JavaScript; almost, because JavaScript strings are sequences of 16-bit code units, so characters outside the Basic Multilingual Plane appear as surrogate pairs. wcwidth.js converts surrogate pairs to Unicode code points to handle them correctly.
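The surrogate-pair conversion described above can be sketched as follows. This is only an illustration of the general technique, not the library's actual code; `toCodePoints` is a hypothetical helper name, and leaving unpaired surrogates untouched is an assumption made for the sketch.

```javascript
// Hypothetical helper, not part of wcwidth.js: decode 16-bit code units
// into Unicode code points, leaving unpaired surrogates as they are.
function toCodePoints(str) {
  const points = [];
  for (let i = 0; i < str.length; i++) {
    const hi = str.charCodeAt(i);
    // NaN fails both range checks below, so the last unit never pairs.
    const lo = i + 1 < str.length ? str.charCodeAt(i + 1) : NaN;
    const isHigh = hi >= 0xd800 && hi <= 0xdbff;
    const isLow = lo >= 0xdc00 && lo <= 0xdfff;
    if (isHigh && isLow) {
      // Combine the pair into one supplementary-plane code point.
      points.push(((hi - 0xd800) << 10) + (lo - 0xdc00) + 0x10000);
      i++; // skip the low surrogate that was just consumed
    } else {
      points.push(hi);
    }
  }
  return points;
}

toCodePoints('\ud83d\ude00'); // one code point, U+1F600
toCodePoints('a\ud83d');      // two entries: U+0061 and the lone surrogate
```

Once a string is reduced to code points this way, each one can be looked up in a width table independently of how it was encoded.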

Following the original implementation, this library defines the column width of an ISO 10646 character as follows:

A lone high or low surrogate that does not form a pair is considered to have a column width of 1, matching the behavior of widespread terminals.

See the documentation from the C implementation for details.

wcwidth.js is simple to use:

import wcwidth from 'wcwidth.js';

wcwidth('한글'); // 4
wcwidth('\0'); // 0; NUL
wcwidth('\t'); // 0; control characters

Use wcwidth.js@1.1.2 for CommonJS modules.
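One typical reason to measure column width is aligning text in a terminal, where padding by string length goes wrong for wide characters. The sketch below pads a string to a given number of columns; `padEndColumns` is a hypothetical helper, not part of wcwidth.js, and it takes the width function as a parameter so that either the default `wcwidth` or a configured variant can be passed in.

```javascript
// Hypothetical helper, not part of wcwidth.js. `widthOf` is any function
// that maps a string to its column width, such as the default wcwidth export.
function padEndColumns(text, columns, widthOf) {
  const used = widthOf(text);
  // Never pad by a negative amount when the text is already too wide.
  const missing = Math.max(0, columns - used);
  return text + ' '.repeat(missing);
}

// Usage, assuming `wcwidth` has been imported from 'wcwidth.js':
//   padEndColumns('한글', 6, wcwidth) + '|';
//   padEndColumns('abcd', 6, wcwidth) + '|';
```

Since both '한글' and 'abcd' occupy 4 columns, both calls should append two spaces, so the trailing bars line up even though the strings differ in length.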

If you plan to replace NUL or control characters with, say, ??? before printing, use wcwidth.config(), which returns a closure that runs wcwidth with your configuration:

const mywidth = wcwidth.config({
  nul: 3,
  control: 3,
});

mywidth('\0\f'); // 6
mywidth('한\t글'); // 7

Setting either of these options to -1 gives a function that returns -1 for any string containing the corresponding kind of character (NUL or a control character):

const mywidth = wcwidth.config({
  nul: 0,
  control: -1,
});

mywidth('java\0script'); // 10
mywidth('java\tscript'); // -1

This is useful for detecting whether a string contains non-printable characters.
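As a rough sketch of that detection use case, the helper below filters out the strings for which a strict width function reports -1. `findUnprintable` is a hypothetical name, and `strictWidth` is assumed to be a function created with `wcwidth.config({ nul: -1, control: -1 })`.

```javascript
// Hypothetical helper, not part of wcwidth.js. `strictWidth` is assumed to
// return -1 for any string containing NUL or a control character.
function findUnprintable(lines, strictWidth) {
  return lines.filter((line) => strictWidth(line) === -1);
}

// Usage, assuming `wcwidth` has been imported from 'wcwidth.js':
//   const strictWidth = wcwidth.config({ nul: -1, control: -1 });
//   findUnprintable(['java', 'java\tscript'], strictWidth);
```

Passing the configured function in as an argument keeps the helper independent of how the library is loaded.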

When necessary, you can add to String.prototype a wcwidth getter as follows:

Object.defineProperty(
  String.prototype,
  'wcwidth',
  {
    get() {
      return wcwidth(this.valueOf());
    },
  },
);
'한글'.wcwidth; // 4

JavaScript has no character type, so it would be meaningless to provide two versions of wcwidth as POSIX does for C. Instead, wcwidth also accepts a code value obtained by charCodeAt():

wcwidth('한'); // 2
wcwidth('글'.charCodeAt(0)); // 2

INSTALL.md explains how to build and install the library. For copyright and licensing terms, see the accompanying LICENSE.md file.

If you have a question or suggestion, do not hesitate to contact me via email (woong.jun at gmail.com) or web.