benoitc / erlang-idna

Erlang IDNA lib
MIT License

erlang-idna

A pure Erlang IDNA implementation that follows RFC 5891.

Usage

The idna:encode/{1,2} and idna:decode/{1,2} functions encode and decode Internationalized Domain Names (IDNs) using the IDNA protocol.

Input can be mapped to Unicode using UTS #46 by setting the uts46 flag to true (default: false). If transitional processing from IDNA 2003 to IDNA 2008 is needed, set the transitional flag to true (default: false). If conformance to STD3 is needed, set the std3_rules flag to true (default: false).

Example:

1> idna:encode("日本語。JP", [uts46]).
"xn--wgv71a119e.xn--jp-"
2> idna:encode("日本語.JP", [uts46]).
"xn--wgv71a119e.xn--jp-"
...
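The flags described above can be combined in a single options list, and an encoded name can be passed back to idna:decode/1. The following is a minimal sketch, not part of the library: the module name idna_example and both helper functions are hypothetical. It assumes that options are read as a property list, so the bare atom uts46 (as in the example above) and the tuple {uts46, true} are treated alike. How the library reports invalid input is not described here, so the wrapper simply catches any exception.

```erlang
%% Hypothetical helper module; only idna:encode/2 and idna:decode/1
%% come from the library itself.
-module(idna_example).
-export([roundtrip/1, safe_encode/1]).

%% Encode a domain with UTS #46 mapping enabled, then decode the result.
%% Returns both forms so they can be compared by the caller.
roundtrip(Domain) ->
    Encoded = idna:encode(Domain, [uts46]),
    Decoded = idna:decode(Encoded),
    {Encoded, Decoded}.

%% Encode with UTS #46 mapping and STD3 rules enabled.
%% Assumption: the options list is a proplist, so {std3_rules, true}
%% sets the std3_rules flag to true.
%% Any exception raised by the library is turned into an error tuple.
safe_encode(Domain) ->
    Options = [uts46, {std3_rules, true}],
    try idna:encode(Domain, Options) of
        Encoded -> {ok, Encoded}
    catch
        Class:Reason -> {error, {Class, Reason}}
    end.
```

Wrapping the call this way keeps a caller from crashing on malformed input; whether that is preferable to letting the process fail depends on the application.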

Legacy support for IDNA 2003 is also available through the to_ascii and to_unicode functions:

1> Domain = "www.詹姆斯.com".
[119,119,119,46,35449,22982,26031,46,99,111,109]
2> Encoded = idna:to_ascii(Domain).
"www.xn--8ws00zhy3a.com"
3> idna:to_unicode(Encoded).
[119,119,119,46,35449,22982,26031,46,99,111,109]

Update Unicode data

wget -O test/IdnaTestV2.txt https://www.unicode.org/Public/idna/latest/IdnaTestV2.txt
wget -O uc_spec/ArabicShaping.txt https://www.unicode.org/Public/UNIDATA/ArabicShaping.txt
wget -O uc_spec/IdnaMappingTable.txt https://www.unicode.org/Public/idna/latest/IdnaMappingTable.txt
wget -O uc_spec/Scripts.txt https://www.unicode.org/Public/UNIDATA/Scripts.txt
wget -O uc_spec/UnicodeData.txt https://www.unicode.org/Public/UNIDATA/UnicodeData.txt

git clone https://github.com/kjd/idna.git
./idna/tools/idna-data make-table --version 13.0.0 > uc_spec/idna-table.txt

cd uc_spec
./gen_idnadata_mod.escript
./gen_idna_table_mod.escript
./gen_idna_mapping_mod.escript