A pure Erlang IDNA implementation that follows RFC 5891.
The idna:encode/{1,2} and idna:decode/{1,2} functions encode or decode an Internationalized Domain Name using the IDNA protocol.
Input can be mapped to Unicode using UTS #46 by setting the uts46 flag to true (default is false). If a transition from IDNA 2003 to IDNA 2008 is needed, the transitional flag can be set to true (default is false). If conformance to STD3 is needed, the std3_rules flag can be set to true (default is false).
Example:
1> idna:encode("日本語。JP", [uts46]).
"xn--wgv71a119e.xn--jp-"
2> idna:encode("日本語.JP", [uts46]).
"xn--wgv71a119e.xn--jp-"
...
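The flags described above can be combined in one option list. The following is a minimal sketch rather than part of the library: the module name idna_flags_example, the profile atoms, and the roundtrip/2 helper are hypothetical, and it assumes the options are accepted in the {Flag, true} tuple form as well as the bare-atom form ([uts46]) used in the example above. It also assumes the transitional flag is only meaningful together with uts46.

```erlang
-module(idna_flags_example).
-export([options/1, encode/2, roundtrip/2]).

%% Build an option list for a named profile.
%% The profile names are hypothetical; only the flag names
%% (uts46, transitional, std3_rules) come from the text above.
options(default)      -> [];
options(mapped)       -> [{uts46, true}];
options(transitional) -> [{uts46, true}, {transitional, true}];
options(strict)       -> [{uts46, true}, {std3_rules, true}].

%% Encode a domain using one of the profiles above.
%% Requires the idna application to be on the code path.
encode(Domain, Profile) ->
    idna:encode(Domain, options(Profile)).

%% Encode a domain, then decode the result again, returning both values
%% so the caller can compare them with the original input.
roundtrip(Domain, Profile) ->
    Encoded = encode(Domain, Profile),
    Decoded = idna:decode(Encoded),
    {Encoded, Decoded}.
```

For instance, idna_flags_example:encode("日本語。JP", mapped) is intended to be equivalent to the idna:encode("日本語。JP", [uts46]) call shown above, under the assumption about the tuple form.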
Legacy support for IDNA 2003 is also available through the to_ascii and to_unicode functions:
1> Domain = "www.詹姆斯.com".
[119,119,119,46,35449,22982,26031,46,99,111,109]
2> Encoded = idna:to_ascii("www.詹姆斯.com").
"www.xn--8ws00zhy3a.com"
3> idna:to_unicode(Encoded).
[119,119,119,46,35449,22982,26031,46,99,111,109]
Update Unicode data
wget -O test/IdnaTestV2.txt https://www.unicode.org/Public/idna/latest/IdnaTestV2.txt
wget -O uc_spec/ArabicShaping.txt https://www.unicode.org/Public/UNIDATA/ArabicShaping.txt
wget -O uc_spec/IdnaMappingTable.txt https://www.unicode.org/Public/idna/latest/IdnaMappingTable.txt
wget -O uc_spec/Scripts.txt https://www.unicode.org/Public/UNIDATA/Scripts.txt
wget -O uc_spec/UnicodeData.txt https://www.unicode.org/Public/UNIDATA/UnicodeData.txt

git clone https://github.com/kjd/idna.git
./idna/tools/idna-data make-table --version 13.0.0 > uc_spec/idna-table.txt

cd uc_spec
./gen_idnadata_mod.escript
./gen_idna_table_mod.escript
./gen_idna_mapping_mod.escript