Since 2004, decode_entities() has supported merging surrogate pairs; see http://rt.cpan.org/Ticket/Display.html?id=7785 . This means that, for example, two consecutive numeric character references encoding a high and a low surrogate are decoded into a single code point. My understanding is that this behavior is not covered by any spec.
I therefore propose adding a function decode_entities_strict() that does the same as decode_entities() but rejects surrogate pairs.
Attached is a sample script that shows the effect: surrogate_pair.pl.txt
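For reference, the merging described in this ticket corresponds to the standard UTF-16 surrogate arithmetic. The following is a minimal, self-contained sketch of that arithmetic, plus one possible way a strict variant could detect surrogate references. It is not the attached script and not the module's implementation; the helper names merge_surrogates() and has_surrogate_reference() are hypothetical and exist only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: combine a high and a low surrogate into a single
# code point using the UTF-16 formula.
sub merge_surrogates {
    my ($hi, $lo) = @_;
    return 0x10000 + (($hi - 0xD800) << 10) + ($lo - 0xDC00);
}

# Hypothetical helper: return 1 if the text contains a numeric character
# reference in the surrogate range U+D800..U+DFFF, else 0.
# Assumption: the trailing semicolon is treated as optional.
sub has_surrogate_reference {
    my ($text) = @_;
    while ($text =~ /&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));?/g) {
        my $num = defined $1 ? hex($1) : $2;
        return 1 if $num >= 0xD800 && $num <= 0xDFFF;
    }
    return 0;
}

printf "U+%X\n", merge_surrogates(0xD83D, 0xDE00);
print has_surrogate_reference('&#xD83D;&#xDE00;') ? "surrogate\n" : "ok\n";
print has_surrogate_reference('&#x1F600;')        ? "surrogate\n" : "ok\n";
```

This prints U+1F600, then "surrogate", then "ok". A decode_entities_strict() along the lines proposed here could perform a check of this kind before decoding; whether it should die, return undef, or leave such references untouched is an open design question.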