Charset is not considered when reading QUOTED-PRINTABLE

GoogleCodeExporter commented 9 years ago

1. Try to read this vCard:

BEGIN:VCARD
VERSION:2.1
N;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:Р=9FРµС=80РµС=80РІР°;Р=9B�
�=8EР±РѕРІС=8C;Р=94РјРёС=82С=80РёРµРІРЅР°;;
FN;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:Р=9BС=8EР±РѕРІС=8C 
Р=9FРµС=80РµС=80РІР°
TEL;PREF;CELL;VOICE:+380975492937
TEL;HOME:+380669313512
X-IRMC-LUID:1
END:VCARD

2. Read the name like this:
vcard.getFormattedName().getValue()

Expected output:
A Cyrillic string decoded as UTF-8:
Перерва;Любовь;Дмитриевна;;

Actual result:
charset-unaware string: ?�?�????� ?�??�??�??

Please fix this and make it possible to read vCard data in different charsets.

Original issue reported on code.google.com by andronix83 on 4 Mar 2013 at 10:05

GoogleCodeExporter commented 9 years ago

Original comment by mike.angstadt on 5 Mar 2013 at 11:58

Changed state: Accepted

GoogleCodeExporter commented 9 years ago

Thanks for the report. It looks like the values of the N and FN properties in
your example are not correctly encoded in the quoted-printable format.
Quoted-printable requires every character of the encoded value to be an ASCII
character, with each non-ASCII byte written as "=" followed by two hex digits,
but your property values contain raw non-ASCII characters. The correct encoding
of "Перерва;Любовь;Дмитриевна;;" would be:

=D0=9F=D0=B5=D1=80=D0=B5=D1=80=D0=B2=D0=B0;=D0=9B=D1=8E=D0=B1=D0=BE=D0=B2=D1=8C;
=D0=94=D0=BC=D0=B8=D1=82=D1=80=D0=B8=D0=B5=D0=B2=D0=BD=D0=B0;;
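
For illustration, here is a minimal sketch of how the charset and the
quoted-printable layer fit together: the text is first turned into bytes using
the declared charset (UTF-8 here), and only then is each non-ASCII byte written
as "=XX". The class and method names are hypothetical and this is not
ez-vcard's actual implementation. It also skips soft line breaks, so it only
handles a value that sits on a single line.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of ez-vcard.
class QuotedPrintableSketch {

    // Converts the text to bytes in the given charset, then writes every byte
    // outside printable ASCII (and '=' itself) as "=XX". Spaces are escaped
    // as "=20" too, which sidesteps the trailing-whitespace rule.
    static String encode(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            int v = b & 0xFF;
            if (v >= 33 && v <= 126 && v != '=') {
                sb.append((char) v);
            } else {
                sb.append('=').append(String.format("%02X", v));
            }
        }
        return sb.toString();
    }

    // Reverses encode(): collects the raw bytes first, and applies the
    // charset only once all bytes are known. Raw non-ASCII input is rejected,
    // because it is not valid quoted-printable.
    static String decode(String encoded, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < encoded.length(); i++) {
            char c = encoded.charAt(i);
            if (c == '=') {
                if (i + 2 >= encoded.length()) {
                    throw new IllegalArgumentException("truncated escape at index " + i);
                }
                out.write(Integer.parseInt(encoded.substring(i + 1, i + 3), 16));
                i += 2;
            } else if (c > 126) {
                throw new IllegalArgumentException("non-ASCII character at index " + i);
            } else {
                out.write(c);
            }
        }
        return new String(out.toByteArray(), charset);
    }

    public static void main(String[] args) {
        String encoded =
            "=D0=9F=D0=B5=D1=80=D0=B5=D1=80=D0=B2=D0=B0;"
            + "=D0=9B=D1=8E=D0=B1=D0=BE=D0=B2=D1=8C;"
            + "=D0=94=D0=BC=D0=B8=D1=82=D1=80=D0=B8=D0=B5=D0=B2=D0=BD=D0=B0;;";
        String text = decode(encoded, StandardCharsets.UTF_8);
        System.out.println(text);
        // Re-encoding the decoded text should reproduce the original value.
        System.out.println(encode(text, StandardCharsets.UTF_8).equals(encoded));
    }
}
```

Decoding the same bytes with a different charset (for example ISO-8859-1)
would produce different characters, which is why the CHARSET parameter has to
be applied after the "=XX" sequences are turned back into bytes.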

Thanks,
Mike

Original comment by mike.angstadt on 9 Mar 2013 at 12:45

Changed state: Invalid

srikanthtalasila / ez-vcard
