NdheerajNagar / ez-vcard

Automatically exported from code.google.com/p/ez-vcard

Quoted Umlaut 'ß' (Unicode U+00DF) in wrong encoding #12

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?

1. Parse this String via Ezvcard.parse().first()

BEGIN:VCARD
FN;quoted-printable:Max Ma=DF
TEL;voice:+49123123
END:VCARD

2. Access FormattedName via VCard-Api 
vCard.getFormattedName().getValue()

What is the expected output?
Max Maß

What is the actual output?
Max Ma�

What version of ez-vcard are you using?
0.9.1

What version of Java are you using?
1.7.0_17

Original issue reported on code.google.com by f.gaff...@googlemail.com on 7 Jan 2014 at 12:31

GoogleCodeExporter commented 9 years ago
Also with Java 1.7.0_45

Original comment by f.gaff...@googlemail.com on 7 Jan 2014 at 2:24

GoogleCodeExporter commented 9 years ago
Adding a CHARSET parameter to the FN property should solve the issue:

BEGIN:VCARD
FN;quoted-printable;CHARSET=ISO-8859-1:Max Ma=DF
TEL;voice:+49123123
END:VCARD

This parameter tells the vCard parser what character set the quoted-printable 
value is in.
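
To illustrate why the charset matters, here is a minimal sketch using only the Java standard library. It is not ez-vcard's actual decoder, and the class and method names are made up for illustration. It turns the `=DF` escape into the raw byte 0xDF and decodes the result with a given charset. In ISO-8859-1 that byte is 'ß'; in UTF-8 a lone 0xDF is a malformed sequence, so Java substitutes the replacement character U+FFFD.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of ez-vcard.
class QuotedPrintableCharsetDemo {

    // Converts "=XX" escapes into raw bytes, then decodes the bytes with the given charset.
    // Simplified: assumes well-formed, ASCII-only input and ignores soft line breaks.
    static String decode(String encoded, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < encoded.length(); i++) {
            char c = encoded.charAt(i);
            if (c == '=' && i + 2 < encoded.length()) {
                // The two characters after '=' are the hex value of one byte.
                bytes.write(Integer.parseInt(encoded.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                bytes.write(c);
            }
        }
        return new String(bytes.toByteArray(), charset);
    }

    public static void main(String[] args) {
        String latin1 = decode("Max Ma=DF", StandardCharsets.ISO_8859_1);
        String utf8 = decode("Max Ma=DF", StandardCharsets.UTF_8);
        // Compare against escaped literals so the result does not depend on the console encoding.
        System.out.println(latin1.equals("Max Ma\u00DF"));
        System.out.println(utf8.equals("Max Ma\uFFFD"));
    }
}
```

The same bytes give different text depending on the charset, which is why the parser needs either a CHARSET parameter or a sensible default.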

Original comment by mike.angstadt on 7 Jan 2014 at 9:11

GoogleCodeExporter commented 9 years ago
Thanks a lot, Mike, for the fast response. It works!

I work with an API that doesn't set the character encoding.

It might be a nice feature to be able to set the encoding manually in the 
Ezvcard.parse() method.

What is the default encoding if no CHARSET parameter has been set, as in 
this case? ASCII?

Original comment by f.gaff...@googlemail.com on 8 Jan 2014 at 8:18

GoogleCodeExporter commented 9 years ago
The parser will first attempt to find the character encoding of the Reader 
object.  If the Reader object doesn't have one (for example, if you are reading 
from a String object), then it will use your system's default character encoding.
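
As a rough sketch of that lookup order (an assumption about the general approach, not ez-vcard's actual code; the class and method names are hypothetical): an `InputStreamReader` can report its encoding through `getEncoding()`, while other readers such as `StringReader` carry no encoding, so the JVM default from `Charset.defaultCharset()` is used instead.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of ez-vcard.
class ReaderCharsetDemo {

    // Returns the reader's own encoding if it exposes one, else the system default.
    static Charset charsetOf(Reader reader) {
        if (reader instanceof InputStreamReader) {
            // getEncoding() returns a historical name (or null if the reader is closed);
            // Charset.forName() accepts those names as aliases.
            String name = ((InputStreamReader) reader).getEncoding();
            if (name != null) {
                return Charset.forName(name);
            }
        }
        return Charset.defaultCharset();
    }

    public static void main(String[] args) {
        Reader fromStream = new InputStreamReader(
                new ByteArrayInputStream(new byte[0]), StandardCharsets.ISO_8859_1);
        Reader fromString = new StringReader("BEGIN:VCARD");
        System.out.println(charsetOf(fromStream)); // encoding given to the InputStreamReader
        System.out.println(charsetOf(fromString)); // system default, varies by machine
    }
}
```

This also explains the original report: parsing from a String leaves only the system default, which may not match the bytes in the quoted-printable value.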

I'll work on a way to set the encoding manually.

Original comment by mike.angstadt on 8 Jan 2014 at 1:26

GoogleCodeExporter commented 9 years ago
The default encoding can be set (REST, UTF-8), but the implementation does not 
seem to be very consistent. So the fault lies with the API provider...

But I think that would be a nice feature for the future. Thanks a lot for your 
work on this very nice Java lib!

Original comment by f.gaff...@googlemail.com on 8 Jan 2014 at 3:27

GoogleCodeExporter commented 9 years ago
Fix completed.  You can now call a method to set a default character set for 
decoding quoted-printable properties that do not have a CHARSET parameter.

Charset charset = Charset.forName("ISO-8859-1");
VCard vcard = Ezvcard.parse(...).defaultQuotedPrintableEncoding(charset).first();
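
The selection rule described above can be sketched roughly as follows. This is only an illustration of the idea, not the library's implementation, and `resolve` is a hypothetical name: use the property's CHARSET parameter when it is present and recognized, otherwise fall back to the configured default.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of ez-vcard.
class CharsetFallbackDemo {

    // charsetParam is the value of the property's CHARSET parameter, or null if absent.
    static Charset resolve(String charsetParam, Charset defaultCharset) {
        if (charsetParam != null) {
            try {
                return Charset.forName(charsetParam);
            } catch (IllegalArgumentException e) {
                // Illegal or unsupported charset name: fall through to the default.
            }
        }
        return defaultCharset;
    }

    public static void main(String[] args) {
        Charset fallback = StandardCharsets.ISO_8859_1;
        System.out.println(resolve("UTF-8", fallback)); // CHARSET parameter wins
        System.out.println(resolve(null, fallback));    // no parameter: default is used
    }
}
```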

Original comment by mike.angstadt on 16 Jan 2014 at 4:03