yskumar007 / snakeyaml

Automatically exported from code.google.com/p/snakeyaml
Apache License 2.0

ScannerException should not contain the invalid character it's forbidding - makes message hard to read #209

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Have a YAML file with tabs or other forbidden chars
2. See that an actual _tab_ is in the error message, e.g. "found character 
<tab> \t(TAB) that cannot start any token"
3. Same (worse) for backspace, vertical tab, etc.

What is the expected output? What do you see instead?
I'd rather have only the escaped character ("chRepresentation") in the error 
message, not the raw character itself.
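
To make the request concrete, here is a minimal sketch of a message builder that emits only the escaped form. It is not SnakeYAML's actual code: the class `ScannerMessageSketch`, the methods `describe` and `cannotStartToken`, and the small table of names are all hypothetical and only illustrate the idea.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not taken from SnakeYAML's ScannerImpl.
final class ScannerMessageSketch {
    // Assumed subset of readable names, for illustration only.
    private static final Map<Character, String> NAMES = new HashMap<>();
    static {
        NAMES.put('\t', "\\t(TAB)");
        NAMES.put('\b', "\\b");
        NAMES.put('\f', "\\f");
        NAMES.put((char) 0x0B, "\\v");
    }

    // Returns a printable representation and never the raw control character.
    static String describe(char ch) {
        String name = NAMES.get(ch);
        if (name != null) {
            return name;
        }
        if (Character.isISOControl(ch)) {
            // Fall back to a four-digit hex escape for unnamed control chars.
            return String.format("\\u%04x", (int) ch);
        }
        return String.valueOf(ch);
    }

    // Builds the error text from the representation only.
    static String cannotStartToken(char ch) {
        return "found character '" + describe(ch) + "' that cannot start any token";
    }
}
```

With this shape, `ScannerMessageSketch.cannotStartToken('\t')` yields a message that contains the text `\t(TAB)` but no actual tab, so it stays readable in logs and terminals.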

What version of SnakeYAML are you using? On what Java version?
1.14

Here's a simple patch (to prod code and test)

Original issue reported on code.google.com by joseph.g...@gmail.com on 10 Apr 2015 at 7:52

Attachments:

GoogleCodeExporter commented 9 years ago
Can you please add tests to show how it affects 1) and 3) ?

Original comment by py4fun@gmail.com on 10 Apr 2015 at 9:37

GoogleCodeExporter commented 9 years ago
Hmm, yeah, I'm not sure I should have written "steps" for this. Whether it's a 
tab or another unwanted control character the issue is the same. As far as I 
can tell \t is only special because we print "(TAB)" next to it. So sure, I 
could add a test for another such char. While I'm at it I could also argue that 
other chars could benefit from a textual description rather than just their 
escaped form. 

Original comment by joseph.g...@gmail.com on 10 Apr 2015 at 10:34

GoogleCodeExporter commented 9 years ago
> I could also argue that other chars could benefit from a textual description 
> rather than just their escaped form.

Well, we print the char AND its textual description.
As far as I can see, the proposal is:
Improve human readability: in case of a leading TAB, do not print it in the 
error message; its textual description is enough.
Basically, you say leading TAB should not be printed but other use cases or 
characters are not affected.
It is a minor issue for me. Let us see if somebody else can say something to 
come to a conclusion.

Original comment by py4fun@gmail.com on 11 Apr 2015 at 5:12

GoogleCodeExporter commented 9 years ago
Ha, you're right. By reading the code quickly, I'd assumed that other special 
chars covered by org.yaml.snakeyaml.scanner.ScannerImpl#ESCAPE_REPLACEMENTS 
would get the same treatment (i.e. print the char itself THEN its 
representation). After adding a quick test for \b or \f, it seems those are 
caught earlier in StreamReader.
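
A rough sketch of why a tab reaches the scanner while \b and \f do not, assuming the reader stage rejects anything outside the printable set defined by the YAML specification (c-printable). The class `PrintableCheckSketch` and its methods are hypothetical and do not reproduce StreamReader's actual implementation.

```java
// Hypothetical sketch of a reader-stage check; not SnakeYAML's StreamReader.
final class PrintableCheckSketch {
    // Printable set as given by the YAML specification's c-printable production.
    static boolean isPrintable(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0x7E)
                || cp == 0x85
                || (cp >= 0xA0 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns the index of the first non-printable code point, or -1 if none.
    static int firstNonPrintable(String text) {
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (!isPrintable(cp)) {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }
}
```

Under this assumption a tab (0x9) passes the reader check and is only rejected later, when the scanner finds it where a token should start, whereas backspace (0x8) and form feed (0xC) are rejected before the scanner ever sees them.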

Original comment by joseph.g...@gmail.com on 12 Apr 2015 at 3:35