Please check that the test meets your expectations.
http://code.google.com/p/snakeyaml/source/browse/src/test/java/org/yaml/snakeyaml/issues/issue155/BinaryTest.java
Original comment by py4fun@gmail.com
on 1 Sep 2012 at 7:47
A few questions are on the way...
1) According to http://en.wikipedia.org/wiki/ISO/IEC_8859-1, 0x99 is unused
and has no visual representation (see the attachment). I could not find a way
to detect such a character in the Java API (java.lang.Character). If such a
character can be detected in a String, it can be dumped as binary data.
2) I am afraid that even if we can somehow detect a 'strange' character and
dump the String as binary, the problem will still not be solved. The String
will become binary (!!binary) and will be read back as byte[] (which is not
what you want). A YAML document has no way to indicate the encoding of the
binary data, which makes it impossible for SnakeYAML to create characters out
of the bytes.
3) Feel free to take the source and experiment with the tests.
You can change SafeRepresenter.BINARY_PATTERN to make the test for issue 155
green.
This is an example:
public static Pattern BINARY_PATTERN =
    Pattern.compile("[\\x00-\\x08\\x0B\\x0C\\x0E-\\x1F\u0085\u00A0-\uD7FF\uE000-\uFFFD]");
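For point 1, here is a minimal sketch of one way to detect such a character using only the JDK. The class and method names are hypothetical and are not part of SnakeYAML, and the definition of 'strange' used here (anything outside the YAML 1.1 printable set) is an assumption rather than what the library does. Character.isISOControl does report the C1 range that contains U+0099, but it also reports tab, line feed, carriage return and U+0085, which YAML accepts, so the sketch uses explicit ranges instead.

```java
// Hypothetical helper, not SnakeYAML API. Assumption: a character is
// "strange" when it lies outside the YAML 1.1 printable character set.
final class PrintableCheck {
    private PrintableCheck() {}

    /** Returns true if the code point is in the YAML 1.1 printable set. */
    static boolean isYamlPrintable(int cp) {
        return cp == 0x09 || cp == 0x0A || cp == 0x0D   // tab, LF, CR
                || (cp >= 0x20 && cp <= 0x7E)           // printable ASCII
                || cp == 0x85                           // NEL
                || (cp >= 0xA0 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the index of the first non-printable character, or -1 if there is none. */
    static int firstNonPrintable(String s) {
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!isYamlPrintable(cp)) {
                return i;
            }
            i += Character.charCount(cp); // step over surrogate pairs as one unit
        }
        return -1;
    }
}
```

With this sketch, PrintableCheck.firstNonPrintable("caf\u0099") returns 3, while ordinary text ending in a newline returns -1.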
Any feedback is welcome.
Original comment by py4fun@gmail.com
on 2 Sep 2012 at 8:33
Attachments:
Hi, sorry I didn't get back sooner, I was away all weekend.
The test looks right, yes.
For my use case I think binary would work. Betamax 'records' HTTP traffic the
first time a request is made & 'plays it back' subsequent times; sending the
original byte[] would be the right thing to do as it would guarantee the
'recorded' response is the same as the real one. That's not to say it's the
right thing to do in the general case, though.
Original comment by robert.w...@gmail.com
on 3 Sep 2012 at 8:16
How are you going to detect whether the parsed data is a String or a byte[]?
Do you mean that you want to check the class of the returned object at runtime?
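Such a runtime check could look roughly like the sketch below. It assumes, as described earlier in the thread, that a scalar tagged !!binary is loaded as byte[] and a plain scalar as String. The class and method names are hypothetical, and the value is passed in directly rather than loaded through SnakeYAML.

```java
// Hypothetical helper: inspects the class of a value returned by a YAML load.
final class LoadedBody {
    private LoadedBody() {}

    static String describe(Object loaded) {
        if (loaded instanceof byte[]) {
            // Assumed to come from a scalar tagged !!binary
            return "binary, " + ((byte[]) loaded).length + " bytes";
        }
        if (loaded instanceof String) {
            return "text, " + ((String) loaded).length() + " chars";
        }
        return "other: " + (loaded == null ? "null" : loaded.getClass().getName());
    }
}
```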
Please be aware that the binary representation will be completely unreadable
and uneditable by humans. Why do you need YAML then?
In general, YAML is not supposed to transfer binary data. The very same byte
sequence for one person means binary abracadabra, while for another person it
may mean a beautiful German or Russian character.
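To illustrate the ambiguity, here is a small JDK-only sketch. The class name and the sample payload are made up for this example; nothing here is SnakeYAML code. The same two bytes, as they would come out of a base64-encoded !!binary scalar, decode to different text depending on which charset the reader assumes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical demo class: one byte sequence, two possible readings.
final class CharsetAmbiguity {
    private CharsetAmbiguity() {}

    /** Decodes a base64 payload into raw bytes, which carry no charset information. */
    static byte[] payload(String base64) {
        return Base64.getDecoder().decode(base64);
    }

    static String asUtf8(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    static String asLatin1(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }
}
```

For the payload "w6Q=" (bytes 0xC3 0xA4), asUtf8 yields the single character U+00E4 (a with diaeresis), while asLatin1 yields the two characters U+00C3 and U+00A4.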
As I already said, you can change the source to make the test work (and build
the version that works for you). But it breaks expectations of other users (as
you can clearly see when you run all the tests).
If we cannot find a general solution for everyone, then I would rather keep
things as they are, because it is more consistent.
And of course, SnakeYAML must always eat its own food. There must be no
exception when SnakeYAML loads its own output. We still have to find a solution
for it.
Original comment by py4fun@gmail.com
on 3 Sep 2012 at 8:48
The library already supports binary HTTP response data so if invalid strings
were binary encoded it should work OK. You're right that would mean they aren't
user editable but it's quite an edge case to get into this situation in the
first place.
Original comment by robert.w...@gmail.com
on 3 Sep 2012 at 4:12
The issue should be fixed now. Try the latest source or the SNAPSHOT.
http://code.google.com/p/snakeyaml/source/detail?r=a03784312f5beb3031fd8a08b47c16c6bff1f404
Please give it a try.
This change has also affected issue 137. I think it has become more
consistent. Hopefully the reporter of issue 137 can give some feedback on
this change.
Original comment by py4fun@gmail.com
on 5 Sep 2012 at 8:09
Interestingly I'm finding that if I set my HTTP response body to be the
incorrectly decoded String then SnakeYAML 1.11-SNAPSHOT will dump it as binary.
When I read it back the special character doesn't match but it no longer falls
over.
If I set the HTTP response body as bytes then SnakeYAML is dumping it as a
string (probably because I'm doing some stuff in my Representer implementation).
I need to do some more investigation, as I'm also getting inconsistent results
between running tests from the terminal using Gradle and running them within
IntelliJ IDEA; presumably there's a different default charset I'm not
accounting for.
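A common way to rule out this kind of difference is to pass the charset explicitly wherever bytes and Strings are converted, so that nothing depends on how the JVM was launched. The sketch below is only an illustration with hypothetical names; choosing UTF-8 is an assumption, and the right charset for an HTTP body would normally come from the response headers.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: conversions that do not depend on the platform default.
final class ExplicitCharset {
    private ExplicitCharset() {}

    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    static String decode(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    /** Name of the default charset of the running JVM, useful for comparing environments. */
    static String platformDefault() {
        return Charset.defaultCharset().name();
    }
}
```

Printing ExplicitCharset.platformDefault() from both the Gradle run and the IDE run would show whether the two environments really differ.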
Original comment by robert.w...@gmail.com
on 5 Sep 2012 at 8:47
If you think there is still something to do for SnakeYAML, please let us know.
Otherwise, we can close the issue and release version 1.11.
Original comment by py4fun@gmail.com
on 5 Sep 2012 at 3:10
I think this can be closed. SnakeYAML is doing the right thing.
Original comment by robert.w...@gmail.com
on 5 Sep 2012 at 3:16
The fix will be provided in version 1.11.
Original comment by py4fun@gmail.com
on 5 Sep 2012 at 5:05
Original issue reported on code.google.com by
robert.w...@gmail.com
on 31 Aug 2012 at 4:37