Open glassfishrobot opened 10 years ago
Reported by glam
yaroska said: I know about the difference. But why are the new lines a problem in your case?
glam said: Well, in our case we needed the output not to have new lines. That behaviour is implemented by IndentingUTF8XmlOutput, and not implemented elsewhere.
For example, we have a unit-test that fails because of that - we process the document multiple times, sometimes by writers, sometimes by output streams, and the outputs don't match.
Was assigned to yaroska
This issue was imported from java.net JIRA JAXB-985
We discovered a difference in the behaviour of the marshaller when using a Writer and when using an OutputStream.
In our case, we needed to marshall (formatted):
This worked when using an OutputStream, but new lines were inserted when using a Writer.
On investigation, I noticed that com.sun.xml.bind.v2.runtime.MarshallerImpl has two distinct methods called createWriter(..): one taking a Writer, and one taking an OutputStream. The version taking an OutputStream does something special when the encoding is UTF-8 (why?). The difference is in the use of IndentingUTF8XmlOutput, which correctly implements the desired functionality: whenever there is text content before an element, no new line and indentation are added.
However, OutputStream+UTF-8 is the only case in which the *UTF8XmlOutput classes are used. The indentation behaviour might be just one example of a behaviour difference, so I would suggest making the behaviour consistent (i.e. working for both the OutputStream and Writer implementations, and preferably for all encodings).
And while at it, in that class there is one check if (encoding.equals("UTF-8")) {..} and one if (encoding.startsWith("UTF")) {..}. This is pretty bad, as the first check doesn't work with lower-case utf-8 or with the name written without a dash, and the two checks yield different results for such spellings: for UTF8 one if-clause will match, the other won't. If different behaviour based on encoding is really needed (I would say it shouldn't be), then please normalize/canonicalize the string first, for example via Charset.forName(..).
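To illustrate the suggestion, here is a minimal sketch of a canonicalizing check based on Charset.forName(..). The class name EncodingCheck and the helper isUtf8 are hypothetical names made up for this example; they are not part of MarshallerImpl or of JAXB. The main method prints, for a few spellings, what the two string checks quoted above return next to the canonical check.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of the JAXB runtime.
final class EncodingCheck {

    // Returns true if the given encoding name resolves to UTF-8.
    // Charset.forName matches names case-insensitively and knows the
    // registered aliases, so "utf-8" and "UTF8" resolve to the same charset.
    static boolean isUtf8(String encoding) {
        try {
            return StandardCharsets.UTF_8.equals(Charset.forName(encoding));
        } catch (IllegalArgumentException e) {
            // Covers IllegalCharsetNameException, UnsupportedCharsetException
            // and a null name; all are treated as "not UTF-8" here.
            return false;
        }
    }

    public static void main(String[] args) {
        String[] names = {"UTF-8", "utf-8", "UTF8", "utf8", "UTF-16"};
        for (String name : names) {
            System.out.println(name
                    + " equals(\"UTF-8\")=" + name.equals("UTF-8")
                    + " startsWith(\"UTF\")=" + name.startsWith("UTF")
                    + " canonical=" + isUtf8(name));
        }
    }
}
```

Treating an unknown or invalid name as "not UTF-8" is a choice made for this sketch; real code might prefer to report the bad encoding name instead of silently falling back.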
Affected Versions
[2.2.7]