playframework / play-ws

Standalone Play WS, an async HTTP client with fluent API
https://www.playframework.com/documentation/latest/JavaWS
Apache License 2.0

WSRequest.post(Document) set incorrect encoding in the XML Prolog #164

Open guofengzh opened 7 years ago

guofengzh commented 7 years ago

I use JAXB to marshal a Java object to a Document, then use WSRequest.post() to send the document to a server. When marshaling, I set the encoding to EUC_KR:

...
Document document = db.newDocument();
marshaller.setProperty(Marshaller.JAXB_ENCODING, "EUC_KR");
marshaller.marshal(obj, document);

The POST request is:

ws.url(registerProductUrl)
.addHeader("Content-Type", "text/xml;charset=euc-kr")
.post(document);

This works well in Play 2.5.14, where the prolog in the request is <?xml version="1.0" encoding="EUC-KR"?>

After upgrading to Play 2.6.0, the prolog in the request changed to <?xml version="1.0" encoding="UTF-8"?>; that is, the encoding is no longer "EUC-KR".

I could not find where WSRequest.post(Document) is implemented, but I found that play.libs.ws.XML.toBytes(Document document) has a bug: it does not set the encoding when transforming an XML document.

The following project reproduces the issue: https://github.com/guofengzh/play-xml-prolog-issue

See JaxbText.java in test/controllers. JaxbText.XMLtoBytesTest() uses XML.toBytes() to reproduce the issue. The prolog it generates is <?xml version="1.0" encoding="UTF-8"?> (the correct encoding should be "EUC-KR").

JaxbText.myToBytesTest() uses a modified toBytes(), which sets the encoding using transformer.setOutputProperty(OutputKeys.ENCODING, encoding); and generates the correct encoding in the prolog: <?xml version="1.0" encoding="EUC-KR"?>
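
For readers who do not want to clone the project, the modified toBytes() described above can be sketched roughly as follows. This is not the code from the linked repository or from Play: the class name XmlEncodingSketch, the byte[] return type (instead of ByteString), and the "product" element are assumptions made for illustration, and it relies only on the standard javax.xml.transform API.

```java
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// Hypothetical helper class, named here only for illustration.
public class XmlEncodingSketch {

    // Serializes the document. If encoding is non-null, it is passed to the
    // transformer so that it is used for the bytes and written into the prolog.
    // If encoding is null, the transformer's default is left untouched.
    static byte[] toBytes(Document document, String encoding) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            if (encoding != null) {
                transformer.setOutputProperty(OutputKeys.ENCODING, encoding);
            }
            transformer.transform(new DOMSource(document), new StreamResult(out));
        } catch (TransformerException e) {
            throw new RuntimeException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // An empty DOM document with a single placeholder root element.
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        doc.appendChild(doc.createElement("product"));

        // Without an explicit encoding the transformer picks its own default.
        System.out.println(new String(toBytes(doc, null), "UTF-8"));
        // With an explicit encoding the prolog should name that encoding.
        System.out.println(new String(toBytes(doc, "EUC-KR"), "EUC-KR"));
    }
}
```

With the JDK's built-in transformer, the first line printed should carry a UTF-8 prolog and the second an EUC-KR prolog, which matches the difference between the two tests. Other TransformerFactory implementations on the classpath may behave differently.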

wsargent commented 7 years ago

I could not find where WSRequest.post(Document) is implemented, but I found that play.libs.ws.XML.toBytes(Document document) has a bug: it does not set the encoding when transforming an XML document.

That's not what you're calling though -- you're calling play.libs.XML:

import play.libs.XML;

https://github.com/guofengzh/play-xml-prolog-issue/blob/master/test/controllers/JaxbText.java#L10

wsargent commented 7 years ago

There is a change between Play 2.5.x and Play 2.6.x in the XML.java, in that the document builder is slightly different:

https://github.com/playframework/playframework/blob/2.6.x/framework/src/play/src/main/java/play/libs/XML.java#L72

https://github.com/playframework/playframework/blob/2.5.x/framework/src/play/src/main/java/play/libs/XML.java#L68

as part of https://github.com/playframework/playframework/pull/6342 but otherwise it doesn't look like the code there has changed.

wsargent commented 7 years ago

@guofengzh can you provide a reproducible use case showing

Document document = db.newDocument();
marshaller.setProperty(Marshaller.JAXB_ENCODING, "EUC_KR");
marshaller.marshal(obj, document);

ws.url(registerProductUrl)
.addHeader("Content-Type", "text/xml;charset=euc-kr")
.post(document);

being used in a test?

wsargent commented 7 years ago

In particular, the code for toBytes:

public static ByteString toBytes(Document document) {
    ByteStringBuilder builder = ByteString$.MODULE$.newBuilder();
    try {
        TransformerFactory.newInstance().newTransformer()
            .transform(new DOMSource(document), new StreamResult(builder.asOutputStream()));
    } catch (TransformerException e) {
        throw new RuntimeException(e);
    }
    return builder.result();
}

is the same in play-ws: https://github.com/playframework/play-ws/blob/master/play-ws-standalone-xml/src/main/java/play/libs/ws/XML.java#L92

and in 2.5.x: https://github.com/playframework/playframework/blob/2.5.x/framework/src/play/src/main/java/play/libs/XML.java#L93

and in 2.6.x: https://github.com/playframework/playframework/blob/2.6.x/framework/src/play/src/main/java/play/libs/XML.java#L97

guofengzh commented 7 years ago

See postTest() in test/controllers/JaxbText.java.
