prowide / prowide-iso20022

Comprehensive business model and parser for all ISO 20022 messages
https://www.prowidesoftware.com
Apache License 2.0

Characters transformed into unicode decimal codes during initialization of MxSwiftMessage #24

Closed vajnaivik closed 3 years ago

vajnaivik commented 3 years ago

When initializing MxSwiftMessage from MxSeev04500102 or MxSeev04800101, some characters are replaced by decimal numeric character references of the form &#1234; (ampersand, hash, decimal code, semicolon).

Example field: MxSeev04800101 - ShrhldrIdDsclsrRspnCxlAdvc - IssrDsclsrReqRef - FinInstrmId - Desc

In the string returned by new MxSwiftMessage(mxSeev04800101).getMessage(), the character "ö" is replaced by &#246;

The same happens when trying to call MxSeev04800101.message()

vajnaivik commented 3 years ago

A workaround I found using org.apache.commons.text.StringEscapeUtils: new MxSwiftMessage(StringEscapeUtils.unescapeHtml4(mxSeev04500102.message()))
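Note that unescapeHtml4 also decodes named entities such as &amp;amp; and &amp;lt;, so it can turn escaped markup back into raw markup. As a narrower alternative, here is a minimal sketch that uses only the JDK and decodes just the decimal references above 0x7f. NumericRefDecoder and decode are hypothetical names, not part of Prowide or Commons Text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: decodes decimal character references above 0x7f only. */
class NumericRefDecoder {
    // Matches decimal references such as &#246; (hexadecimal &#x...; forms are not handled).
    private static final Pattern DECIMAL_REF = Pattern.compile("&#(\\d{1,7});");

    static String decode(String xml) {
        Matcher m = DECIMAL_REF.matcher(xml);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(xml, last, m.start());
            int codePoint = Integer.parseInt(m.group(1));
            if (codePoint > 0x7f && Character.isValidCodePoint(codePoint)) {
                out.appendCodePoint(codePoint);
            } else {
                // Keep ASCII references (e.g. &#60; for '<') escaped so the XML stays well-formed.
                out.append(m.group());
            }
            last = m.end();
        }
        out.append(xml, last, xml.length());
        return out.toString();
    }

    public static void main(String[] args) {
        String decoded = decode("<Desc>Malm&#246; &amp; Co</Desc>");
        System.out.println(decoded.equals("<Desc>Malm\u00f6 &amp; Co</Desc>")); // prints "true"
    }
}
```

The named entity &amp;amp; is left alone, so the result is still well-formed XML.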

serii833 commented 3 years ago

I have the same problem with non-Latin characters.

PaymentInstruction37 pi = new PaymentInstruction37()
                .setCdtr(new PartyIdentification135()
                        .setNm("текст текст")                       
                );

will end up in

<Doc:PmtInf>
    <Doc:Cdtr>
        <Doc:Nm>&#1057;&#8218;&#1056;&#181;&#1056;&#1108;&#1057;&#1027;&#1057;&#8218; &#1057;&#8218;&#1056;&#181;&#1056;&#1108;&#1057;&#1027;&#1057;&#8218;</Doc:Nm>
    </Doc:Cdtr>
</Doc:PmtInf>

zubri commented 3 years ago

From what I gather, encoding characters above 0x7f this way is fine. It is the same implementation as in com.sun.xml.bind.marshaller.DumbEscapeHandler.

Notice the XML is still valid and if you parse it back to an object the encoded characters will be properly decoded:

MxPain00100103 mx2 = MxPain00100103.parse(xml);
assertEquals("текст текст", mx2.getCstmrCdtTrfInitn().getPmtInf().get(0).getCdtr().getNm());
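The same round trip can be checked without the Prowide classes. The sketch below uses only the JDK's DOM parser; NumericRefRoundTrip and textOfRoot are hypothetical names chosen for illustration, and the sample element is made up.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical demo: an XML parser resolves numeric character references on read. */
class NumericRefRoundTrip {
    /** Parses the XML and returns the text content of its root element. */
    static String textOfRoot(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Decimal references for the Cyrillic word "текст" (U+0442 U+0435 U+043A U+0441 U+0442).
        String xml = "<Nm>&#1090;&#1077;&#1082;&#1089;&#1090;</Nm>";
        System.out.println("\u0442\u0435\u043a\u0441\u0442".equals(textOfRoot(xml))); // prints "true"
    }
}
```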

I will add a configuration option to switch between different escape handler implementations.

zubri commented 3 years ago

I've added a new API to customize the escape handler. It is available in release 9.1.7, already on GitHub and soon on Maven Central.

You can check how it is used in the test case: https://github.com/prowide/prowide-iso20022/blob/b7b0c8ab6090378af3f8fed8529f7918e52746b2/iso20022-core/src/test/java/com/prowidesoftware/issues/Issue24.java#L47

You basically pass a configuration to the message serialization method. Within the configuration you can override the default escape handler; the MinimumEscapeHandler will do what you expect.
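To illustrate the idea of a replaceable escape handler, here is a self-contained sketch using only the JDK. It is not the Prowide API: EscapeSketch, WriteOptions, escapeNonAscii, escapeMinimum and element are all hypothetical names, and the two functions only approximate the behaviors discussed in this thread. See the linked test case for the actual usage.

```java
import java.util.function.Function;

/** Hypothetical sketch of a serializer whose escape handler can be swapped via configuration. */
class EscapeSketch {
    /** Escapes markup characters and writes every code point above 0x7f as a decimal reference. */
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp == '&') out.append("&amp;");
            else if (cp == '<') out.append("&lt;");
            else if (cp == '>') out.append("&gt;");
            else if (cp > 0x7f) out.append("&#").append(cp).append(';');
            else out.appendCodePoint(cp);
        });
        return out.toString();
    }

    /** Escapes only the characters that would break the markup; other characters pass through. */
    static String escapeMinimum(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Hypothetical write configuration; the default mirrors the behavior reported in this issue. */
    static class WriteOptions {
        Function<String, String> escapeHandler = EscapeSketch::escapeNonAscii;
    }

    /** Serializes a single element, delegating text escaping to the configured handler. */
    static String element(String name, String value, WriteOptions options) {
        return "<" + name + ">" + options.escapeHandler.apply(value) + "</" + name + ">";
    }

    public static void main(String[] args) {
        String value = "Malm\u00f6 & Co";

        WriteOptions defaults = new WriteOptions();
        System.out.println(element("Nm", value, defaults).equals("<Nm>Malm&#246; &amp; Co</Nm>")); // prints "true"

        WriteOptions minimum = new WriteOptions();
        minimum.escapeHandler = EscapeSketch::escapeMinimum; // override the default handler
        System.out.println(element("Nm", value, minimum).equals("<Nm>Malm\u00f6 &amp; Co</Nm>")); // prints "true"
    }
}
```

With the minimum handler the output contains the raw character, so whatever writes the string to a file or stream has to use an encoding that can represent it, such as UTF-8.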