finos / symphony-bdk-python

Symphony Python Bot Development Kit (BDK)
https://symphony-bdk-python.finos.org/
Apache License 2.0

Messages containing HTML entities such as &nbsp; cause a MessageParserException #207

Closed by jimbodunc 3 years ago

jimbodunc commented 3 years ago

Parsing any message containing HTML entities fails in the function get_text_content_from_message:

https://github.com/SymphonyPlatformSolutions/symphony-api-client-python/blob/a15462a88ea21bc95c5ca28d92f5069c1a0d7369/symphony/bdk/core/service/message/message_parser.py#L24-L28

The issue is that defusedxml won't accept non-XML entities by default.

This is a particular problem because tables copied and pasted from Excel into a Symphony chat render empty cells as &nbsp;

For example, a table pasted from Excel arrives as:

<div data-format="PresentationML" data-version="2.0" class="wysiwyg">
    <p>
        <table class="pasted-table">
            <thead>
                <tr>
                    <th>/bilat-barc-dev</th>
                    <th>Bid</th>
                    <th>Mid</th>
                    <th>Ask</th>
                </tr>
            </thead>
            <tbody>
                <tr>
                    <td>&nbsp;</td>
                    <td>1.114</td>
                    <td>1.893</td>
                    <td>0.291</td>
                </tr>
            </tbody>
        </table>
    </p>
</div>
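
The failure described above can be reproduced outside the BDK. The sketch below is a hedged illustration, not the BDK's own code: it uses the standard library's xml.etree.ElementTree as a stand-in for defusedxml (an assumption made so the snippet has no third-party dependency; both rely on an expat-based parser), and `parses` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def parses(fragment: str) -> bool:
    """Return True if the fragment is accepted by the XML parser."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        # Raised for references to entities that XML does not define.
        return False


# &nbsp; is an HTML entity; XML itself only predefines
# &amp; &lt; &gt; &quot; and &apos;.
named = "<td>&nbsp;</td>"
# The same character written as a numeric character reference.
numeric = "<td>&#160;</td>"
# One of XML's predefined entities.
predefined = "<td>&amp;</td>"

for label, fragment in [("named", named), ("numeric", numeric), ("predefined", predefined)]:
    print(label, parses(fragment))
```

Only the named HTML entity is rejected; the numeric reference and the predefined XML entity both parse.
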
jimbodunc commented 3 years ago

I noticed that the same table pasted into Symphony 1.5 does not contain &nbsp; (the empty cell arrives as &#160; instead), so this might (also) be a platform bug.

<div data-format="PresentationML" data-version="2.0" class="wysiwyg">
    <p>
        <table class="pasted-table">
            <thead>
                <tr>
                    <th>/bilat-barc-dev</th>
                    <th>Bid</th>
                    <th>Mid</th>
                    <th>Ask</th>
                </tr>
            </thead>
            <tbody>
                <tr>
                    <td>&#160;</td>
                    <td>1.114</td>
                    <td>1.893</td>
                    <td>0.291</td>
                </tr>
                <tr>
                    <td>2</td>
                    <td>1.265</td>
                    <td>1.304</td>
                    <td>1.727</td>
                </tr>
            </tbody>
        </table>
    </p>
</div>
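
The difference between the two payloads suggests one possible client-side workaround: rewrite named HTML entities as numeric character references before parsing. The sketch below is only an illustration under stated assumptions and is not necessarily what the merged fix does. The names `html_entities_to_numeric` and `text_content` are hypothetical, and the standard library's xml.etree.ElementTree again stands in for defusedxml.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# Entities that XML already understands; these must be left alone.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Matches named entity references such as &nbsp; (not numeric ones like &#160;).
_NAMED_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def html_entities_to_numeric(markup: str) -> str:
    """Rewrite known HTML named entities as numeric character references."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            # Keep XML entities and unknown names untouched.
            return match.group(0)
        return f"&#{name2codepoint[name]};"

    return _NAMED_ENTITY.sub(replace, markup)


def text_content(markup: str) -> str:
    """Parse the markup and return its concatenated text nodes."""
    root = ET.fromstring(html_entities_to_numeric(markup))
    return "".join(root.itertext())


row = "<tr><td>&nbsp;</td><td>1.114</td></tr>"
print(html_entities_to_numeric(row))
print(repr(text_content(row)))
```

Unknown entity names are deliberately left as they are, so a genuinely malformed message would still fail to parse rather than being silently altered.
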
symphony-mariacristina commented 3 years ago

Hello @jimbodunc! Thanks for highlighting this issue; your PR has been merged. Thanks for your contribution!

symphony-mariacristina commented 3 years ago

@jimbodunc We released a new BDK 2.0 version; your fix is now available in version 2.0b5: https://github.com/SymphonyPlatformSolutions/symphony-api-client-python/releases/tag/2.0b5

jimbodunc commented 3 years ago

@symphony-mariacristina Great, thanks!