quentez / talonjs

JavaScript port of the Talon email quote parsing library.
MIT License

Wrong part of message removed #9

Open quentez opened 7 years ago

quentez commented 7 years ago

msg_4vok0e msg_52774k

evanostroski commented 7 years ago

msg_8spziv

evanostroski commented 7 years ago

msg_90bufr

evanostroski commented 7 years ago

msg_9b6xgp

evanostroski commented 7 years ago

msg_9df3oc

evanostroski commented 7 years ago

msg_9os3va

evanostroski commented 7 years ago

msg_9omp0r

evanostroski commented 7 years ago

msg_9qd3t1

evanostroski commented 7 years ago

msg_9q5uqp

xdmnl commented 7 years ago

msg_9vmm0p

divmgl commented 7 years ago

I cannot view any of these messages (except for msg_9vmm0p) in HQ.

This issue is a little more complicated than it seems at first glance. In msg_9vmm0p, the customer uses a blockquote tag in the body of a normal email message. TalonJS removes the last blockquote tag it finds in the message body, so a blockquote that the customer wrote in a normal email message gets stripped.

The question then becomes: how do we know when a blockquote tag should be removed? According to the W3C conventions for email threading, a blockquote tag that has a cite attribute should be treated as containing the body of a threaded email. However, I can't find any examples of emails in Front where blockquote tags have a cite attribute.

> When quoting a message during a reply/forward, it is recommended that the text be encapsulated with BLOCKQUOTE elements, with a CITE attribute identifying the message being quoted, and optionally a CLASS attribute defining default style information. BLOCKQUOTEs explicitly authored by the user should not have an CITE, or should have a CITE pointing to the current message, so that they can be distinguished from message excerpts.
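To make the convention concrete, here is a minimal sketch of the check it implies. The node shape (`{ tag, attrs, children }`), the function name `isMessageExcerpt`, and the `mid:` URIs are all hypothetical and used only for illustration; this is not how TalonJS represents or inspects messages.

```javascript
// Hypothetical node shape for illustration: { tag, attrs, children }.
// This is NOT TalonJS's actual data model or API.

/**
 * Returns true when a blockquote looks like an excerpt of an earlier message
 * under the quoted convention: it has a cite attribute, and that cite does
 * not point at the current message.
 */
function isMessageExcerpt(node, currentMessageUri) {
  if (node.tag !== "blockquote") return false;
  const cite = node.attrs && node.attrs.cite;
  if (!cite) return false; // no cite: authored by the user
  return cite !== currentMessageUri; // cite to itself: also authored by the user
}

const currentUri = "mid:current@example.com"; // assumed identifier format
const userQuote = { tag: "blockquote", attrs: {}, children: [] };
const selfCite = { tag: "blockquote", attrs: { cite: currentUri }, children: [] };
const replyQuote = {
  tag: "blockquote",
  attrs: { cite: "mid:previous@example.com" },
  children: [],
};

console.log(isMessageExcerpt(userQuote, currentUri)); // false
console.log(isMessageExcerpt(selfCite, currentUri)); // false
console.log(isMessageExcerpt(replyQuote, currentUri)); // true
```

A check like this only helps when mail clients actually emit the cite attribute, so on its own it would not resolve the messages reported in this thread.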

With Gmail, things are a little different. Gmail adds the gmail_quote class to blockquote tags, and TalonJS removes every instance of that class it finds, even when the blockquote was supplied by the user. Technically, we should look for the gmail_extra class instead, since threaded emails appear to be contained inside a node with the gmail_extra class.
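A rough sketch of that idea follows: strip gmail_quote nodes only when they sit under a gmail_extra wrapper. It assumes Gmail really does wrap threaded content that way, which has not been verified here. The node shape (`{ tag, classes, text, children }`) and the function names are hypothetical, not TalonJS internals.

```javascript
// Hypothetical node shape: { tag, classes, text, children }.
// This is NOT TalonJS's real data model.

function hasClass(node, name) {
  return Array.isArray(node.classes) && node.classes.includes(name);
}

/**
 * Returns a copy of the tree with gmail_quote nodes removed, but only when
 * they are descendants of a gmail_extra node. gmail_quote nodes elsewhere
 * (for example, a quote the user wrote in the message body) are kept.
 */
function stripThreadedQuotes(node, insideExtra = false) {
  const inExtra = insideExtra || hasClass(node, "gmail_extra");
  const children = (node.children || [])
    .filter((child) => !(inExtra && hasClass(child, "gmail_quote")))
    .map((child) => stripThreadedQuotes(child, inExtra));
  return { ...node, children };
}

const body = {
  tag: "div",
  classes: [],
  children: [
    // A quote the user placed in the body: should survive.
    { tag: "blockquote", classes: ["gmail_quote"], text: "quoted by the user", children: [] },
    {
      tag: "div",
      classes: ["gmail_extra"],
      children: [
        // The threaded earlier message: should be removed.
        { tag: "blockquote", classes: ["gmail_quote"], text: "earlier message", children: [] },
      ],
    },
  ],
};

const cleaned = stripThreadedQuotes(body);
console.log(cleaned.children[0].text); // "quoted by the user"
console.log(cleaned.children[1].children.length); // 0
```

The function returns a new tree rather than mutating its input, which makes it easy to compare the message before and after stripping when debugging reports like the ones above.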

cc @quentez

divmgl commented 7 years ago

Hey @evanostroski, when possible, please grant me access to the conversations you've left on this thread.

jboga commented 6 years ago

Front conversation from the customer: cnv_a0ne9d, so we can reply to him when it's fixed. Messages with the error: msg_k92qq1, msg_kdc81l, msg_k9jb09.

eramdam commented 6 years ago

msg_x90xoh: an image at the beginning of the message is being treated as part of the quote.

xdmnl commented 6 years ago

msg_1aqt8ll: nested quotes.