mailgun / talon

Apache License 2.0

HTML Quote extraction appears to not be working #140

Open Mikejonesab12 opened 7 years ago

Mikejonesab12 commented 7 years ago

I ran both demos from the README: the plain-text extraction and the HTML extraction. The text extraction worked as expected, but the HTML extraction simply returned the original input unchanged.

I am using Python 3.6.1.

Any ideas?

obukhov-sergey commented 7 years ago

@Mikejonesab12 I couldn't reproduce this. Could you provide a code snippet?

Mikejonesab12 commented 7 years ago

Here is the code, taken almost verbatim from the README:

import talon
from talon import quotations

talon.init()

html = """Reply
<blockquote>

  <div>
    On 11-Apr-2011, at 6:54 PM, Bob &lt;bob@example.com&gt; wrote:
  </div>

  <div>
    Quote
  </div>

</blockquote>"""

#reply = quotations.extract_from(html, 'text/html')
reply = quotations.extract_from_html(html)
print(reply)

Printed output:

Reply
<blockquote>

  <div>
    On 11-Apr-2011, at 6:54 PM, Bob &lt;bob@example.com&gt; wrote:
  </div>

  <div>
    Quote
  </div>

</blockquote>

dimirc commented 7 years ago

I have the same problem: HTML quote extraction (gmail_quote) is not working.
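For anyone checking whether they are hitting the same behavior, a small sketch like the one below can confirm that the extractor's output really equals its input, rather than differing only in whitespace. The helper names `normalize` and `returned_unchanged` are hypothetical and not part of talon. The example uses stand-in values so it runs without talon installed.

```python
import re


def normalize(markup: str) -> str:
    """Collapse runs of whitespace so formatting-only differences are ignored."""
    return re.sub(r"\s+", " ", markup).strip()


def returned_unchanged(original: str, extracted: str) -> bool:
    """Return True if the extracted text equals the input, ignoring whitespace."""
    return normalize(original) == normalize(extracted)


html = """Reply
<blockquote>
  <div>
    On 11-Apr-2011, at 6:54 PM, Bob &lt;bob@example.com&gt; wrote:
  </div>
  <div>
    Quote
  </div>
</blockquote>"""

# With talon installed, `extracted` would be the result of
# quotations.extract_from_html(html), as in the snippet earlier in this thread.
# Stand-in values are used here instead (an assumption for illustration).
print(returned_unchanged(html, html))     # True: output identical to input
print(returned_unchanged(html, "Reply"))  # False: the quote was removed
```

If the check returns True for the real extractor output, the quoted block was not stripped, which matches the behavior reported above.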