mailgun / talon

Apache License 2.0

Use regex match to detect outlook 2007, 2010, 2013 #169

Closed: Savageman closed this 5 years ago

Savageman commented 6 years ago

I encountered a variant of the Outlook quotation format with a space after the semicolon.

To avoid multiplying the number of rules, I implemented a regex match instead (I found out how to do it here: https://stackoverflow.com/a/34093801/211204).

I documented all the different variants as cleanly as I could.
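
A minimal sketch of the idea, for readers following along: a single regular expression with an optional space after each semicolon can stand in for several exact-match rules. It uses Python's standard `re` module, which may differ from the matching mechanism the patch actually uses. The pattern, the sample style strings, and the helper name `is_outlook_splitter` are illustrative assumptions, not the code from this PR.

```python
import re

# Hypothetical pattern (an assumption, not the PR's actual rule):
# "; ?" accepts an optional space after each semicolon, and
# (?:in|cm) accepts either padding unit.
OUTLOOK_SPLITTER_STYLE = re.compile(
    r"border:none; ?border-top:solid #[0-9A-Fa-f]{6} 1\.0pt; ?"
    r"padding:3\.0pt 0(?:in|cm) 0(?:in|cm) 0(?:in|cm)"
)


def is_outlook_splitter(style):
    """Return True if a style attribute matches the sketch pattern exactly."""
    return OUTLOOK_SPLITTER_STYLE.fullmatch(style) is not None


# Illustrative style strings: no space, space after semicolons, unrelated.
samples = [
    "border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0cm 0cm 0cm",
    "border:none; border-top:solid #E1E1E1 1.0pt; padding:3.0pt 0in 0in 0in",
    "padding-top: 5px; border-top-width: 1px;",
]

for style in samples:
    print(is_outlook_splitter(style), style)
```

With exact string comparison, each spacing, color, and unit combination would need its own rule. Here one pattern covers them, at the cost of a rule that is harder to read, which is why documenting the variants next to it helps.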

mailgun-ci commented 6 years ago

Can one of the admins verify this patch?

obukhov-sergey commented 5 years ago

@mailgun-ci test this please

obukhov-sergey commented 5 years ago

@Savageman awesome job! Sorry for the delay with the merge. Can you fix the test, please:

....F....................................................................................
======================================================================
FAIL: tests.html_quotations_test.test_windows_mail_reply
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/var/lib/jenkins/shiningpanda/jobs/b20cb595/virtualenvs/d41d8cd9/local/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/var/lib/jenkins/workspace/lib-py-talon-pr/tests/html_quotations_test.py", line 351, in test_windows_mail_reply
    extract_reply_and_check("tests/fixtures/html_replies/windows_mail.html")
  File "/var/lib/jenkins/workspace/lib-py-talon-pr/tests/html_quotations_test.py", line 319, in extract_reply_and_check
    RE_WHITESPACE.sub('', plain_reply))
AssertionError: 'Hi.Iamfine.Thanks,Alex' != u'Hi.Iamfine.Thanks,Alex\u041e\u0442:AlexanderL(mailto:abc@example.com)\u041e\u0442\u043f\u0440\u0430\u0432\u043b\u0435\u043d\u043e:\xa0\u200e\u0447\u0435\u0442\u0432\u0435\u0440\u0433\u200e,\u200e26\u200e\u200e\u0438\u044e\u043d\u044f\u200e\u200e2014\u200e\u0433.\u200e15\u200e:\u200e05\u041a\u043e\u043c\u0443:Alex(mailto:alex-ninja@example.com)Hello!Howareyou?Thanks,Sasha.'

It's failing because you forgot the "|" between the Outlook and Windows Mail formats.
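
To illustrate how a missing "|" can go unnoticed, here is a rough sketch. It assumes the two formats are written as adjacent string literals inside one expression. The XPath-like fragments are simplified and hypothetical, not the expressions from this PR. Python joins adjacent literals silently, so dropping the separator raises no error; it just produces one longer expression.

```python
# Hypothetical, simplified fragments (assumptions for illustration only).
broken = (
    "//div[@class='outlook-splitter']"       # no trailing "|" here
    "//div[@class='windows-mail-splitter']"
)

fixed = (
    "//div[@class='outlook-splitter']|"      # "|" separates the alternatives
    "//div[@class='windows-mail-splitter']"
)

# Adjacent literals are concatenated, so `broken` is a single path,
# while `fixed` splits cleanly into two alternatives.
print(broken)
print(fixed.split("|"))
```

In XPath, the concatenated form reads as "a Windows Mail splitter nested inside an Outlook splitter", so a message with only the Windows Mail splitter would not match. That would be consistent with the quoted text staying in the reply in the failing test above.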

obukhov-sergey commented 5 years ago

@mailgun-ci test this please