closeio / quotequail

a library that identifies quoted text in email messages
MIT License
58 stars, 23 forks

Use end instead of start #25

Open andreip opened 6 years ago

andreip commented 6 years ago

These changes don't alter the current behavior, but, as mentioned in the commit messages, it makes more logical sense to use end instead of start: find_unwrap_start could return start != end, and in those cases we'd want to start looking for headers from end + 1 rather than from start + 1. Furthermore, extract_headers fails if the headers don't begin on the first line it is given, which is another argument for using end + 1.

Do you think I should add some artificially constructed tests for these updates? As mentioned in the comments, some of them would look like:

---------- Forwarded
message ----------
From: Someone <noreply@example.com>
Date: Fri, Apr 26, 2013 at 8:13 PM
Subject: Weekend Spanish classes

and

On 2012-10-16 at 17:02 ,
Someone <someone@example.com> wrote:

Some quoted text

Notice that in both cases the reply/forward header line spans two lines, so find_unwrap_start returns end != start. That shouldn't really happen in real emails, though, since such messages look poorly formatted and email clients would probably fix this?
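
To make the start vs. end distinction concrete, here is a minimal, self-contained sketch using the wrapped forward example above. The helpers find_marker_span and read_headers and both regular expressions are hypothetical stand-ins written for this illustration; they are not quotequail's find_unwrap_start or extract_headers, and their signatures are assumptions.

```python
import re

# Hypothetical patterns for illustration only; quotequail's real patterns differ.
FORWARD_RE = re.compile(r"^-+ Forwarded message -+$")
HEADER_RE = re.compile(r"^([A-Za-z-]+):\s*(.*)$")


def find_marker_span(lines, max_wrap_lines=2):
    """Return (start, end) of a forward marker that may wrap across lines.

    Hypothetical stand-in: joins up to `max_wrap_lines` consecutive lines
    and checks whether the joined text matches the marker pattern.
    """
    for start in range(len(lines)):
        for width in range(1, max_wrap_lines + 1):
            candidate = " ".join(lines[start:start + width])
            if FORWARD_RE.match(candidate):
                return start, start + width - 1
    return None


def read_headers(lines, first):
    """Collect consecutive 'Name: value' lines beginning at index `first`.

    Stops at the first line that is not a header, so it returns nothing
    when the line at `first` is not itself a header.
    """
    headers = {}
    for line in lines[first:]:
        match = HEADER_RE.match(line)
        if not match:
            break
        headers[match.group(1)] = match.group(2)
    return headers


lines = [
    "---------- Forwarded",
    "message ----------",
    "From: Someone <noreply@example.com>",
    "Date: Fri, Apr 26, 2013 at 8:13 PM",
    "Subject: Weekend Spanish classes",
]

start, end = find_marker_span(lines)

# start + 1 points at the second half of the wrapped marker, not at a header.
headers_from_start = read_headers(lines, start + 1)

# end + 1 points at the first real header line.
headers_from_end = read_headers(lines, end + 1)
```

In this sketch the marker spans two lines, so reading from start + 1 lands on "message ----------" and finds no headers, while reading from end + 1 picks up From, Date, and Subject. When the marker fits on one line, start equals end and both offsets agree, which is consistent with the change not affecting existing behavior.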