let4be closed this issue 9 years ago
Email threading is not easy, but needs to be dealt with
I strongly recommend not using the algorithm as described in your link. In step 5, Jamie Zawinski groups emails that have no References headers by subject, which would again cause https://github.com/lavab/web/issues/552 and https://github.com/lavab/api/issues/129. He complains that Netscape 4.0 removed this step, but removing it was exactly the right thing to do.
An alternative suggestion: If you use the algorithm, remove step 5. Use a disjoint-set data structure to group together all emails that reference each other, then sort each group by references order, by date, or by incoming server time stamp. The time stamp for sorting the threads in the inbox could be the latest time an email in the thread was sent.
Clients which have their clock messed up and send emails dated 1.1.1970 would have their emails sorted to the end of the inbox, where the user is likely never to see them. One could protect against this by using the incoming server time stamp instead.
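To make the suggestion concrete, here is a rough sketch of the grouping in Python. It is not the mailer's actual code: the `Email` class, its field names, and `group_threads` are made up for illustration, and I assume `references` already holds the IDs taken from the References and In-Reply-To headers and `received_at` is the incoming server time stamp.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Email:
    # Hypothetical record; field names are assumptions for this sketch.
    message_id: str
    references: List[str] = field(default_factory=list)
    received_at: float = 0.0  # incoming server time stamp, not the client "Date"


class DisjointSet:
    """Minimal union-find over message IDs."""

    def __init__(self) -> None:
        self.parent: Dict[str, str] = {}

    def find(self, x: str) -> str:
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path directly at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a: str, b: str) -> None:
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


def group_threads(emails: List[Email]) -> List[List[Email]]:
    ds = DisjointSet()
    for e in emails:
        ds.find(e.message_id)  # register emails that reference nothing
        for ref in e.references:
            ds.union(e.message_id, ref)  # ref may be an ID we never received

    threads: Dict[str, List[Email]] = {}
    for e in emails:
        threads.setdefault(ds.find(e.message_id), []).append(e)

    # Oldest first inside a thread, newest thread first in the inbox.
    result = [sorted(t, key=lambda e: e.received_at) for t in threads.values()]
    result.sort(key=lambda t: t[-1].received_at, reverse=True)
    return result
```

Subject lines are never looked at, so two unrelated emails with the same subject stay in separate threads. Two replies to a message that is missing from the mailbox still end up together, because both are joined to the same missing ID.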
I implemented my own adaptation of that algorithm. I only need to tune it a little bit, as I accidentally introduced some bugs back then.
You mean the one in https://github.com/lavab/mailer/blob/master/handler/handler.go?
Fixed in b0bcf390f57401d17f0d21613802f7ad50ee019f
@mlooz I think that for now it's good enough. The mailer doesn't use the "Date" field anyway, which might cause issues. We'll bring it up later, when we're less busy.
Details: https://github.com/lavab/web/issues/594