dcposch / scramble

Secure email for everyone
http://dcposch.github.io/scramble/

Create our own unique ID for each email #59

Closed dcposch closed 10 years ago

dcposch commented 10 years ago

I think we should just use SHA1(entire SMTP DATA section) as the email ID, instead of using the Message-ID header.

After forwarding all my Gmail mail to Scramble for two days, it turns out that not all senders use good UUIDs for the Message-ID, and that it is not always unique :/
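A minimal sketch of the proposed ID, assuming the raw SMTP DATA section is already available as a byte slice. The function name `emailID` and the sample message are hypothetical and not taken from the Scramble codebase.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// emailID returns the hex-encoded SHA-1 digest of the raw SMTP DATA
// payload (headers and body exactly as received). The name is
// hypothetical; it only illustrates the idea from the comment above.
func emailID(data []byte) string {
	sum := sha1.Sum(data)
	return hex.EncodeToString(sum[:])
}

func main() {
	// Sample message made up for illustration.
	raw := []byte("Message-ID: <1@example.com>\r\nSubject: hi\r\n\r\nhello\r\n")
	fmt.Println(emailID(raw))
}
```

Because the digest covers the whole payload, two deliveries share an ID only if they are byte-for-byte identical, regardless of what the sender put in Message-ID.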

jaekwon commented 10 years ago

That might break thread compatibility with other mail systems. We can always generate a new ID if we detect a duplicate, as a fallback mechanism.

dcposch commented 10 years ago

The Message-ID would still be used for threading. I meant a new ID field that we use to uniquely identify emails, with no change to the threading logic.

DC

jaekwon commented 10 years ago

OK, makes sense. The box table should be updated (message_id -> email_id) with a new foreign key constraint on email_id.

jaekwon commented 10 years ago

Can you show me an email from a non-Scramble server that has a duplicate Message-ID? All the ones I see in the logs are from Scramble, probably due to people hitting the "send" button multiple times. (This is the desired behavior on the server, though we should improve the UI to prevent sending the same message twice.)

If there are actual messages in the wild with duplicate Message-IDs, I wonder whether their content actually differs.

dcposch commented 10 years ago

Yeah, you're right. We can punt on having a true unique ID for now: as far as I can tell from all the mail Scramble.io has received so far, Message-ID is in fact unique whenever it's specified.

There are a few ways we could receive the same email (identical Message-ID, identical From, To, Subject, Body) multiple times:

dcposch commented 10 years ago

I've closed the issue for now because we handle such duplicates gracefully: as long as the other fields (From, To, Subject, and the body) also match, the duplicate is simply ignored.
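The duplicate handling described above could look roughly like the following sketch. The `Email`, `Store`, and `Result` types are hypothetical, the in-memory map stands in for whatever storage Scramble actually uses, and the `Conflict` outcome for a reused Message-ID with different content is an assumption, since the thread does not say what happens in that case.

```go
package main

import "fmt"

// Email holds the fields the thread says are compared. Hypothetical type.
type Email struct {
	MessageID, From, To, Subject, Body string
}

// Result describes what Receive did with an incoming email.
type Result int

const (
	Stored           Result = iota // first time this Message-ID was seen
	IgnoredDuplicate               // same Message-ID and all other fields match
	Conflict                       // same Message-ID, different content (assumed outcome)
)

// Store is an in-memory stand-in for the real mail storage.
type Store struct {
	byMessageID map[string]Email
}

func NewStore() *Store {
	return &Store{byMessageID: make(map[string]Email)}
}

// Receive stores a new email, ignores an exact duplicate, and reports
// a conflict when the Message-ID is reused with different content.
func (s *Store) Receive(e Email) Result {
	existing, seen := s.byMessageID[e.MessageID]
	if !seen {
		s.byMessageID[e.MessageID] = e
		return Stored
	}
	if existing == e { // struct equality compares every field
		return IgnoredDuplicate
	}
	return Conflict
}

func main() {
	s := NewStore()
	e := Email{"<1@example.com>", "a@example.com", "b@example.com", "hi", "hello"}
	fmt.Println(s.Receive(e) == Stored)           // first delivery
	fmt.Println(s.Receive(e) == IgnoredDuplicate) // identical redelivery
}
```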