ePADD / epadd

ePADD is a software package developed by Stanford University's Special Collections & University Archives that supports archival processes around the appraisal, ingest, processing, discovery, and delivery of email archives.
https://www.epaddproject.org

Addressbook computation algorithm change: Spec #338

Closed chinuhub closed 5 years ago

chinuhub commented 5 years ago

Provide a set of email addresses that are to be treated as trusted.

  1. When processing a message: if the sender is a trusted email address, call process contact with all name-email pairs (appearing anywhere in this message's header) marked as trusted. If the sender is not a trusted email address, call process contact with all name-email pairs (appearing anywhere in this message's header) marked as non-trusted.

For a non-trusted name-email pair, do not unify based on name (unification based on email is still done). After that, just add the name to the contact.

For a trusted name-email pair, perform the unification as it was done earlier.

Case 1.

Suppose we have two name-email pairs appearing in the archive: Fikes notification@linkedin.com and Moore notification@linkedin.com

Earlier, these two were unified because of the common email address, so the two different names were put in the same contact. As a result, both names ended up in the same contact as the proper names of Fikes, such as Richard Fikes, Richard L. Fikes, Fikes, and fikes@cs.stanford.edu.in

After the algorithm change, there will be a new contact with the email address notification@linkedin.com, and the names Fikes, Moore, etc. will be grouped together in that contact.

Case 2 [not addressed in the modified algorithm]: someone sends a mail to Dummy fikes@cs.stanford.edu, and fikes@cs.stanford.edu is a trusted email address.

If this mail was sent from a trusted name-email pair, the association is trusted and unification will be based on "Dummy" as well as on the email address.

If this mail was sent from a non-trusted name-email pair, the association is non-trusted, so no unification will take place based on "Dummy", but unification will still take place based on the email address. As a result, "Dummy" is added to the contact for which fikes@cs.stanford.edu is a valid email address.
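To make the rules and the two cases above concrete, here is a minimal Python sketch. It is not ePADD's actual code: the `Contact` and `AddressBook` classes, the method names, and the `example.org` addresses are all hypothetical. It also assumes that names seen only in non-trusted pairs are never registered for name-based lookup, which the spec does not state explicitly.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity-based equality, so merged contacts are tracked by object
class Contact:
    names: set = field(default_factory=set)
    emails: set = field(default_factory=set)


class AddressBook:
    """Hypothetical address book illustrating trusted vs. non-trusted unification."""

    def __init__(self, trusted_addresses):
        self.trusted = {a.lower() for a in trusted_addresses}
        self.contacts = []
        self.by_email = {}  # email -> Contact
        self.by_name = {}   # name -> Contact, trusted pairs only (assumption)

    def _merge(self, keep, drop):
        if keep is drop:
            return keep
        keep.names |= drop.names
        keep.emails |= drop.emails
        for index in (self.by_email, self.by_name):
            for key, contact in list(index.items()):
                if contact is drop:
                    index[key] = keep
        self.contacts.remove(drop)
        return keep

    def process_contact(self, name, email, trusted):
        email = email.lower()
        contact = self.by_email.get(email)  # email-based unification always applies
        if trusted and name:
            named = self.by_name.get(name.lower())  # name-based unification, trusted only
            if contact is None:
                contact = named
            elif named is not None:
                contact = self._merge(contact, named)
        if contact is None:
            contact = Contact()
            self.contacts.append(contact)
        contact.emails.add(email)
        self.by_email[email] = contact
        if name:
            contact.names.add(name)  # non-trusted names are only recorded
            if trusted:
                self.by_name[name.lower()] = contact
        return contact

    def process_message(self, sender_email, header_pairs):
        # Every pair in the header inherits the trust status of the sender.
        trusted = sender_email.lower() in self.trusted
        for name, email in header_pairs:
            self.process_contact(name, email, trusted)


book = AddressBook(trusted_addresses=["fikes@cs.stanford.edu"])
# Trusted sender: pairs unify on both name and email.
book.process_message("fikes@cs.stanford.edu",
                     [("Fikes", "fikes@cs.stanford.edu"),
                      ("Richard Fikes", "fikes@cs.stanford.edu")])
book.process_message("fikes@cs.stanford.edu",
                     [("Richard Fikes", "rfikes@example.org")])
# Case 1: non-trusted sender, names stay with the shared notification address.
book.process_message("notification@linkedin.com",
                     [("Fikes", "notification@linkedin.com")])
book.process_message("notification@linkedin.com",
                     [("Moore", "notification@linkedin.com")])
# Case 2: non-trusted sender writes to "Dummy" at a trusted address.
book.process_message("someone@example.org",
                     [("Someone", "someone@example.org"),
                      ("Dummy", "fikes@cs.stanford.edu")])
```

In this sketch the notification address ends up as its own contact holding the names Fikes and Moore, separate from the contact for fikes@cs.stanford.edu. That contact still picks up "Dummy" through email-based unification, which is the behaviour Case 2 describes as not addressed.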

peterchanws commented 5 years ago

Using Bush small and jeb@jeb.org initially, I got Jeb Bush listed in the correspondent list. However, after adding jeb@bush-brogan-2002 as a trusted email address, Jeb Bush disappeared from the correspondent list. I will continue testing with Fikes.

peterchanws commented 5 years ago

Please see Chinmay's specifications.

chinuhub commented 5 years ago

For every correspondent, there will be 3 columns in total.

Messages sent, Messages received, Messages received from the owner

Messages sent: the number of messages in which this correspondent is the sender.

Messages received: the number of messages in which this correspondent is a receiver (to, cc, bcc).

Messages received from the owner: if the owner's email address is given, the number of messages sent from the owner's addresses in which this correspondent appears as a receiver.
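The three column definitions above can be sketched as follows. This is only an illustration: the `correspondent_counts` function, the dictionary message format, and the `aide@example.org` address are hypothetical. For simplicity it counts per email address, whereas the address book would count per unified contact.

```python
from collections import Counter


def correspondent_counts(messages, owner_addresses=()):
    """Return {address: (sent, received, received_from_owner)}."""
    owner = {a.lower() for a in owner_addresses}
    sent, received, from_owner = Counter(), Counter(), Counter()
    for msg in messages:
        sender = msg["from"].lower()
        sent[sender] += 1
        # A receiver is anyone in to, cc, or bcc; each address counts once per message.
        receivers = {a.lower() for f in ("to", "cc", "bcc") for a in msg.get(f, [])}
        for r in receivers:
            received[r] += 1
            if sender in owner:
                from_owner[r] += 1
    addresses = set(sent) | set(received)
    return {a: (sent[a], received[a], from_owner[a]) for a in addresses}


messages = [
    {"from": "jeb@jeb.org", "to": ["laura.branker@myflorida.com"],
     "cc": ["aide@example.org"]},
    {"from": "laura.branker@myflorida.com", "to": ["jeb@jeb.org"]},
    {"from": "aide@example.org", "to": ["laura.branker@myflorida.com"],
     "bcc": ["jeb@jeb.org"]},
]
counts = correspondent_counts(messages, owner_addresses=["jeb@jeb.org"])
```

If no owner address is given, the third count stays at zero for every correspondent, matching the "if the owner's email address is given" condition.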

peterchanws commented 5 years ago

Ver. Jan 16: The 3 columns should apply to the processing, discovery, and delivery modules as well as appraisal.

Done - Jan 17

peterchanws commented 5 years ago

Add "Archive owner email address" under "More". Although we can add this information in Edit Correspondent, our system (via browser) is not good at editing a big address book.

ADDED - Jan 20 version

peterchanws commented 5 years ago

The system somehow treats some email addresses as correspondent names. Example:

brankel@eog.state.fl.us Laura Branker Branker, Laura laura.branker@myflorida.com

peterchanws commented 5 years ago

It seems "Received messages" counts only messages "To" the designated correspondent/address. It should include "cc/bcc" as well.

peterchanws commented 5 years ago

Need to update Advanced Search:

From: Messages Direction: Incoming; Outgoing; Either

To: Messages Sender: Archive Owner _____

peterchanws commented 5 years ago

Need to update "Information about this archive" from

Messages: 123 Incoming: 234 Outgoing: 345

Images: 234 Documents: 345 Others:456

to Messages: 123 Images: 234 Documents: 345 Others:456

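[image: screen shot 2019-01-17 at 6 15 01 pm] https://user-images.githubusercontent.com/1050899/51364034-72317480-1a8f-11e9-9926-b73b8e9de590.png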

hangal commented 5 years ago

Should we have 2 fields?

Total Messages Messages sent by archive owner



peterchanws commented 5 years ago

Or Messages: 123 Sent by owner: 234 Received by owner: 345

peterchanws commented 5 years ago

Archive: Bush small

When I changed the archive owner from Jeb Bush to Laura Branker, the message counts didn't reflect the change:

Owner Laura Branker: [image: screen shot 2019-01-17 at 10 16 00 pm]

Owner Jeb Bush: [image: screen shot 2019-01-17 at 10 17 16 pm]

chinuhub commented 5 years ago

Did you make sure that the same email address did not appear in more than one contact?

peterchanws commented 5 years ago

@chinuhub Sorry. After further verification, message counts after changes in email owner are correct.

peterchanws commented 5 years ago

feature added