ePADD / epadd

ePADD is a software package developed by Stanford University's Special Collections & University Archives that supports archival processes around the appraisal, ingest, processing, discovery, and delivery of email archives.
https://www.epaddproject.org

Store all headers in index when parsing MBOX file #413

Closed by ajlouie 1 year ago

ajlouie commented 3 years ago

Requirement

Parse the MBOX file into the internal representation, retaining all headers (in an unindexed, unqueryable form) and multi-part bodies.
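
For illustration only, here is a minimal sketch of the idea using Python's standard-library `mailbox` module. It is not ePADD's implementation (ePADD is written in Java). The names `message_to_record` and `parse_mbox` and the shape of the record dict are hypothetical, chosen only to show every header being kept verbatim, with no filtering or indexing, alongside each MIME part.

```python
import mailbox
from email.message import Message


def message_to_record(msg: Message) -> dict:
    """Convert one parsed message into a plain dict (hypothetical internal form)."""
    # msg.items() returns every header in its original order, including
    # repeated ones such as multiple Received lines; nothing is dropped.
    headers = [(name, str(value)) for name, value in msg.items()]

    parts = []
    # walk() visits the message itself and every nested MIME part.
    for part in msg.walk():
        if part.is_multipart():
            continue  # container parts carry no payload of their own
        parts.append({
            "content_type": part.get_content_type(),
            "filename": part.get_filename(),  # None when the part is not an attachment
            "payload": part.get_payload(decode=True),  # bytes, transfer-decoded
        })
    return {"headers": headers, "parts": parts}


def parse_mbox(path: str) -> list:
    """Read every message in an MBOX file into a list of records."""
    # create=False makes a missing file an error instead of creating it.
    box = mailbox.mbox(path, create=False)
    try:
        return [message_to_record(msg) for msg in box]
    finally:
        box.close()
```

Storing the headers as an ordered list of pairs rather than a dict is deliberate in this sketch: a dict would collapse repeated headers (for example, the chain of `Received` lines), which are exactly the kind of provenance data the requirement asks to retain.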

User story

As an archivist, I need to retain all headers available in the original email so that they can be used for provenance and/or legal purposes. See NARA's guide to the significant properties of email to preserve: https://github.com/usnationalarchives/digital-preservation/blob/master/Email_Formats/NARA_PreservationActionPlan_Email_20210525.pdf

Acceptance Criteria

jfarwer commented 1 year ago

This is now implemented.