ePADD is a software package developed by Stanford University's Special Collections & University Archives that supports archival processes around the appraisal, ingest, processing, discovery, and delivery of email archives.
ePADD runs out of memory when we import large Mbox files (a few GB each), even though we allocate 11 GB to the application with java -Xmx11g -jar epadd-standalone.jar. Mbox files are currently read in batches of 10,000 messages. Reducing the batch size to 100 resolves the problem for every Mbox file in our collection (up to 15 GB per file) without a noticeable performance impact. Would reducing the batch size in an ePADD release be an option?
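To illustrate why the batch size matters for heap usage, here is a minimal sketch of batched Mbox reading. It is not ePADD's actual import code: the class MboxBatchSketch and the method processInBatches are hypothetical names, and messages are split naively on lines that start with "From ", which ignores the escaping rules of the various Mbox variants. The point is only that at most batchSize messages are held in memory before they are handed off.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class MboxBatchSketch {

    // Hypothetical helper (not part of ePADD): streams an Mbox and passes
    // messages to `handler` in groups of at most `batchSize`.
    // Returns the total number of messages seen.
    static int processInBatches(BufferedReader reader, int batchSize,
                                Consumer<List<String>> handler) throws IOException {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        List<String> batch = new ArrayList<>();
        StringBuilder current = null; // message being assembled; null before the first "From " line
        int total = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            // Assumption: a line starting with "From " begins a new message.
            if (line.startsWith("From ")) {
                if (current != null) {
                    batch.add(current.toString());
                    total++;
                    if (batch.size() == batchSize) {
                        handler.accept(batch);
                        batch = new ArrayList<>(); // let the old batch become garbage
                    }
                }
                current = new StringBuilder();
            }
            if (current != null) {
                current.append(line).append('\n');
            }
        }
        // Flush the last message and any partially filled batch.
        if (current != null) {
            batch.add(current.toString());
            total++;
        }
        if (!batch.isEmpty()) {
            handler.accept(batch);
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        String mbox = "From a@example.org\nSubject: one\n\nfirst\n"
                    + "From b@example.org\nSubject: two\n\nsecond\n"
                    + "From c@example.org\nSubject: three\n\nthird\n";
        int total = processInBatches(new BufferedReader(new StringReader(mbox)), 2,
                batch -> System.out.println("batch of " + batch.size() + " message(s)"));
        System.out.println("total messages: " + total);
    }
}
```

With this structure, peak memory scales roughly with batchSize times the average message size (plus whatever the handler retains), so a batch of 10,000 messages with large attachments can need far more heap than a batch of 100.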