uchicago-library / attachment-converter

Attachment Converter: tool for batch converting attachments in an email mailbox
GNU General Public License v2.0
Create (first version of) progress bar #52

Closed bufordrat closed 1 year ago

bufordrat commented 1 year ago

Task: create a progress bar for Attachment Converter.

Background

So out of the box, our initial version of Attachment Converter isn't exactly the fastest performing application in the world. Some of our initial tests took about a minute to convert all the emails in an mbox with a total of 10-15 attachments.

Our major bottleneck is that Attachment Converter calls out to external applications to perform its conversions. We haven't done much benchmarking, but one reasonable starting assumption is that LibreOffice, which we are currently using heavily to convert to PDF/A, takes a while to perform each conversion.

There are a couple of "low-hanging fruit" tactics we can use to reduce the runtime of the application on mbox-es that are of a realistic size:

We will continue to explore those options as we work on the project. That said, no matter how many of these approaches to speeding things up we end up adopting, it seems pretty clear that, at least for large mbox-es, we will need to be prepared for a full round of attachment conversions to take a while.

Given the apparent inevitability of some amount of slowness, Attachment Converter will need to send some indication to the user of what is happening.

Progress Bar

Getting the progress bar to output useful information to the user while also outputting its actual data to standard out involves a little finessing of UNIX terminals and file handles.

Before we get into that, let's outline what information should be in the progress bar.

Layout

For this initial version, what we're calling a "progress bar" will just be a printed line of information with something like the following format:

converting <ATTACHMENT-FILENAME> to <ATTACHMENT-BASENAME>.<TARGET-EXTENSION> ...

It should print that line of information just before it begins each conversion, so that if that conversion takes five seconds, the user will see that line of information on the bottom of the screen for five seconds.
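As a rough illustration of the layout above, here is a minimal sketch of building and printing that line in OCaml. The names `progress_message` and `report_progress` are hypothetical, not part of the existing codebase, and the sketch assumes the target extension is passed without a leading dot. Which channel the message should go to is left to the caller.

```ocaml
(* Hypothetical helper: builds the progress line for one conversion.
   Assumes target_ext has no leading dot, e.g. "pdf". *)
let progress_message ~filename ~target_ext =
  (* Strip the current extension to get the attachment's basename. *)
  let base = Filename.remove_extension filename in
  Printf.sprintf "converting %s to %s.%s ..." filename base target_ext

(* Hypothetical helper: writes the progress line to the given channel
   and flushes it (%!), so the line is visible before a slow
   conversion starts. *)
let report_progress oc ~filename ~target_ext =
  Printf.fprintf oc "%s\n%!" (progress_message ~filename ~target_ext)
```

Calling `report_progress stderr ~filename:"report.docx" ~target_ext:"pdf"` just before a conversion would print `converting report.docx to report.pdf ...`. Keeping the string-building separate from the printing makes the format easy to test on its own.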

How to do it

Broadly, we want to:

That approach will display both the output data and the progress bar messages intermixed at the same time, in the terminal. Really, what we want is an either/or situation:

The following UNIX hijinks should give us that result:

The Unix module (from the unix library that ships with OCaml) provides a pretty comprehensive interface to UNIX system calls. You can use Unix.isatty to check whether a file descriptor refers to a tty. Since this takes a Unix file descriptor as an input (rather than an output channel), use Unix.stdout rather than Pervasives.stdout (called Stdlib.stdout in newer versions of OCaml).
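A minimal sketch of the tty check might look like the following. The function names `stdout_is_tty` and `describe_stdout` are made up for illustration, and the sketch assumes the program is linked against the unix library (for example via `(libraries unix)` in a dune file). It only shows how to branch on the result, not what Attachment Converter should do in each case.

```ocaml
(* Requires linking against the unix library. *)

(* True when standard output is attached to a terminal rather than
   redirected to a file or pipe. Note that Unix.stdout is a file
   descriptor, not an out_channel. *)
let stdout_is_tty () = Unix.isatty Unix.stdout

(* Hypothetical helper: describes where standard output is going. *)
let describe_stdout () =
  if stdout_is_tty ()
  then "stdout is a terminal"
  else "stdout is redirected"
```

If you only have a channel in hand, `Unix.descr_of_out_channel` converts an out_channel to the file descriptor that `Unix.isatty` expects.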