uchicago-library / attachment-converter

Attachment Converter: tool for batch converting attachments in an email mailbox
GNU General Public License v2.0

Update cram tests for PDF conversions and images #85

Closed bufordrat closed 6 months ago

bufordrat commented 1 year ago

There are two small outstanding issues with our cram test suite that cause some tests to produce output whose diffs are not human-readable. We would like to tweak those tests a little so that their output diffs cleanly and legibly.

Two quick tweaks to our cram test suite should get us there.

Fix issues with PDF cram tests

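It has recently come to our attention that our converted PDF-As, regardless of whether we use LibreOffice or pdf2archive, contain gensyms that make the data in each PDF-A unique. As a result, two conversions of the same input do not match byte for byte, so a cram test that compares the raw PDF output cannot produce a stable diff.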

To get around this, we propose updating the cram tests for the PDF-A conversions to use pdf2text to produce "comparable" output that wouldn't confuse a cram test diff.
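A minimal sketch of what such a helper might look like, assuming the "pdf2text" mentioned above refers to Poppler's `pdftotext` command-line tool (an assumption; the issue does not say which tool is meant). The function name `pdf_as_text` and the file name `helpers.sh` used below are hypothetical and not part of the repository:

```shell
#!/bin/sh
# Hypothetical helper for cram tests: print the text layer of a PDF to
# stdout, so a test compares extracted text rather than raw PDF bytes.
# Assumes Poppler's `pdftotext` is installed and on PATH.
pdf_as_text() {
    if [ ! -f "$1" ]; then
        echo "pdf_as_text: no such file: $1" >&2
        return 1
    fi
    # Passing "-" as the output file makes pdftotext write to stdout.
    pdftotext "$1" -
}
```

In a `.t` file the helper could be loaded with `. "$TESTDIR"/helpers.sh` (cram sets `TESTDIR` to the directory containing the test file), after which `pdf_as_text converted.pdf` would print text for cram to diff. Whether the extracted text is fully stable across both LibreOffice and pdf2archive conversions would still need to be checked.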

Fix issues with image cram tests

Putting binary image data directly into the .t files means that the diff will display weird, non-printable characters, which is not ideal for eyeballing purposes.

To get around that problem, cram tests for image conversions should base64-encode the image output and insert the resulting base64 text into the .t file.
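One way this could be sketched, assuming a standard `base64` utility is available. The function name `image_as_base64` is hypothetical. Because default line wrapping differs between `base64` implementations, the sketch strips the newlines and re-wraps at a fixed width so the expected output in a `.t` file would not depend on the platform:

```shell
#!/bin/sh
# Hypothetical helper for cram tests: print a binary file as base64
# text, re-wrapped to 64 columns regardless of how the local `base64`
# wraps its output by default.
image_as_base64() {
    if [ ! -f "$1" ]; then
        echo "image_as_base64: no such file: $1" >&2
        return 1
    fi
    base64 < "$1" | tr -d '\n' | fold -w 64
    # fold leaves the last line unterminated; end it so that cram
    # sees a complete final line.
    echo
}
```

A test would then call `image_as_base64` on the converted image and list the expected base64 lines beneath the command. A diff on that output contains only printable characters, though it still shows only that the bytes changed, not what changed visually.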