emory-libraries / dlp-enhance

Display non-image formats #24

Open katfish2 opened 3 months ago

katfish2 commented 3 months ago

Rose Digital Archives would like EDC to be able to display a wider variety of non-image formats, like PDFs.

Because of current file-type limitations, most born-digital content is deemed ineligible for EDC ingest. Even though we could theoretically ingest any file type, it doesn't currently make sense to spend time and storage space on content that won't be visible and usable for end users.

In some cases, LTDS and ARS colleagues have used a workaround for born-digital publications that are acquired as PDFs: splitting the files into page-level TIFFs and ingesting those. However, this workaround takes more time than ingesting the PDFs would, uses extra storage space and energy, creates access copies that are slow to load in Lux, and means what we're preserving and presenting to end users is not the authentic original.

I would like to see a mechanism for user-friendly access to non-image file formats, either within the viewer (for common formats) or as a download link (for less common or more complex formats). This would allow for more effective ingest, preservation, and use of many EUA publications and other born-digital records that are created and received in formats other than page-level image files. It would support access to many new collections that are otherwise strong candidates for EDC, some existing EDC collections for which more recent additions are born-digital PDFs, and some collections currently housed in legacy systems that we would prefer not to maintain long term.

All internal and external users of content for which EDC is not currently optimized would benefit. Staff users would likely save time on content prep and ingest. Use of existing collections and content types would not be affected. Some careful messaging or help documentation might be required if the solution includes a download option in addition to or instead of in-browser display, since download-only works will look and function differently, but confusion could likely be mitigated with clear UI design.

eporter23 commented 3 months ago

Thanks for submitting this. We are going to investigate this as part of the OpenEmory project, which is active right now: the vast majority of that content is PDF, and we would like site users to interact with the content more directly rather than simply seeing a thumbnail. I will let you know as that progresses. If we implement this feature for the new self-deposit application, hopefully we can also bring it to Curate and Lux.

katfish2 commented 2 months ago

That's great to hear! I'll look out for updates as the OpenEmory project proceeds.