sul-dlss-deprecated / universalviewer

The Universal Viewer is a community-developed open source project on a mission to help you share your content with the world
http://universalviewer.io

Add support for Microsoft Office documents #41

Open snydman opened 6 years ago

snydman commented 6 years ago

Similar to Box.com.

tomcrane commented 6 years ago

Is there a list of required formats? I imagine Word, Excel are mandatory, but what about the others?

https://en.wikipedia.org/wiki/Microsoft_Office#Desktop_apps

anarchivist commented 6 years ago

Speaking for the Virtual Tribunals project, we'd need support for Microsoft Word.

tomcrane commented 6 years ago

@snydman @anarchivist

The major attraction of Box as the mechanism for rendering various document formats in the UV is that the development effort for the client is much simpler: the Box viewer (https://github.com/box/viewer.js) could be incorporated into a UV extension and would handle a huge number of formats. Without Box, many independent solutions would be required for different formats, and some formats would be impossible.

However, the Box viewer client library doesn’t deal with documents in their native formats; it renders documents that have already been transformed via the Box API into an intermediate form. You instantiate a viewer and point it at the Box version of the document (https://github.com/box/viewer.js#loading-a-simple-viewer). That requires an integration between the repository and Box, so that the Word document, spreadsheet, etc. is uploaded to Box and converted. In other words, the simple Box implementation in the UV requires your assets to be on Box first.

I've just been taking a look at how Confluence (the wiki we use) renders uploaded Office docs in the browser. It converts them to PDF on the server and then renders the PDF (in the same way the UV currently renders a PDF).

Other lines of attack:

Office web viewer in a browser: https://blogs.office.com/en-us/2013/04/10/office-web-viewer-view-office-documents-in-a-browser/?eu=true (in an iframe, maybe?)

Google docs viewer: https://jsfiddle.net/7xr419yb/embedded/result/ (from https://stackoverflow.com/questions/27957766/how-do-i-render-a-word-document-doc-docx-in-the-browser-using-javascript)

Viewer.js (a different one) - works for PDFs and Open Document Format, but not MS Office docx etc: http://viewerjs.org/examples/
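
For illustration, the two hosted-viewer options above could be embedded roughly as follows. This is a minimal sketch, not UV code: the helper names are hypothetical, and the two endpoint URLs are assumptions based on the commonly used embed patterns for the Office web viewer and the Google Docs viewer. Both services fetch the document themselves, so the document URL has to be publicly reachable.

```typescript
// Hypothetical helpers for building iframe URLs for hosted document viewers.
// The two base URLs are assumptions (commonly used embed endpoints), not part of the UV.
const OFFICE_EMBED_BASE = "https://view.officeapps.live.com/op/embed.aspx";
const GOOGLE_VIEWER_BASE = "https://docs.google.com/gview";

// Office web viewer: the public document URL is passed in the `src` query parameter.
function officeViewerUrl(documentUrl: string): string {
  return OFFICE_EMBED_BASE + "?src=" + encodeURIComponent(documentUrl);
}

// Google Docs viewer: the public document URL is passed in `url`, with `embedded=true`.
function googleViewerUrl(documentUrl: string): string {
  return GOOGLE_VIEWER_BASE + "?url=" + encodeURIComponent(documentUrl) + "&embedded=true";
}

// Escape the characters that matter inside a double-quoted HTML attribute.
function escapeAttr(value: string): string {
  return value.replace(/&/g, "&amp;").replace(/"/g, "&quot;").replace(/</g, "&lt;");
}

// Produce iframe markup that a viewer extension could inject into its content pane.
function iframeMarkup(viewerUrl: string, title: string): string {
  return '<iframe src="' + escapeAttr(viewerUrl) + '" title="' + escapeAttr(title) +
    '" width="100%" height="600"></iframe>';
}
```

A caller would pass the public URL of the Word or Excel file to `officeViewerUrl` or `googleViewerUrl` and hand the result to `iframeMarkup`. Neither option works for assets behind authentication, since the hosted service cannot fetch them.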

anarchivist commented 6 years ago

Hi @tomcrane - I think that's fine; my biggest concern is whether we'll need to transfer the files themselves to Box.com, rather than "just" hit an API with the files. Sending files to an API alone is not a major concern for my project, since all the assets are currently public; I defer to @snydman and others on whether this would be a concern for any resources that might need to be behind auth.

snydman commented 6 years ago

Reading @tomcrane's post a second time, it seems that our docs would need to be on Box, which seems like a non-starter. I am leaning strongly towards the "convert Office docs to PDF" approach, with the native Office docs made available for download in the viewer. We would treat the PDF as a derivative format generated during pre-accessioning, like JP2 creation.
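
As a rough sketch of what the pre-accessioning step could look like: the thread does not name a converter, so LibreOffice's headless mode (`soffice --headless --convert-to pdf`) is assumed here, and the function name, extension list, and paths are all hypothetical. The helper only plans the derivative (where the PDF will land and which command would produce it); actually running the command is left to the accessioning pipeline.

```typescript
// Hypothetical planning helper for generating PDF derivatives of Office files.
// Assumes LibreOffice is the converter; the thread itself does not specify one.
const OFFICE_EXTENSIONS: string[] = ["doc", "docx", "xls", "xlsx", "ppt", "pptx"];

interface DerivativePlan {
  source: string;     // native Office file, kept and offered for download
  pdf: string;        // derivative the viewer would render
  command: string[];  // converter invocation, as an argument list
}

// Returns a plan for Office files, or null for anything else (e.g. images, existing PDFs).
function planPdfDerivative(sourcePath: string, outDir: string): DerivativePlan | null {
  const fileName = sourcePath.split("/").pop() || sourcePath;
  const dot = fileName.lastIndexOf(".");
  if (dot <= 0) {
    return null; // no extension, or a dotfile
  }
  const ext = fileName.slice(dot + 1).toLowerCase();
  if (OFFICE_EXTENSIONS.indexOf(ext) === -1) {
    return null;
  }
  const stem = fileName.slice(0, dot);
  return {
    source: sourcePath,
    // LibreOffice writes <outdir>/<original stem>.pdf
    pdf: outDir + "/" + stem + ".pdf",
    command: ["soffice", "--headless", "--convert-to", "pdf", "--outdir", outDir, sourcePath],
  };
}
```

The viewer would then be pointed at `pdf` for rendering, with `source` exposed as the download link for the native file.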

tomcrane commented 6 years ago

Getting the UV to render Office docs from Box would probably be quite easy but, as you say, it's a non-starter. I think that focusing effort on a really good user experience for PDFs, for which there is established web practice and JavaScript client libraries, would be a much better use of development time.

pinging @edsilv