Open henare opened 11 years ago
See also https://github.com/mysociety/alaveteli/wiki/Improved-document-conversion
Indexability by search engines was originally one reason for doing the conversion ourselves, but Google does index PDFs, so I'm not sure this should really be a deciding factor.
This all started with conversion to text - so the site search could find things in documents.
The "View as HTML" feature was added later on user demand. I think it was mainly for Word documents, where more people lacked a good local viewer. There doesn't seem to be a pdf.js equivalent for Word :( http://stackoverflow.com/questions/14144069/pdf-js-analog-for-word-documents
Anyway, if everyone's browsers / local viewers were good enough, there'd be no need for the "View as HTML" feature at all.
This is linked to a problem where conversion to HTML (the user's preferred format) made a PDF table difficult to read, and the user asked for the table to be converted properly.
pdf.js is pretty awesome, but I wonder what the downsides of doing this might be. Compatibility? Mobile?