NCEAS / morpho

Morpho metadata editor
GNU General Public License v2.0

Add rendered metadata as pdf file in Morpho export #1056

Closed: mbjones closed this issue 6 years ago

mbjones commented 6 years ago

Author Name: Matt Jones (Matt Jones) Original Redmine Issue: 6557, https://projects.ecoinformatics.org/ecoinfo/issues/6557 Original Date: 2014-05-21 Original Assignee: Lauren Walker


Morpho currently exports metadata as both XML and HTML. Users have requested that the rendered metadata also be provided in PDF format. Add stylesheets for rendering as PDF and include the PDF in the export file.

The stylesheets for generating PDF can probably be general enough to be included in the EML package for many to use, and then simply imported and used within Morpho.

See related issue #6053 in Metacat for delivering BagIt packages. Ideally, Morpho's export would produce a BagIt-compatible zip file equivalent to what one gets from Metacat.
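As a rough illustration of what a BagIt-style export zip could contain, here is a minimal sketch using only the Java standard library. It is not Morpho's or Metacat's implementation: the class name `BagItExportSketch`, the helper names, the MD5 manifest, and the `BagIt-Version: 0.97` declaration are all assumptions made for the example, and a real bag would likely also carry tag files such as bag-info.txt.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch: writes a minimal BagIt-style layout into an in-memory zip.
final class BagItExportSketch {

    // Lowercase hex MD5 of the payload bytes, as used in manifest-md5.txt.
    static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    private static void put(ZipOutputStream zip, String name, byte[] content) throws IOException {
        zip.putNextEntry(new ZipEntry(name));
        zip.write(content);
        zip.closeEntry();
    }

    // payload maps a relative file name (e.g. "metadata.pdf") to its bytes.
    static byte[] writeBag(String bagName, Map<String, byte[]> payload) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        StringBuilder manifest = new StringBuilder();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            // Bag declaration; the version number here is an assumption.
            put(zip, bagName + "/bagit.txt",
                "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n"
                    .getBytes(StandardCharsets.UTF_8));
            for (Map.Entry<String, byte[]> e : payload.entrySet()) {
                // Payload files live under data/ and are listed in the manifest.
                String path = "data/" + e.getKey();
                put(zip, bagName + "/" + path, e.getValue());
                manifest.append(md5Hex(e.getValue())).append("  ").append(path).append('\n');
            }
            put(zip, bagName + "/manifest-md5.txt",
                manifest.toString().getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Lists entry names in a zip, handy for checking the layout.
    static List<String> entryNames(byte[] zipBytes) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            for (ZipEntry e = in.getNextEntry(); e != null; e = in.getNextEntry()) {
                names.add(e.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }
}
```

In this sketch the rendered PDF would simply be one more payload entry next to the XML and HTML files, so it is picked up by the manifest the same way.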

mbjones commented 6 years ago

Original Redmine Comment Author Name: ben leinfelder (ben leinfelder) Original Date: 2014-05-22T02:10:15Z


A couple of comments/questions: Using Apache FOP would be nice since we are going from XML to PDF. There's a ton in the FO spec for laying out the document, and the sky is kind of the limit on how we want it to look. Do we want it to look exactly like the existing HTML metadata outputs? Should it be skinnable, with the ability to add header graphics, change fonts, etc.?

mbjones commented 6 years ago

Original Redmine Comment Author Name: ben leinfelder (ben leinfelder) Original Date: 2014-05-22T22:16:17Z


I looked into FOP, but it would take a lot of coding.

I looked at HTML -> PDF options, and there is Flying Saucer, which uses iText. I tried it out and it's very promising. We do need to edit the existing EML XSLTs to produce better "printable" HTML before we convert to PDF, but this is more tractable than writing FO XSLTs from scratch.

To clean up our nasty non-XHTML: http://jtidy.sourceforge.net/howto.html

To generate the PDF from the tidy XHTML: https://today.java.net/pub/a/today/2007/06/26/generating-pdfs-with-flying-saucer-and-itext.html
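To make the "printable HTML from XSLT" step concrete, here is a minimal sketch using the JDK's built-in XSLT support. The stylesheet is a made-up, heavily simplified stand-in (a bare `dataset` root with `title` and `abstract`), not the real EML XSLTs, and the class and method names are hypothetical. The point it illustrates is that `xsl:output method='xml'` yields well-formed markup, which is what an XHTML-based PDF renderer expects as input.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch: transform a tiny metadata document into well-formed XHTML.
final class PrintableHtmlSketch {

    // Simplified stand-in stylesheet; the real EML stylesheets are far larger.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes' indent='no'/>"
      + "<xsl:template match='/dataset'>"
      + "<html><head><title><xsl:value-of select='title'/></title></head>"
      + "<body><h1><xsl:value-of select='title'/></h1>"
      + "<p><xsl:value-of select='abstract'/></p></body></html>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML string and returns the markup.
    static String toXhtml(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String metadata = "<dataset><title>Sample data set</title>"
                        + "<abstract>A short abstract.</abstract></dataset>";
        System.out.println(toXhtml(metadata));
    }
}
```

If the stylesheets already emit well-formed XHTML like this, the JTidy cleanup pass becomes a safety net rather than a requirement.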

mbjones commented 6 years ago

Original Redmine Comment Author Name: ben leinfelder (ben leinfelder) Original Date: 2014-05-22T22:17:30Z


Super simple code to do the transformation:

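// Imports this method needs. DocumentException lives in com.lowagie.text for
// iText 2.x; the package differs if a newer iText is used.
import java.io.*;
import com.lowagie.text.DocumentException;
import org.w3c.tidy.Tidy;
import org.xhtmlrenderer.pdf.ITextRenderer;

public void export(String inputFile, String outputFile) throws IOException, DocumentException {
    String tidyFile = inputFile + ".tidy";

    // Clean the HTML into XHTML with JTidy. Both streams are closed here so
    // the tidy file is fully written before the renderer reads it.
    try (InputStream in = new FileInputStream(inputFile);
         OutputStream tidyOut = new FileOutputStream(tidyFile)) {
        Tidy tidy = new Tidy();
        tidy.setXHTML(true);
        tidy.parse(in, tidyOut);
    }

    String url = new File(tidyFile).toURI().toURL().toString();

    // Render the tidy XHTML to PDF with Flying Saucer.
    try (OutputStream os = new FileOutputStream(outputFile)) {
        ITextRenderer renderer = new ITextRenderer();
        renderer.setDocument(url);
        renderer.layout();
        renderer.createPDF(os);
    }
}

mbjones commented 6 years ago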

Original Redmine Comment Author Name: ben leinfelder (ben leinfelder) Original Date: 2014-05-23T15:29:53Z


I've added a class and corresponding test to the EML project that renders a sample EML file as both HTML and PDF using the default CSS. 'ant runonetest' will allow you to run it (HtmlToPdfTest is the default class to run). The output will be in build/tests/eml-sample.xml.html and build/tests/eml-sample.xml.pdf

mbjones commented 6 years ago

Original Redmine Comment Author Name: ben leinfelder (ben leinfelder) Original Date: 2014-05-23T15:33:02Z


Hoping Lauren can do a bit of work on the layout to make it narrow enough to fit on a page.

mbjones commented 6 years ago

Original Redmine Comment Author Name: Lauren Walker (Lauren Walker) Original Date: 2014-05-27T22:34:04Z


I styled the EML -> HTML output a bit to make it more modern and simple, and made sure that it converts to a PDF without running off the page.

mbjones commented 6 years ago

Original Redmine Comment Author Name: Lauren Walker (Lauren Walker) Original Date: 2014-05-27T22:44:32Z


Attached is a test EML->HTML->PDF output that was generated by running HtmlToPdfTest with 'ant runonetest'.

mbjones commented 6 years ago

Original Redmine Comment Author Name: ben leinfelder (ben leinfelder) Original Date: 2014-11-12T17:25:48Z


Moving to the Morpho release for feature tracking even though it is implemented in the utilities project.