RESTful service that receives JSON and constructs a PDF document at a requested conformance level.
Standard maven build.
To package the jar file:
mvn clean package
To run the application, execute java -jar /path/to/jar/app.jar server /path/to/config.yml, eg.
java -jar target/ms-html-to-pdfa-1.0-SNAPSHOT.jar server src/main/properties/dev.yml
From the IDE, run uk.gov.dwp.pdfa.application.HtmlToPdfApplication with program arguments server path/to/properties.yml (eg. src/main/properties/dev.yml).
NOTE: this application accepts environment variables that are picked up at runtime through the configuration file below (this file is bundled into the container). If https configuration is needed, a modified config.yml must be mounted into the container with the appropriate keystore/truststore locations (see the dropwizard documentation).
server:
  applicationContextPath: ${SERVER_CONTEXT_PATH:-/}
  applicationConnectors:
    - type: ${SERVER_APP_CONNECTOR:-http}
      port: ${SERVER_APP_PORT:-6677}
  adminConnectors:
    - type: ${SERVER_ADMIN_CONNECTOR:-http}
      port: ${SERVER_ADMIN_PORT:-0}
  requestLog:
    type: ${SERVER_REQUEST_LOG_TYPE:-external}
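The `${NAME:-default}` placeholders in the configuration fall back to the value after `:-` when the environment variable is not set. The Python sketch below imitates that lookup so the effective values can be previewed before starting the service. It is an illustration only: the `resolve` helper is hypothetical, is not part of this project, and assumes that a variable set to an empty string still counts as set.

```python
import os
import re

# Matches ${NAME} and ${NAME:-default}; the default may be empty.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def resolve(value, env=None):
    """Replace each placeholder with the environment value or its default.

    Hypothetical helper: it approximates the substitution applied to the
    configuration file and is not the service's own implementation.
    """
    env = os.environ if env is None else env

    def _substitute(match):
        name, default = match.group(1), match.group(2)
        if name in env:
            return env[name]
        if default is not None:
            return default
        raise KeyError(f"{name} is not set and has no default")

    return _PLACEHOLDER.sub(_substitute, value)
```

For example, `resolve("${SERVER_APP_PORT:-6677}", {})` returns `"6677"`, while passing `{"SERVER_APP_PORT": "8080"}` as the environment returns `"8080"`.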
A k6 script is included to satisfy a basic load test. By default, this will target the application running on localhost, via the docker hostname host.docker.internal. This can be altered by passing an optional TARGET_HOST environment variable.
Ensure you have the service running, and execute the test as follows:
# Default target: host.docker.internal
docker run --rm -i --name loadtest \
-v $PWD:/k6 \
loadimpact/k6 run - < ./load-test/test.js
# Custom target (must be accessible from within the k6 container)
docker run --rm -i --name loadtest \
-e TARGET_HOST=some-target:8080 \
-v $PWD:/k6 \
loadimpact/k6 run - < ./load-test/test.js
# Change no. virtual users and duration
docker run --rm -i --name loadtest \
-v $PWD:/k6 \
loadimpact/k6 run --vus 20 --duration 5m - < ./load-test/test.js
Default configuration and criteria for satisfying performance thresholds are bundled in the test scripts themselves.
For configuring the tests in the CI pipeline, refer to the official GitLab documentation or underlying template source.
/generatePdf
POST endpoint receiving the information to build the pdf file
{
  "colour_profile": "base64-encoded-file",
  "font_map": {
    "tahoma": "base64-encoded-file",
    "arial": "base64-encoded-file"
  },
  "page_html": "base64-encoded-html",
  "conformance_level": "PDFA_1_A"
}
colour_profile (optional): the base64 encoded contents of the colour profile file to embed in the pdf. If this value is omitted or null, the default colour profile is applied (src/main/resources/colours/sRBG.icm).
font_map (optional): a map of fonts to embed in the pdf, where each key is the font name and each value is the base64 encoded .ttf file contents. If font_map is missing or null, 2 default fonts are embedded in the document: arial to cover basic fonts and courier to cover monospace requirements.
page_html (mandatory): the base64 encoded html document.
conformance_level (optional): the conformance level for the resulting pdf. Acceptable values for this service are:-
PDF_UA (https://en.wikipedia.org/wiki/PDF/UA)
PDFA_1_A
PDFA_1_B
PDFA_2_A
PDFA_2_B
PDFA_3_A
PDFA_3_B
PDFA_3_U
NONE
The only mandatory parameter is the base64 encoded html. If only the html is passed, a standard colour profile is used, arial (standard) and courier (monospace) are embedded in the pdf, and the conformance level for the pdf is PDF/UA.
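As an illustration of the request body described above, the following Python sketch assembles the JSON from raw file contents. The `build_request` function and its argument names are hypothetical and not part of this service. The field names and the list of conformance levels are taken from this document, and the client-side check of the level is an added convenience rather than documented service behaviour.

```python
import base64
import json

# Acceptable conformance_level values as listed in this document.
CONFORMANCE_LEVELS = {
    "PDF_UA", "PDFA_1_A", "PDFA_1_B", "PDFA_2_A", "PDFA_2_B",
    "PDFA_3_A", "PDFA_3_B", "PDFA_3_U", "NONE",
}


def _b64(data: bytes) -> str:
    """Base64 encode raw bytes and return the result as text."""
    return base64.b64encode(data).decode("ascii")


def build_request(html, fonts=None, colour_profile=None, conformance_level=None):
    """Build the /generatePdf JSON body (hypothetical helper).

    html and colour_profile are raw bytes; fonts maps a font name to the
    raw bytes of its .ttf file. Optional fields are left out when not
    supplied so that the service applies its defaults.
    """
    payload = {"page_html": _b64(html)}
    if fonts:
        payload["font_map"] = {name: _b64(ttf) for name, ttf in fonts.items()}
    if colour_profile is not None:
        payload["colour_profile"] = _b64(colour_profile)
    if conformance_level is not None:
        if conformance_level not in CONFORMANCE_LEVELS:
            raise ValueError(f"unknown conformance level: {conformance_level}")
        payload["conformance_level"] = conformance_level
    return json.dumps(payload)
```

Calling `build_request(html_bytes)` with no other arguments produces a body containing only page_html, which corresponds to the defaults-only case described above.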
Returns:-
For the incoming html there are 2 things to consider.
Fonts: add a <STYLE> element to the <HEAD> of the html and apply it to all items (eg body). The important point is to make sure that all fonts are explicitly specified in the html document.
Images: embed each image in the html as base64 data, in the form <img src="data:image/png;base64,<the-base64-encoded-string-of-the-image>"/>
eg.
<html>
<head>
<style>
pre, code, var {
font-family: 'courier', serif;
}
body {
font-family: 'arial', serif;
}
</style>
</head>
<body>
<h1>hello world</h1>
<img
width="250px" height="250px"
src="data:image/png;base64,<the-base64-encoded-string-of-the-image>"
alt="base64 encoded embedded image"
/>
</body>
</html>
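The two considerations for the incoming html can be prepared on the client side. The Python sketch below is a rough illustration under stated assumptions: `image_tag`, `declared_fonts` and `missing_fonts` are hypothetical helpers, the font scan is a simple regular expression rather than a CSS parser, and treating generic families such as serif as not needing an embedded font is an assumption, not documented service behaviour.

```python
import base64
import re
from html import escape

# Generic CSS families, assumed here not to need an embedded font.
GENERIC_FAMILIES = {"serif", "sans-serif", "monospace", "cursive", "fantasy"}


def image_tag(png_bytes, alt):
    """Return an <img> element carrying the PNG as base64 data."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return f'<img src="data:image/png;base64,{encoded}" alt="{escape(alt, quote=True)}"/>'


def declared_fonts(document):
    """Collect the font names used in font-family declarations."""
    fonts = set()
    pattern = r"font-family\s*:\s*([^;}<>]+)"
    for value in re.findall(pattern, document, flags=re.IGNORECASE):
        for name in value.split(","):
            name = name.strip().strip("'\"").lower()
            if name and name not in GENERIC_FAMILIES:
                fonts.add(name)
    return fonts


def missing_fonts(document, embedded=("arial", "courier")):
    """Return declared fonts that are absent from the embedded font names."""
    return declared_fonts(document) - {name.lower() for name in embedded}
```

Running `missing_fonts` against the html before sending it, with `embedded` set to the keys of the font_map (or left at the two defaults), gives an early indication of fonts that the request does not cover.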
If a font used in the html is not embedded, the conversion fails with an error such as Index: 0, Size: 0 or Index 0 out-of-bounds for length 0. Whilst not very clear, this is because the required font is not present in the list of embedded fonts. All html tags should have an attached font (both normal and monospaced).
/version-info
Endpoint to return a standard JSON document with build information.
project.artifactId
project.version
maven.build.timestamp
example output is:-
{
  "app": {
    "name": "ms-html-to-pdfa",
    "version": "1.6.0",
    "build": "133",
    "build_time": "2019-09-09T09:58:17Z"
  }
}
The following will base64 encode the html file contents, call the service, decode the response and write it to a file on *nix based operating systems (tr strips the line wrapping that GNU base64 adds, and --decode is accepted by both GNU and macOS base64)
curl -m 10 -X POST --data '{"page_html":"'"$(base64 < src/test/resources/successfulHtml.html | tr -d '\n')"'"}' http://localhost:6677/generatePdf | base64 --decode > test.pdf
This example will return the current build information
curl http://localhost:6677/version-info
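For callers that prefer a script to curl, the same round trip can be sketched in Python using only the standard library. This is a minimal sketch, not a supported client: `generate_pdf` and `decode_pdf` are hypothetical names, the URL is the default local address used in the curl examples, and the assumption that the response body is the base64 encoded pdf is inferred from the curl example rather than from a documented response format.

```python
import base64
import json
import urllib.request


def decode_pdf(body: bytes) -> bytes:
    """Decode a base64 response body and check that it looks like a PDF."""
    pdf = base64.b64decode(body)
    if not pdf.startswith(b"%PDF"):
        raise ValueError("response did not decode to a PDF document")
    return pdf


def generate_pdf(html: bytes, url="http://localhost:6677/generatePdf", timeout=10):
    """POST the html (as page_html only) and return the decoded PDF bytes."""
    payload = json.dumps({"page_html": base64.b64encode(html).decode("ascii")})
    request = urllib.request.Request(url, data=payload.encode("utf-8"), method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return decode_pdf(response.read())
```

With the service running locally, `generate_pdf(open("src/test/resources/successfulHtml.html", "rb").read())` returns the PDF bytes, which can then be written to a file.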
For general information about the CI pipeline on this repository please see documentation at: https://confluence.service.dwpcloud.uk/x/_65dCg
Pipeline Invocation
This CI pipeline replaces the Jenkins CI build process for ms-html-to-pdfa.
GitLab CI automatically invokes a pipeline run when you push to a feature branch (this can be prevented by including [skip ci] in your commit message if not required).
When a feature branch is merged into develop, a develop pipeline starts automatically and builds the required artifacts.
For production releases please see the release process documented at: https://confluence.service.dwpcloud.uk/pages/viewpage.action?spaceKey=DHWA&title=SRE
A production release requires a manual pipeline, invoked by an SRE, and is only a release function. Production credentials are required.
localdev Usage
There is no change to the usage of localdev. The GitLab CI build process creates artifacts using the same naming convention as the old (no longer utilised) Jenkins CI build process.
Therefore please continue to use branch-develop or branch-f-* (depending on branch name) for proving any feature changes.
Access
While this repository is open internally for read, no one has write access to it by default. To obtain access, please contact #ask-health-platform in Slack and a member will grant the appropriate level of access.