Enterprise-CMCS / macpro-platform-doc-conversion

macpro-platform-doc-conversion

MACPRO Platform document conversion APIs.

Initial API:

Release

Our product is promoted through branches. Master is merged to val to effect a val release, and val is merged to production to effect a production release. Please use the buttons below to promote/release code to higher environments.

branch      status      release
master      master      release to master
val         val         release to val
production  production  release to production

Architecture

Architecture Diagram

Usage

See master build here

This application is built and deployed via GitHub Actions.

Want to deploy from your Mac?

Requirements

Node - we enforce a specific version of Node, specified in the .nvmrc file. This version matches the Lambda runtime. We recommend managing Node versions using NVM.

Serverless - Get help installing it here: Serverless Getting Started page

Yarn - in order to install dependencies, you need to install yarn.

AWS Account - you'll need an AWS account with appropriate IAM permissions (admin recommended) to deploy this app to AWS.

If you are on a Mac, you should be able to install all the dependencies like so:

# install nvm
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.37.2/install.sh | bash

# select the version specified in .nvmrc
nvm install
nvm use

# install yarn
brew install yarn

If you'd like to test deploying prior to committing, you can deploy to AWS as follows:

./deploy.sh <branch name>

# Quick and dirty test where "test.html" is a valid HTML file that is already 508 compliant.
# First we base64 encode the HTML:
# base64 -i ~/Desktop/test.html -o ~/Desktop/test_b64.html

# Note output will be a little garbled since we're filtering out special chars
# To properly validate the output perform these steps in JS or Python and decode the API response from base64
# API ID will be output from the deploy
curl -F "data=@$HOME/Desktop/test_b64.html" --tlsv1.2 https://<API ID>.execute-api.us-east-1.amazonaws.com/<branch name>/prince | sed 's/^"//; s/"$//' | base64 -d > ~/Desktop/test.pdf

# to clean up
./destroy.sh <branch name>
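
The comments above suggest validating the round trip in JS or Python. A minimal Python sketch of the two base64 steps follows, using only the standard library. It assumes, based on the sed step in the curl pipeline, that the API returns the PDF as a double-quoted (JSON) base64 string. The function names are illustrative and are not part of this repo.

```python
import base64
import json


def encode_html(html: str) -> str:
    """Base64-encode an HTML document for use as the request payload."""
    return base64.b64encode(html.encode("utf-8")).decode("ascii")


def decode_pdf_response(body: str) -> bytes:
    """Decode the API response body into raw PDF bytes.

    Assumption: the body is a JSON string literal (base64 text wrapped in
    double quotes), which is what the sed step in the curl example strips.
    """
    b64_text = json.loads(body)
    return base64.b64decode(b64_text)


if __name__ == "__main__":
    # Simulated round trip; no network call is made here.
    fake_pdf = b"%PDF-1.4 example bytes"
    fake_body = json.dumps(base64.b64encode(fake_pdf).decode("ascii"))
    assert decode_pdf_response(fake_body) == fake_pdf
    print(encode_html('<html lang="en"><body><p>Hello</p></body></html>'))
```

Writing the bytes returned by decode_pdf_response to a file opened in binary mode avoids the garbling noted for the shell pipeline.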

Invocation example #1 (IAM Off)

We start with a simplified example where no authorization is required. To disable authorization, comment out authorizer: aws_iam in services/app-api/serverless.yml and remove the resourcePolicy block. To run the Python example calling your deployed API:

# Setting up a Python virtualenv is beyond the scope of this guide.
# Below assumes Python 3.8 and a pyenv virtual environment dedicated for calling this API
pyenv activate my-prince-virtual-env
pip install -r examples/python/requirements.txt
python examples/python/call_prince.py https://abc123.execute-api.us-east-1.amazonaws.com/master/prince ~/Desktop
508 html being converted to pdf:

<html lang="en">
        <head>
...
        </body>
      </html>

sending request to https://abc123.execute-api.us-east-1.amazonaws.com/master/prince:
<bound method Response.json of <Response [200]>>
508 PDF written to: /Users/jeffreysobchak/Desktop/prince-master.pdf

Invocation example #2 (IAM Authorization) in EC2 or ECS (EC2, Fargate)

By default authorization is handled by IAM via a resource policy on the API Gateway. For more details, consult the readme in services/app-api.

The call_prince_iam.py script uses the AWS Signature Version 4 signing process to sign the request that is submitted to the API. This is a requirement when using IAM authentication. The signing process is complicated, and packages exist to handle it for you. The example uses BotoAWSRequestsAuth for Python.
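
To give a sense of what such a package does internally, here is a minimal sketch of one step of Signature Version 4: deriving the signing key through a chain of HMAC-SHA256 operations. It is not the code used by call_prince_iam.py, it covers only key derivation (a full signer also builds a canonical request and a string to sign), and the secret shown is a placeholder.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    """Return the raw HMAC-SHA256 digest of message under key."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str, service: str) -> bytes:
    """Derive a SigV4 signing key.

    date_stamp is the request date as YYYYMMDD; service is "execute-api"
    when calling an API Gateway endpoint.
    """
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


if __name__ == "__main__":
    # Placeholder secret for illustration; never hard-code real credentials.
    key = derive_signing_key("EXAMPLE-SECRET", "20240101", "us-east-1", "execute-api")
    print(key.hex())
```

Because the key depends on the date, region, and service, a signature produced for one day or one service is not valid for another, which is part of why a maintained package is the safer choice.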

For this example, the IAM role attached to the ec2 instance is:

arn:aws:iam::<AWS ACCOUNT NO>:role/delegatedadmin/developer/This-Is-An-IAM-Role-for-EC2

The above IAM role would need to be placed in the SSM StringList parameter for allowed ARNs that can invoke the API, ensuring the IAM role gets added to the API Gateway resource policy:

/configuration/my-branch-name/macpro-platform-doc-conversion/iam/invoke-arns
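
An SSM StringList value is a single comma-separated string, so adding a role means appending to that string without duplicating entries. The helper below is a hypothetical sketch of that step only; the function name and the account number are illustrative and do not come from this repo.

```python
def add_invoke_arn(current_value: str, role_arn: str) -> str:
    """Return a StringList value with role_arn appended if not already present."""
    # Split on commas and drop empty entries and stray whitespace.
    arns = [item.strip() for item in current_value.split(",") if item.strip()]
    if role_arn not in arns:
        arns.append(role_arn)
    return ",".join(arns)


if __name__ == "__main__":
    existing = "arn:aws:iam::123456789012:role/existing-role"  # placeholder account
    new_role = "arn:aws:iam::123456789012:role/delegatedadmin/developer/This-Is-An-IAM-Role-for-EC2"
    print(add_invoke_arn(existing, new_role))
```

The resulting string could then be written back to the parameter, for example with boto3's ssm.put_parameter using Type="StringList" and Overwrite=True.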

Additionally, the IAM role needs "execute-api:Invoke" on "Resource": "arn:aws:execute-api:us-east-1:*".
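
One way to express that identity permission is a policy document like the one this sketch prints. The action and resource pattern are copied from the text above; the function name is illustrative, and you may want to scope the resource to a specific API rather than use the wildcard.

```python
import json


def build_invoke_policy(resource: str = "arn:aws:execute-api:us-east-1:*") -> dict:
    """Build an IAM identity policy allowing execute-api:Invoke on resource."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "execute-api:Invoke",
                "Resource": resource,
            }
        ],
    }


if __name__ == "__main__":
    # Print the policy as JSON, ready to paste into an IAM policy editor.
    print(json.dumps(build_invoke_policy(), indent=2))
```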

NOTE: for ECS, the IAM role that has the above execute-api permissions needs to be assigned to the Task role, NOT the Task execution role. The latter is used to pull container images and publish container logs, while the former is the role your invoking code uses. These two roles are named confusingly by AWS.

IAM authentication and resource policy covers the why in extensive detail. TL;DR both the identity policy (invoker) and resource policy (API Gateway) matter.

# Connect to ec2 instance via SSM
# Below assumes python3 and pip3 installed already or .debug ec2 instance being used
sh-4.2$ cd ~
sh-4.2$ git clone https://github.com/Enterprise-CMCS/macpro-platform-doc-conversion
sh-4.2$ cd macpro-platform-doc-conversion
sh-4.2$ git checkout -b my-branch-name
sh-4.2$ pip3 install -r examples/python/requirements.txt
sh-4.2$ python3 examples/python/call_prince_iam.py https://<API ID>.execute-api.us-east-1.amazonaws.com/<STAGE NAME>/prince ~
508 html being converted to pdf:

<html lang="en">
        <head>
          <title>APS print page</title>
        </head>
        <body>
          <img
            alt="SC state logo"
            src="https://i.pinimg.com/originals/c4/52/04/c4520440b727695b5aca89e7afa2e7e3.jpg"
            width="50"
          />
          <p style={{ "border-top": "1px solid black" }}>&nbsp;</p>
          <h1>Amendment to Planned Settlement (APS)</h1>
          <p>&nbsp;</p>
          <p>APD-ID: ND-0001</p>
          <p>Submitter: Jeffrey Sobchak</p>
          <p>Submitter Email: jeffrey.sobchak@gmail.com</p>
          <p>Urgent?: false</p>
          <p>Comments:</p>
        </body>
      </html>

sending request to https://abc123.execute-api.us-east-1.amazonaws.com/my-branch-name/prince:
<bound method Response.json of <Response [200]>>
508 PDF written to: /home/ssm-user/my-branch-name.pdf
sh-4.2$

Invocation example #3 (IAM Authorization) in a Lambda Function

In this example, all permission requirements and SSM setup for the invoker remain the same as in example #2. What changes is the invocation code. For Lambda, we will work with examples/python/lambda_handler.py.

This example differs from the previous one in that it reads an HTML input file from an S3 bucket and writes the output to an S3 bucket, so the IAM role for the Lambda function also needs read and write permissions on those buckets.

AWS's instructions for packaging and deploying a Python Lambda function can be used to deploy this example handler. The dependencies that need to be included are listed in examples/python/requirements.txt.

Example test input for the Lambda function (can be tested in the AWS console):

{
  "api_endpoint": "https://abc123.execute-api.us-east-1.amazonaws.com/my-branch-name/prince",
  "input_bucket": "my-test-bucket-name",
  "input_file": "test.html",
  "output_location": "my-test-bucket-name"
}
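
Before invoking the function, it can help to check that a test event carries all four keys shown above. The validator below is a hypothetical sketch and is not taken from examples/python/lambda_handler.py; only the key names come from the sample event.

```python
REQUIRED_KEYS = ("api_endpoint", "input_bucket", "input_file", "output_location")


def validate_event(event: dict) -> dict:
    """Return the required fields of a test event, or raise ValueError."""
    missing = [key for key in REQUIRED_KEYS if not event.get(key)]
    if missing:
        raise ValueError("missing event keys: " + ", ".join(missing))
    if not event["api_endpoint"].startswith("https://"):
        raise ValueError("api_endpoint must be an https URL")
    return {key: event[key] for key in REQUIRED_KEYS}


if __name__ == "__main__":
    sample = {
        "api_endpoint": "https://abc123.execute-api.us-east-1.amazonaws.com/my-branch-name/prince",
        "input_bucket": "my-test-bucket-name",
        "input_file": "test.html",
        "output_location": "my-test-bucket-name",
    }
    print(validate_event(sample))
```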

Fonts

This API supports the Open Sans and DejaVu Sans fonts natively. DejaVu Sans is primarily included to ensure the Ballot Box and Checked Ballot Box characters are available. If you wish to call this API with another font, it can be imported via URL. An example can be found in the test HTML data. Alternatively, you can request that the MACPRO Platform team load the font into the API. Importing via URL is preferred, as Lambda package ZIPs have size limits.
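
As a rough illustration of importing a font via URL, the sketch below builds an HTML document that declares a font with a CSS @font-face rule and falls back to Open Sans. This is an assumption about one workable approach, not a copy of the repo's test HTML data; the font name and URL are placeholders.

```python
FONT_FAMILY = "Example Sans"  # hypothetical font name
FONT_URL = "https://example.com/fonts/ExampleSans.ttf"  # hypothetical URL


def html_with_font(body_html: str, font_family: str, font_url: str) -> str:
    """Wrap body_html in a document that loads font_family from font_url."""
    return (
        '<html lang="en">\n'
        "<head>\n"
        "<title>Font import example</title>\n"
        "<style>\n"
        f'@font-face {{ font-family: "{font_family}"; src: url("{font_url}"); }}\n'
        f'body {{ font-family: "{font_family}", "Open Sans", sans-serif; }}\n'
        "</style>\n"
        "</head>\n"
        f"<body>{body_html}</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    print(html_with_font("<p>Hello</p>", FONT_FAMILY, FONT_URL))
```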

Contributing / To-Do

See current open issues or check out the project board.

Please feel free to open new issues for defects or enhancements.

To contribute:

Pull requests are being accepted.

License

See LICENSE for full details.

As a work of the United States Government, this project is
in the public domain within the United States.

Additionally, we waive copyright and related rights in the
work worldwide through the CC0 1.0 Universal public domain dedication.

Slack channel

To enable Slack integration, set a value for SLACK_WEBHOOK_URL as a GitHub Actions secret.

To set the SLACK_WEBHOOK_URL: