bcgov / jrcc-document-access-libs

GNU General Public License v3.0

jrcc-document-access-libs

This library provides a service to exchange documents between microservices.

jrcc document access spring boot starter

This module provides a Spring Boot starter for the document access library, using Redis.

Usage

Step 1

Add jrcc-access-spring-boot-starter to your project (See jrcc-access-spring-boot-sample-app pom.xml as an example)

<dependency>
    <groupId>ca.bc.gov.open</groupId>
    <artifactId>jrcc-access-spring-boot-starter</artifactId>
    <version>1.1.0</version>
</dependency>

Step 2

Add settings to the application.properties or application.yml file.

Use the following settings to configure the logging level and the log message pattern.
 logging.level.root = INFO  
 logging.level.ca.gov.bc = DEBUG
 logging.pattern.console = "%d{yyyy-MM-dd HH:mm:ss} [%thread] %-5level %logger{36} %X{transaction.filename} %X{transaction.id}- %msg%n"

The logging.pattern.console property only works with the Logback logging implementation (the default). The pattern you specify must also follow the Logback layout rules.

%X{transaction.filename} is the logging MDC key for the transaction file name supported in jrcc-document-access-libs. %X{transaction.id} is the logging MDC key for the transaction id supported in jrcc-document-access-libs.

Configure the input and output plugins using the plugin configuration guide below.

Plugins

Common Options

| name | type | required |
| --- | --- | --- |
| bcgov.access.input.document-type | String | No |
| bcgov.access.input.plugin | String | Yes |
| bcgov.access.input.sender | String | No |

bcgov.access.input.document-type

Sets the document type to be manipulated

bcgov.access.input.plugin

Sets the plugin type

bcgov.access.input.sender

Sets the sender of the request

Input Plugins

You can configure the document input using the bcgov.access.input.plugin property.

Console Input Plugin

Description

Reads documents from the standard input. Each document is assumed to be one line.

Configuration

bcgov.access.input.plugin=console

Input Configuration Options

There are no special configuration options for this plugin, but it does support the Common Options.

Http Input Plugin

Description

Using this input you can receive a single document over http(s). For more details have a look at the document API.

Setup

bcgov.access.input.plugin=http

Make sure "spring.main.web-application-type: none" (which disables the embedded web server and therefore all HTTP operations) is not present in the application.properties or application.yml file.

You can configure the web server using standard Spring configuration. Documents sent to the API are handled by the default documentReadyHandler.

Configuration Options

There are no special configuration options for this plugin, but it does support the Common Options and Spring Boot's standard embedded server configuration (ServerProperties); see Spring Boot's Common Application Properties.

Example to run the service on port 5050:

server.port=5050

RabbitMq Input Plugin

Description

Using this plugin you can receive JSON messages from a specified RabbitMQ queue (in our program, it is test-doc.0s.x0.q). To publish a message, open the RabbitMQ management console (accessible on port 15672), navigate to the Queues tab, and scroll down to the Publish message section.

The message payload should look like the following:

{
    "transactionInfo": {
        "fileName": "filename.txt",
        "sender": "unknown",
        "receivedOn": "2019-08-21T22:20:45.173"
    },
    "documentInfo": {
        "type": "crown-counsel-report"
    },
    "documentStorageProperties": {
        "key": "455591e2-6753-4a7f-9438-bedd52327b52",
        "digest": "311743F3D8EC271CA2BB23936C7392F5"
    }
}

Set the following property on the published message: content_type = application/json.

Note

Make sure this is set as a message property and not as a header.

The library then tries to fetch the document content from Redis storage, using the key and MD5 digest given in the key and digest fields of documentStorageProperties.
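As an illustration of the digest field, the sketch below computes an MD5 digest in the same 32-character upper-case hex form as the sample payload. It assumes the digest is the MD5 of the stored document content; DigestSketch and its methods are hypothetical helpers, not part of jrcc-document-access-libs.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper; not part of jrcc-document-access-libs.
final class DigestSketch {

    // Returns the MD5 digest of the given bytes as 32 upper-case hex characters.
    static String md5Hex(byte[] content) {
        try {
            byte[] hash = MessageDigest.getInstance("MD5").digest(content);
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02X", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    // Assumption: the "digest" field holds the MD5 of the stored content.
    static boolean matches(byte[] content, String expectedDigest) {
        return md5Hex(content).equalsIgnoreCase(expectedDigest);
    }

    public static void main(String[] args) {
        byte[] content = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(md5Hex(content));
    }
}
```

A check like matches(...) is one way a consumer could detect that the content read from Redis is not the content that was announced in the message.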

Setup

bcgov.access.input.plugin=rabbitmq

Configuration Options

It supports the Common Options and the following options:

| name | type | required |
| --- | --- | --- |
| bcgov.access.input.rabbitmq.retryDelay | Int | No |
| bcgov.access.input.rabbitmq.retryCount | Int | No |

bcgov.access.input.rabbitmq.retryDelay

Sets the delay in seconds between retries when the service is failing to process the message and throwing application known errors.

bcgov.access.input.rabbitmq.retryCount

Sets the maximum number of attempts to reprocess a message in the queue.
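The interplay of a retry delay and a retry count can be sketched in plain Java as below. This is only an in-process illustration: the README does not describe how the plugin implements retries, RetrySketch is a hypothetical name, and the count is treated here as the total number of attempts, which is an assumption.

```java
import java.util.function.Supplier;

// Hypothetical helper; not part of jrcc-document-access-libs.
final class RetrySketch {

    // Runs the task until it succeeds or maxAttempts is reached,
    // pausing delaySeconds between failed attempts.
    static <T> T callWithRetry(Supplier<T> task, int maxAttempts, long delaySeconds) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.get();
            } catch (RuntimeException e) {
                lastFailure = e;
                if (attempt < maxAttempts) {
                    sleepSeconds(delaySeconds);
                }
            }
        }
        throw lastFailure; // every attempt failed
    }

    private static void sleepSeconds(long seconds) {
        try {
            Thread.sleep(seconds * 1000L);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting to retry", e);
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("not yet");
            }
            return "processed";
        }, 3, 1);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```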

Sftp Input Plugin

Description

Using this plugin you can receive a message from a specified SFTP server whenever a new file appears. The plugin receives each file only once: it uses the server's file timestamp to detect whether the file has already been processed. It requires a Redis data structure store as its metadata store.
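The "receive once" behaviour can be pictured with the minimal sketch below. It uses an in-memory map as a stand-in for the Redis metadata store, and it assumes that a file whose timestamp changes is treated as new; SeenFilesSketch is a hypothetical name and not the plugin's actual implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for the Redis-backed metadata store.
final class SeenFilesSketch {

    // Remote path -> last-modified timestamp already handed to the application.
    private final Map<String, Long> metadata = new ConcurrentHashMap<>();

    // Returns true only when this path has not been seen with this timestamp before.
    boolean accept(String remotePath, long lastModified) {
        Long previous = metadata.put(remotePath, lastModified);
        return previous == null || previous != lastModified;
    }

    public static void main(String[] args) {
        SeenFilesSketch seen = new SeenFilesSketch();
        System.out.println(seen.accept("/upload/a.txt", 1000L)); // first sighting
        System.out.println(seen.accept("/upload/a.txt", 1000L)); // same timestamp again
    }
}
```

Keeping this state in Redis rather than in memory is what lets the "already processed" decision survive an application restart.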

Setup

bcgov.access.input.plugin=sftp

Configuration Options

It supports the Common Options and the following options:

| name | type | required |
| --- | --- | --- |
| bcgov.access.input.sftp.host | String | No |
| bcgov.access.input.sftp.port | Int | No |
| bcgov.access.input.sftp.username | String | Yes |
| bcgov.access.input.sftp.password | String | Yes |
| bcgov.access.input.sftp.remote-directory | String | Yes |
| bcgov.access.input.sftp.filter-pattern | String | No |
| bcgov.access.input.sftp.cron | String | Yes |
| bcgov.access.input.sftp.max-message-per-poll | String | No |
| bcgov.access.input.sftp.ssh-private-key | Resource | No |
| bcgov.access.input.sftp.ssh-private-passphrase | String | No |
| bcgov.access.input.sftp.allow-unknown-keys | boolean | No |
| bcgov.access.input.sftp.known-host-file | String | Yes (if allow-unknown-keys is false) |
| bcgov.access.input.sftp.server-alive-interval | String | No |
| bcgov.access.input.sftp.caching-session-wait-timeout | Int | No |
| bcgov.access.input.sftp.caching-session-max-pool-size | Int | No |

bcgov.access.input.sftp.host

Sets the sftp server host

bcgov.access.input.sftp.port

Sets the sftp server port

bcgov.access.input.sftp.username

Sets the sftp server username

bcgov.access.input.sftp.password

Sets the sftp server password

bcgov.access.input.sftp.remote-directory

Sets the sftp server remote directory.

bcgov.access.input.sftp.filter-pattern

Sets a regular expression used to filter the list of remote files.
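To get a feel for what a filter pattern selects, the sketch below applies a regular expression to a list of file names with java.util.regex. It assumes the pattern must match the whole file name; FilterPatternSketch is a hypothetical helper and the README does not state how the plugin applies the pattern.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper; not part of jrcc-document-access-libs.
final class FilterPatternSketch {

    // Keeps only the file names that fully match the given regular expression.
    static List<String> filter(List<String> fileNames, String filterPattern) {
        Pattern pattern = Pattern.compile(filterPattern);
        return fileNames.stream()
                .filter(name -> pattern.matcher(name).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> remote = Arrays.asList("report.xml", "notes.txt", "summary.xml");
        // ".*\\.xml" keeps names ending in ".xml"
        System.out.println(filter(remote, ".*\\.xml"));
    }
}
```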

bcgov.access.input.sftp.cron

Sets a cron expression with 6 fields, for example 0/5 * * * * *.
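A 6-field cron expression can be read field by field as in the sketch below. The field order (second, minute, hour, day of month, month, day of week) is the one used by Spring's scheduling support, which is assumed here to be what the plugin relies on; CronFieldsSketch is a hypothetical helper that only labels the fields and does not evaluate the schedule.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; labels the fields of a 6-field cron expression.
final class CronFieldsSketch {

    // Assumed field order (Spring-style, seconds first).
    private static final String[] FIELD_NAMES = {
        "second", "minute", "hour", "day of month", "month", "day of week"
    };

    static Map<String, String> describe(String cron) {
        String[] fields = cron.trim().split("\\s+");
        if (fields.length != FIELD_NAMES.length) {
            throw new IllegalArgumentException(
                    "Expected 6 fields but found " + fields.length + ": " + cron);
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < fields.length; i++) {
            result.put(FIELD_NAMES[i], fields[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        // "0/5" in the seconds field means: starting at second 0, every 5 seconds.
        describe("0/5 * * * * *").forEach((name, value) ->
                System.out.println(name + " = " + value));
    }
}
```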

bcgov.access.input.sftp.max-message-per-poll

Sets the maximum number of messages per poll.

bcgov.access.input.sftp.ssh-private-key

Sets the location of the private key.

bcgov.access.input.sftp.ssh-private-passphrase

Sets the passphrase for the private key.

bcgov.access.input.sftp.allow-unknown-keys

When no UserInfo has been provided, set to true to unconditionally allow connecting to an unknown host or to a host whose key has changed (see known-host-file).

bcgov.access.input.sftp.known-host-file

Specifies the filename that will be used for a host key repository. The file has the same format as OpenSSH's known_hosts file. If allow-unknown-keys is false, this property must be set correctly; otherwise a KnownHostFileNotDefinedException or KnownHostFileNotFoundException will be thrown. If allow-unknown-keys is true, this property is ignored.
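The rule above can be summarised as a small validation routine. The sketch below is an assumption about the shape of that check, not the library's code: the two exception classes are local stand-ins that only reuse the names mentioned above, and KnownHostsCheckSketch is hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical sketch of the known-host-file rule; not the library's implementation.
final class KnownHostsCheckSketch {

    // Local stand-ins for the exceptions named above; the library's classes may differ.
    static class KnownHostFileNotDefinedException extends RuntimeException {
        KnownHostFileNotDefinedException(String message) { super(message); }
    }

    static class KnownHostFileNotFoundException extends RuntimeException {
        KnownHostFileNotFoundException(String message) { super(message); }
    }

    static void validate(boolean allowUnknownKeys, String knownHostFile) {
        if (allowUnknownKeys) {
            return; // known-host-file is ignored in this mode
        }
        if (knownHostFile == null || knownHostFile.trim().isEmpty()) {
            throw new KnownHostFileNotDefinedException("known-host-file is not set");
        }
        if (!Files.isRegularFile(Paths.get(knownHostFile))) {
            throw new KnownHostFileNotFoundException("No such file: " + knownHostFile);
        }
    }

    public static void main(String[] args) {
        validate(true, null); // passes: unknown keys are allowed
        System.out.println("allow-unknown-keys=true needs no known_hosts file");
    }
}
```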

bcgov.access.input.sftp.server-alive-interval

Sets the timeout interval (in milliseconds) before a server-alive message is sent, in case no message is received from the server.

bcgov.access.input.sftp.caching-session-wait-timeout

Sets the limit of how long to wait for a session to become available.

bcgov.access.input.sftp.caching-session-max-pool-size

Modify the target session pool size; the actual pool size will adjust up/down to this size as and when sessions are requested or retrieved.

Output Plugins

You can configure the document output using the bcgov.access.output.plugin property.

Console Output Plugin

Description

A simple output which prints document information to STDOUT. The console output is mostly used when testing the application configuration.

Setup

bcgov.access.output.plugin=console

Configuration Options

It supports the Common Options and the following options:

| name | type | required |
| --- | --- | --- |
| bcgov.access.output.console.format | String | No |

bcgov.access.output.console.format

When set to default, the output is truncated to 100 characters. When set to xml, the plugin tries to prettify the XML document and otherwise returns the content of the document as is.

bcgov.access.output.console.format=xml
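The two formats can be approximated with the JDK alone, as in the sketch below. It is an illustration under assumptions, not the plugin's code: ConsoleFormatSketch is a hypothetical name, and the fallback to the unmodified content when the input is not well-formed XML is one reading of the description above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.XMLConstants;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch of the two console formats; not the plugin's implementation.
final class ConsoleFormatSketch {

    private static final int DEFAULT_MAX_LENGTH = 100;

    static String format(String content, String format) {
        if ("xml".equalsIgnoreCase(format)) {
            return prettifyOrOriginal(content);
        }
        // default: keep at most the first 100 characters
        return content.length() <= DEFAULT_MAX_LENGTH
                ? content
                : content.substring(0, DEFAULT_MAX_LENGTH);
    }

    // Indents the XML; returns the input unchanged if it cannot be parsed.
    static String prettifyOrOriginal(String content) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            Transformer transformer = factory.newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(content)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException | RuntimeException e) {
            return content;
        }
    }

    public static void main(String[] args) {
        System.out.println(format("<a><b>x</b></a>", "xml"));
    }
}
```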

RabbitMq Output Plugin

Description

Pushes documents to a RabbitMQ exchange and stores the document in the Redis cache.

Setup

bcgov.access.output.plugin=rabbitmq

Configuration Options

It supports the Common Options and the following options:

| name | type | required |
| --- | --- | --- |
| bcgov.access.output.rabbitmq.ttl | Int | No |

bcgov.access.output.rabbitmq.ttl

Sets the time to live for the document in the temporary storage (expressed in hours)


Processor

You can register a processor to transform the content of the message.

To register a processor do the following:

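Create a new Spring component that implements DocumentProcessor:

import java.util.Locale;

import org.springframework.stereotype.Component;
// DocumentProcessor and TransactionInfo are provided by jrcc-document-access-libs;
// import them from the library's packages.

// Example processor that upper-cases every incoming document.
@Component
public class UpperCaseProcessor implements DocumentProcessor {

    @Override
    public String processDocument(String content, TransactionInfo transactionInfo) {
        return content.toUpperCase(Locale.CANADA);
    }
}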

When registered, the processor will act on the input document content. For example, in the case shown above all input content will be converted to upper case.
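Because a processor is a plain method from content to content, it can be exercised without starting Spring. The sketch below shows the idea with local stand-ins: the nested DocumentProcessor and TransactionInfo types only mirror the signature shown above and are assumptions, since the library's real classes are not reproduced in this README.

```java
import java.util.Locale;

// Hypothetical, Spring-free sketch for trying out a processor in isolation.
final class ProcessorSketch {

    // Stand-in for the library's TransactionInfo (assumed shape).
    static final class TransactionInfo {
        final String fileName;

        TransactionInfo(String fileName) {
            this.fileName = fileName;
        }
    }

    // Stand-in mirroring the processDocument signature shown above.
    interface DocumentProcessor {
        String processDocument(String content, TransactionInfo transactionInfo);
    }

    // Same behaviour as the UpperCaseProcessor above, minus the Spring annotation.
    static final class UpperCaseProcessor implements DocumentProcessor {
        @Override
        public String processDocument(String content, TransactionInfo transactionInfo) {
            return content.toUpperCase(Locale.CANADA);
        }
    }

    public static void main(String[] args) {
        DocumentProcessor processor = new UpperCaseProcessor();
        String out = processor.processDocument("hello", new TransactionInfo("filename.txt"));
        System.out.println(out); // prints HELLO
    }
}
```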

References

Sample App

The sample app is a demo that shows the usage of jrcc-access-spring-boot-starter

Prerequisites

Running this application requires Apache Maven, which in turn depends on Java. As a result, both Java and Apache Maven need to be installed.

Notes

Installation Steps

Install jrcc-access-libs

Run the following command: mvn clean install

Run the sample

mvn clean install -P sample-app
mvn spring-boot:run -f jrcc-access-spring-boot-sample-app/pom.xml

This app is configured to receive documents using the http plugin, as in the following application.yml:

logging:
  level:
    ca:
      gov:
        bc: DEBUG
bcgov:
  access:
    input:
      sender: bcgov
      document-type: test-doc
      plugin: http
    output:
      document-type: test-doc
      plugin: console

You can use this Postman collection to interact with the server.

For the body, select form-data, enter "file" as the key, and select a file as the value. Set the HTTP header to Content-Type: multipart/form-data.
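For readers who prefer code to Postman, the sketch below builds the kind of multipart/form-data body that such a request carries, with a single part named "file". It stops at building the body because the endpoint path is not stated in this README; MultipartBodySketch, the boundary value, and the text/plain part type are all assumptions for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper that builds a multipart/form-data body with one "file" part.
final class MultipartBodySketch {

    private static final String CRLF = "\r\n";

    // Value for the Content-Type request header; the boundary must match the body.
    static String contentType(String boundary) {
        return "multipart/form-data; boundary=" + boundary;
    }

    static String body(String boundary, String fileName, String content) {
        return "--" + boundary + CRLF
                + "Content-Disposition: form-data; name=\"file\"; filename=\"" + fileName + "\"" + CRLF
                + "Content-Type: text/plain" + CRLF
                + CRLF
                + content + CRLF
                + "--" + boundary + "--" + CRLF;
    }

    public static void main(String[] args) {
        String boundary = "sketch-boundary"; // arbitrary; must not occur in the content
        String body = body(boundary, "filename.txt", "hello document");
        System.out.println(contentType(boundary));
        System.out.println(body.getBytes(StandardCharsets.UTF_8).length + " bytes");
    }
}
```

Postman generates the boundary automatically, which is why only Content-Type: multipart/form-data needs to be chosen there.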


If you want to run the sample app using Redis and RabbitMQ, do the following:

Create a redis container

docker run --name some-redis -p 6379:6379 -d redis

Create a rabbit container

docker run -d --hostname some-rabbit --name some-rabbit -p 15672:15672 -p 5672:5672 rabbitmq:3-management

Update the application.yml:

bcgov:
  access:
    input:
      plugin: http
      sender: bcgov
    output:
      document-type: test-doc
      plugin: rabbitmq
      rabbitmq:
        ttl: 1
logging:
  level:
    ca:
      gov:
        bc: DEBUG

To view the messages in a queue, log in to the RabbitMQ management console with the default guest/guest credentials and create a binding to the document.ready exchange using the test-doc routing key.


If you want to run the sample app using sftp do the following:

step 0. Create a redis container

docker run --name some-redis -p 6379:6379 -d redis

step 1. Create an SFTP server container (from Windows PowerShell or Git Bash)

docker run -p 22:22 -d atmoz/sftp myname:pass:::upload

step 2. The user "myname" with password "pass" can now log in over SFTP and upload files to a folder called "upload". The container's port 22 is forwarded to the host's port 22.

step 3. Use an SFTP client application (such as FileZilla, WinSCP, or Core FTP) to connect to the server (use the SFTP protocol, host localhost, port 22).

step 4. If you do not want to unconditionally allow connections to an unknown host, or to a host whose key has changed, you need to provide a known_hosts file. Use the following command to generate a known_hosts file for the running SFTP server (from Windows PowerShell or Git Bash).

ssh-keyscan -v -p 22 localhost>>known_hosts

step 5. Update the application.yml

spring:
  main:
    web-application-type: none
logging:
  level:
    ca:
      gov:
        bc: DEBUG
  pattern:
    console: "%d{yyyy-MM-dd HH:mm:ss} [%thread] %-5level %logger{36} %X{transaction.filename} - %msg%n"
    file: "%d{yyyy-MM-dd HH:mm:ss} [%thread] %-5level %logger{36} %X{transaction.filename} - %msg%n"
bcgov:
  access:
    input:
      sender: bcgov
      document-type: test-doc
      plugin: sftp
      sftp:
        host: localhost
        port: 22
        username: myname
        password: pass
        remote-directory: /upload
        max-message-per-poll: 5
        cron: 0/5 * * * * *
        allow-unknown-keys: false
        known-host-file: C:\Users\user\.ssh\known_hosts
    output:
      document-type: test-doc
      plugin: console

Then start the sample application and use the SFTP client to drag a file from your local file system to the remote upload folder. The sample application should process the file and print it to the console.

Release

To create a new release, run the following on the develop branch:

mvn versions:set -DartifactId=*  -DgroupId=*

It will prompt you for the new version.

Then open a pull request against dev.

Contributing to the repository

To contribute to the repo, please fork the repository and submit a pull request to the team. We will review it accordingly.

This project has been configured to use git hooks, specifically the pre-commit hook. On every commit, the mvn clean install command is executed, which runs all tests and verifies that the code compiles and builds successfully. This helps catch problems before they are committed.

To set up the hooks to work, please run the following command from the root of the repo:

bash ./scripts/install-hooks.bash

Note: Make sure the git cmd folder is added to the PATH environment variable. For GitKraken users, make sure "Path to sh.exe" is set in GitKraken (File => Preference => General => Path to sh.exe).