Probesys / lotemplate

LOTemplate is document generator used to create documents programatically (ODT, DOCX, PDF) from a template (DOCX or ODT) and a json file.
GNU Affero General Public License v3.0
23 stars 1 forks source link
document-generator libreoffice text-document

LOTemplate (for Libre Office Template)

Unittest

LOTemplate is document generator used to create documents programatically (ODT, DOCX, PDF) from a template and a json file.

flowchart LR
    template["Template<br/>(DOCX or ODT)"]
    json["Data<br/>(JSON)"]
    lotemplate["LO Template<br/>(accessible by API or CLI)"]
    generatedFile["Generated File<br/>(PDF, DOCX, ODT, RTF,...)"]

    template --> lotemplate
    json --> lotemplate
    lotemplate --> generatedFile

What makes this tool different from others are the following features :

The tool is written in Python and use a real LibreOffice headless to fill the templates.

Quick start

Run the project with docker compose

Create a docker-compose.yml

version: '3'
services:
  lotemplate:
    image: probesys38/lotemplate:v1.5.0
    volumes:
      - lotemplate-uploads:/app/uploads
    environment:
      - SECRET_KEY=lopassword
    command: "gunicorn -w 4 -b 0.0.0.0:8000 app:app"

run the service

docker-compose up -d

Use the API

# creation of a directory
curl -X PUT -H 'secret_key: lopassword' -H 'directory: test_dir1' http://localhost:8000/
# {"directory":"test_dir1","message":"Successfully created"}

Let's imagine we have an file basic_test.odt (created by libreoffice) like this :

Test document

let’s see if the tag $my_tag is replaced and this $other_tag is detected.
[if $my_tag == foo]My tag is foo[endif]
[if $my_tag != foo]My tag is not foo[endif]

Upload this file to lotemplate

# upload a template
curl -X PUT -H 'secret_key: lopassword' -F file=@/tmp/basic_test.odt http://localhost:8000/test_dir1
# {"file":"basic_test.odt","message":"Successfully uploaded","variables":{"my_tag":{"type":"text","value":""},"other_tag":{"type":"text","value":""}}}

# generate a file titi.odt from a template and a json content
curl -X POST \
    -H 'secret_key: lopassword' \
    -H 'Content-Type: application/json' \
    -d '[{"name":"my_file.odt","variables":{"my_tag":{"type":"text","value":"foo"},"other_tag":{"type":"text","value":"bar"}}}]' \
    --output titi.odt http://localhost:8000/test_dir1/basic_test.odt 

After the operation, you get the file titi.odt with this content :

Test document

let’s see if the tag foo is replaced and this bar is detected.

My tag is foo

Table of content

Installation

Requirements

For Docker use of the API, you can skip this step.

# on debian bookworm, you can use these commands
apt update
apt -y -t install bash python3 python3-uno python3-pip libreoffice-nogui
pip install -r requirements.txt

Run the API

Run the following command on your server :

python3 -m flask run

or simply

flask run

or, for Docker deployment:

docker-compose up

Basic Usage

With the API

Examples of curl requests

# creation of a directory
curl -X PUT -H 'secret_key: my_secret_key' -H 'directory: test_dir1' http://lotemplate:8000/
# {"directory":"test_dir1","message":"Successfully created"}
curl -X PUT -H 'secret_key: my_secret_key' -H 'directory: test_dir2' http://lotemplate:8000/
# {"directory":"test_dir2","message":"Successfully created"}

# look at the created directories
curl -X GET -H 'secret_key: my_secret_key' http://lotemplate:8000/
# ["test_dir2","test_dir1"]

# delete a directory (and it's content
curl -X DELETE -H 'secret_key: my_secret_key' http://lotemplate:8000/test_dir2
# {"directory":"test_dir2","message":"The directory and all his content has been deleted"}

# look at the directories
curl -X GET -H 'secret_key: my_secret_key' http://lotemplate:8000/
# ["test_dir1"]

Let's imagine we have an odt file (created by libreoffice) like this :

Test document

let’s see if the tag $my_tag is replaced and this $other_tag is detected.

Upload this file to lotemplate

# upload a template
curl -X PUT -H 'secret_key: my_secret_key' -F file=@/tmp/basic_test.odt http://lotemplate:8000/test_dir1
{"file":"basic_test.odt","message":"Successfully uploaded","variables":{"my_tag":{"type":"text","value":""},"other_tag":{"type":"text","value":""}}}

# analyse an existing file and get variables
curl -X GET -H 'secret_key: my_secret_key'  http://lotemplate:8000/test_dir1/basic_test.odt
# {"file":"basic_test.odt","message":"Successfully scanned","variables":{"my_tag":{"type":"text","value":""},"other_tag":{"type":"text","value":""}}}

# generate a file titi.odt from a template and a json content
 curl -X POST -H 'secret_key: my_secret_key' -H 'Content-Type: application/json' -d '[{"name":"my_file.odt","variables":{"my_tag":{"type":"text","value":"foo"},"other_tag":{"type":"text","value":"bar"}}}]' --output titi.odt http://lotemplate:8000/test_dir1/basic_test.odt 

After the operation, you get the file titi.odt with this content :

Test document

let’s see if the tag foo is replaced and this bar is detected.

API reference

Then use the following routes :

all routes take a secret key in the header, key secret_key, that correspond to the secret key configured in the .env file. If no secret key is configured, the secret key isn't required at request.

you may wish to deploy the API on your server. Here's how to do it - but don't forget that you should have soffice installed on the server

You can also change the flask options - like port and ip - in the .flaskenv file. If you're deploying the app with Docker, port and ip are editable in the Dockerfile. You can also specify the host and port used to run and connect to soffice as command line arguments, or in a config file (config.yml/config.ini/config or specified via --config).

Execute and use the CLI

Run the script with the following arguments :

usage: lotemplate_cli.py [-h] [--json_file JSON_FILE [JSON_FILE ...]]
                         [--json JSON [JSON ...]] [--output OUTPUT]
                         [--config CONFIG] [--host HOST] [--port PORT]
                         [--scan] [--force_replacement] template_file

positional arguments:
  template_file         Template file to scan or fill

optional arguments:
  -h, --help            show this help message and exit
  --json_file JSON_FILE [JSON_FILE ...], -jf JSON_FILE [JSON_FILE ...]
                        Json files that must fill the template, if any
  --json JSON [JSON ...], -j JSON [JSON ...]
                        Json strings that must fill the template, if any
  --output OUTPUT, -o OUTPUT
                        Names of the filled files, if the template should
                        be filled. supported formats: pdf, html, docx, png, odt
  --config CONFIG, -c CONFIG
                        Configuration file path
  --host HOST           Host address to use for the libreoffice connection
  --port PORT           Port to use for the libreoffice connexion
  --scan, -s            Specify if the program should just scan the template
                        and return the information, or fill it.
  --force_replacement, -f
                        Specify if the program should ignore the scan's result

Args that start with '--' (e.g. --json) can also be set in a config file (config.yml/config.ini/config or specified via --config). Config file syntax allows: key=value, flag=true, stuff=[a,b,c] (for details, see syntax). If an arg is specified in more than one place, then commandline values override config file values which override defaults.

All the specified files can be local or network-based.

Get a file to fill with the --scan argument, and fill the fields you want. Add elements to the list of an array to dynamically add rows. Then pass the file, and the completed json file(s) (using --json_file) to fill it.

Template syntax and examples

text variables

Put $variable in the document is enough to add the variable 'variable'.

A variable name is only composed by chars, letters or underscores.

You can also use "function variables". It is exactly the same as simple variables but with a syntax that allows you a more flexible variable name :

examples :

# simple variable
$my_var
$MyVar99_2020

# basic function variable
$my_var(firstName)
$my_var(address.city)

# you have to escape parenthesis inside the parameter in the
# variable name with a backslash
$my_var(firstName\(robert\))

Then in the json, function variable are working exactly like simple variables.

{
  "my_var": {
    "type": "text",
    "value": "my value"
  },
  "MyVar99_2020": {
    "type": "text",
    "value": "my value"
  },
  "my_var(firstName)": {
    "type": "text",
    "value": "my value"
  },
  "my_var(address.city)": {
    "type": "text",
    "value": "my value"
  },
  "my_var(firstName\\(robert\\))": {
    "type": "text",
    "value": "my value"
  }
}

html variables

Html variables are exactly like text variables, but the html is interpreted when the variable is replaced in the document.

To declare a variable as an html variable, we only have to change the type in the json and send "html" instead of "text".

{
  "my_var": {
    "type": "html",
    "value": "my value with <strong>html formatting</strong>"
  }
}

Limitation : Html is not interpreted into "shape content". For example for a text associated to a rectangle inserted into the document.

image variables

Add any image in the document, and put in the title of the alt text of the image (properties) '$' followed by the desired name ('$image' for example to add the image 'image')

dynamic arrays

You can add an unknown number of rows to the array but only on the last line. Add the dynamic variables in the last row of the table, exactly like text variables, but with a '&' instead of a the '$'

if statement

You can use if statement in order to display or to hide a part of your document.

There is many operators :

[if $my_var == my_value]
$my_var equals my_value (case insensitive)
[endif]

[if $my_var === my_value]
$my_var equals my_value (case sensitive)
[endif]

[if $my_var != my_value]
$my_var does not equals my_value (case insensitive)
[endif]

[if $my_var !== my_value]
$my_var does not equals my_value (case sensitive)
[endif]

[if $my_var CONTAINS y_VAl]
This part will be displayed if $my_var contains y_VAl (case insensitive)
[endif]

[if $my_var NOT_CONTAINS dlfksqjqm]
This part will be displayed if $my_var does not contain dlfksqjqm
[endif]

[if $my_var IS_EMPTY]
This part will be displayed if my_var is empty (empty means empty or only spaces, tabs or newlines)
[endif]

[if $my_var IS_NOT_EMPTY]
This part will be displayed if my_var is empty (empty means empty or only spaces, tabs or newlines)
[endif]

You can put an if statement inside another if statement

[if $foo == my_value]
[if $bar == my_value2]
here you have your document
[endif]
[if $bar == my_value3]
here you have your document
[endif]
[endif]

for statement

You can use for statement in order to display a part of your document multiple times.

WARNING : the for system loses the formating of the template. If you want a specific formating, you have to put it in an HTML statement.

You have to send an array in the json file with a dict inside the array

{
  "tutu": {
    "type": "array",
    "value": [
      {
        "firstName": "perso 1",
        "lastName": "string 1",
        "address": {
          "street1": "8 rue de la paix",
          "street2": "",
          "zip": "75008",
          "city": "Paris",
          "state": "Ile de France"
        }
      },
      {
        "firstName": "perso 2",
        "lastName": "lastname with < and >",
        "address": {
          "street1": "12 avenue Jean Jaurès",
          "street2": "",
          "zip": "38000",
          "city": "Grenoble",
          "state": "Isère"
        }
      }
    ]
  }
}

Then in your template file you can use the for like this :

Tests of for statements

[for $tutu]
Associate number [forindex]
first name : [foritem firstName] 
last name escaped by default [foritem lastName]
last name escaped html [foritem lastName escape_html]
last name not escaped [foritem lastName raw]
Address : 
[foritem address.street1]
[foritem address.zip] [foritem address.city]
[endfor]

Note : you can use if inside for statements

Here we display only the people living in Grenoble

[for $tutu]
[if [foritem address.city] == Grenoble]
first name : [foritem firstName] 
last name : [foritem lastName]
Address : 
[foritem address.street1]
[foritem address.zip] [foritem address.city]
[endif]
[endfor]

Here we display only the first element of the array

[for $tutu]
[if [forindex] == 0]
first name : [foritem firstName] 
last name : [foritem lastName]
[endif]
[endfor]

Note : If you are using [forindex] inside a variable name, the variable is excluded from the parsing of the template. It allows you to create a dynamic variable name inside a for loop. Ex : $my_var(people.[forindex].name) is excluded from the variable parsing.

html statement

You can use html statement in order to display a part of your document with a specific formating.

Here is some examples that ca be use inside an odt template

[html]
<table>
<tr>
  <td>First Name</td>
  <td>Last Name</td>
</tr>
<tr>
  <td>First Name</td>
  <td>Last Name</td>
</tr>
</table>
[endhtml]

Then all the html content is interpreted and pasted as html. It is then rendered as a formated text (a table in this example) inside the document.

You can also display a variable that contains an html content inside an html statement

[html]
$my_html_variable
[endhtml]

with the associated json :

{
  "tutu": {
    "type": "text",
    "value": "my <strong>html formated</strong> content"
  }
}

As the for statement removes formatting, you can use the html statement combined with the for statement to display a table with a specific formating.

Let's see an example with the following json :

{
  "tutu": {
    "type": "array",
    "value": [
      {
        "firstName": "perso 1",
        "lastName": "string 1",
        "address": {
          "street1": "8 rue de la paix",
          "street2": "",
          "zip": "75008",
          "city": "Paris",
          "state": "Ile de France"
        }
      },
      {
        "firstName": "perso 2",
        "lastName": "lastname with < and >",
        "address": {
          "street1": "12 avenue Jean Jaurès",
          "street2": "",
          "zip": "38000",
          "city": "Grenoble",
          "state": "Isère"
        }
      }
    ]
  }
}

and the odt template :

[html]
<table>
<tr>
  <td>First Name</td>
  <td>Last Name</td>
</tr>
[for $tutu]
<tr>
  <td>[foritem firstName escape_html]</td>
  <td>[foritem lastName escape_html]</td>
</tr>
[endfor]
</table>
[endhtml]

counter statement

You can use counter statement in order to display values that will be incremented step by step in your document. It can be used for example to have a heading automatic numbering.

WARNING : generally you don't have to use this counter feature. You can use the automatic numbering of Word or Libre Office

basic usage of counter

In your odt, you can use :

chapter [counter chapter] : introduction

chapter [counter chapter] : context

[counter paragraph] : geopolitical context

[counter paragraph] : geographical context

[counter paragraph] : economical context

chapter [counter chapter] : analysis
[counter.reset paragraph]
[counter paragraph] : geopolitical analysis

[counter paragraph] : geographical analysis

[counter paragraph] : economical analysis

It will be transformed in :

chapter 1 : introduction

chapter 2 : context

1 : geopolitical context

2 : geographical context

3 : economical context

chapter 3 : analysis

1 : geopolitical analysis

2 : geographical analysis

3 : economical analysis

The possible syntaxe are :

You can display hierarchical counters by just using counter.last:

chapter [counter chapter] : introduction

chapter [counter chapter] : context

[counter.last chapter].[counter paragraph] : geopolitical context

[counter.last chapter].[counter paragraph] : geographical context

[counter.last chapter].[counter paragraph] : economical context

chapter [counter chapter] : analysis
[counter.reset paragraph]
[counter.last chapter].[counter paragraph] : geopolitical analysis

[counter.last chapter].[counter paragraph] : geographical analysis

[counter.last chapter].[counter paragraph] : economical analysis

The result will be

chapter 1 : introduction

chapter 2 : context

2.1 : geopolitical context

2.2 : geographical context

2.3 : economical context

chapter 3 : analysis

3.1 : geopolitical analysis

3.2 : geographical analysis

3.3 : economical analysis

count the number of elements of a list

[counter.reset iterator]

[for $solutions]

Title : [foritem title]

Content : [foritem paragraph]

[counter iterator hidden]
[endfor]

we displayed [counter.last iterator] solutions

Supported formats

Import

Format ODT, OTT HTML DOC, DOCX RTF TXT OTHER
text variables support
dynamic tables support
image variables support

Export

odt, pdf, html, docx.

Other formats can be easily added by adding the format information in the dictionary formats in lotemplate/classes.py > Template > export().

Format information can be found on the unoconv repo.

Doc for developpers of lotemplate

Run the tests

You need to have docker and docker-compose installed and then run

make tests

Installation with Docker for dev when your uid is not 1000

for this we use fixuid (https://github.com/boxboat/fixuid)

you have to define two env variable MY_UID and MY_GID with your uid and gid copy docker-compose.override.yml.example to docker-compose.override.yml

export MY_UID=$(id -u)
export MY_GID=$(id -g)
cp docker-compose.override.yml.example docker-compose.override.yml
docker-compose up

Unsolvable problems

The error UnoException happens frequently and unpredictably, and this error stops the soffice processus (please note that the API try to re-launch the process by itself). This error, particularly annoying, is unfortunately impossible to fix, since it can be caused by multiples soffice (LibreOffice) bugs. Here is a non-exhaustive list of cases that can cause this bug :

The amount of memory used by soffice can increase with its use, even when open files are properly closed (which is the case). Again, this is a bug in LibreOffice/soffice that has existed for years.

For trying to fix these problems, you can try:

External documentations

Versions

Possible futur evolutions