edissyum / opencaptureformem

GNU General Public License v3.0


Open-Capture for MEM Courrier

Open-Capture for MEM Courrier is free and open-source software released under the GNU General Public License v3.0.

Installation

Linux Distributions

Open-Capture for MEM Courrier has only been tested on Debian.

For the latest version (4.X.X), you need to install Debian 12 (Bookworm).

Install Open-Capture for MEM Courrier

Nothing could be simpler:

sudo mkdir -p /opt/edissyum/ && sudo chmod -R 775 /opt/edissyum/ && sudo chown -R $(whoami):$(whoami) /opt/edissyum/
sudo apt install git
latest_tag=$(git ls-remote --tags --sort="v:refname" https://github.com/edissyum/opencaptureformem.git 4.* | tail -n1 | sed 's/.*\///; s/\^{}//')
git clone -b $latest_tag https://github.com/edissyum/opencaptureformem /opt/edissyum/opencaptureformem/
cd /opt/edissyum/opencaptureformem/install/

The ./install.sh script installs all the necessary packages and creates the service. You can choose between supervisor and plain systemd. Supervisor is useful if you need to run multiple instances of Open-Capture in parallel. Systemd is perfect for a single instance.

chmod u+x install.sh
sudo ./install.sh
  # Answer the few questions asked at launch
  # Go grab a coffee ;)

You can also launch the installation with predefined settings:

sudo ./install.sh --user edissyum --supervisor_systemd systemd --secure_rabbit no

Or with secured RabbitMQ :

sudo ./install.sh --user edissyum --supervisor_systemd systemd --secure_rabbit yes --rabbit_user edissyum --rabbit_password edissyum --rabbit_host localhost --rabbit_port 5672 --rabbit_vhost opencapture

It will install all the needed dependencies, then compile and install Tesseract V5 with the French and English language data. If you need more languages, just run:

sudo apt install tesseract-ocr-<langcode>

Here is a list of all available language codes: https://www.macports.org/ports.php?by=name&substr=tesseract-

Don't forget to modify the two config files to suit your specific needs. If you need help, the Configuration section has more information about the src/config/config.ini settings. For src/config/mail.ini, check the IMAP Connector (Open-Capture MailCollect Module) section.

In most cases you have to modify the /etc/ImageMagick-6/policy.xml file to comment out the following line (around line 94) and then restart the oc-worker service:

<policy domain="coder" rights="none" pattern="PDF" />

sudo systemctl restart oc-worker.service

Configuration

The file src/config/config.ini is split into several categories.

To activate automatic reconciliation for MEM Courrier outgoing documents, you must set this list of values in the config.ini file (REATTACH_DOCUMENT part):

- Active : activate the process (True or False)
- Action : reattach action id in MEM Courrier
- group  : id of the scan group in MEM Courrier
- basket : basket id linked to the group in MEM Courrier
- status : the new status after reattach
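As a rough illustration of the settings above, here is a minimal sketch that reads a REATTACH_DOCUMENT section with Python's standard configparser. The key names come from the list above; the values in SAMPLE and the load_reattach_settings helper are hypothetical placeholders, not values or code taken from the project.

```python
import configparser

# Placeholder values for illustration only: replace them with the IDs
# configured in your own MEM Courrier instance.
SAMPLE = """
[REATTACH_DOCUMENT]
active = True
action = 22
group = 5
basket = ReconciliationBasket
status = EENV
"""


def load_reattach_settings(text):
    """Parse the REATTACH_DOCUMENT section from INI text (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["REATTACH_DOCUMENT"]
    return {
        # configparser lowercases option names, so "Active" and "active" both match.
        "active": section.getboolean("active"),
        "action": section["action"],
        "group": section["group"],
        "basket": section["basket"],
        "status": section["status"],
    }


settings = load_reattach_settings(SAMPLE)
print(settings)
```

A check like this can help catch a missing key before the worker is restarted.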

Usage

Here are some examples of possible usage in the launch_XX.sh script:

python3 /opt/edissyum/opencaptureformem/launch_worker.py -c /opt/edissyum/opencaptureformem/src/config/config.ini -f file.pdf -process incoming
python3 /opt/edissyum/opencaptureformem/launch_worker.py -c /opt/edissyum/opencaptureformem/src/config/config.ini -p /path/to/folder/
python3 /opt/edissyum/opencaptureformem/launch_worker.py -c /opt/edissyum/opencaptureformem/src/config/config.ini -p /path/to/folder/ --read-destination-from-filename
python3 /opt/edissyum/opencaptureformem/launch_worker.py -c /opt/edissyum/opencaptureformem/src/config/config.ini -p /path/to/folder/ --read-destination-from-filename -resid 100 -chrono MEM/2019D/1

- --read-destination-from-filename is related to separation with QR codes. It reads the filename, based on the divider option in config.ini, to find the entity ID
- -f stands for a single file
- -p stands for a path containing PDF/JPG files, which are processed as a batch
- -process stands for the process mode (incoming or outgoing; if none is given, incoming is chosen)
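To make the --read-destination-from-filename idea concrete, here is a minimal sketch of extracting an entity ID from a filename using a divider. The entity_from_filename function and the "_" default are assumptions for illustration; the actual parsing done by Open-Capture may differ.

```python
from pathlib import Path


def entity_from_filename(filename, divider="_"):
    """Return the text before the first divider in the file name, or None.

    Hypothetical helper: it only illustrates the naming convention, it is
    not the project's own implementation.
    """
    stem = Path(filename).stem  # file name without directory or extension
    if divider not in stem:
        return None
    entity, _rest = stem.split(divider, 1)
    return entity or None


print(entity_from_filename("/path/to/folder/DGS_XXXX.pdf"))
```

With the divider set to "_", a file named DGS_XXXX.pdf yields the entity ID DGS, while a file without the divider yields no entity.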

Various

If you want to generate PDF/A instead of PDF, you have to do the following :

cp install/sRGB_IEC61966-2-1_black_scaled.icc /usr/share/ghostscript/X.XX/
nano +8 /usr/share/ghostscript/X.XX/lib/PDFA_def.ps

Replace : %/ICCProfile (srgb.icc) % Customise
By : /ICCProfile (/usr/share/ghostscript/X.XX/sRGB_IEC61966-2-1_black_scaled.icc)   % Customize

Open-Capture MailCollect Module

You can capture e-mails directly from your inbox.

Just edit /opt/edissyum/opencaptureformem/src/config/mail.ini and add your process. Modify the default process MAIL_1 with your information (host, port, login, pwd, etc.). If you want the from, to, cc and replyTo metadata, you have to create the custom fields in the MEM Courrier superadmin dashboard and set their IDs in the config file (8, 9, 10, 11 by default). To capture more than one mailbox or multiple folders, add other processes by copying MAIL_1 and changing the name.

IMPORTANT: Do not put spaces in process names.

If you have multiple processes, don't forget to copy the MAIL_1 section in /opt/edissyum/opencaptureformem/src/config/mail.ini and that's all. The launch_MAIL.sh script automatically loops through all the processes and launches them.

Don't forget to fill in the typist with the user_id of the person who scans documents (in the default MEM Courrier installation, it's bblier).

Here is a short list of the options available for a mail process in /opt/edissyum/opencaptureformem/src/config/mail.ini

You can also set up notifications in case an error is thrown while collecting mail with IMAP. To do so, just fill in the following information:

Hint: If you need to test the SMTP settings, just launch the script /opt/edissyum/opencaptureformem/scripts/MailCollect/smtp_test.py with your host information.

Hint 2: To find out the exact names of the different folders, just launch the script /opt/edissyum/opencaptureformem/scripts/MailCollect/check_folders.py with your host information.

To make e-mail capture automatic, just add the launch_MAIL.sh script to cron:

 */5 8-18 * * 1-5   /opt/edissyum/opencaptureformem/scripts/launch_MAIL.sh >/dev/null 2>&1

With this schedule, the script runs every 5 minutes during hours 8 through 18, Monday through Friday.

Possible errors

If you get the following error when running your MailCollect scripts:

ssl.SSLError: [SSL: UNSUPPORTED_PROTOCOL] unsupported protocol (_ssl.c:1056)

One possible fix is the following:

sudo nano /etc/ssl/openssl.cnf

Add the following block at the end of the file

[tls_system_default]
MinProtocol = TLSv1.0
CipherString = DEFAULT@SECLEVEL=0

Clean MailCollect batches

When a batch is launched, it creates a folder with a backup of the e-mail and the associated log file. To avoid running out of storage space on the server, do not forget to add the clean.sh script to cron:

0 2 * * 1-5   /opt/edissyum/opencaptureformem/scripts/MailCollect/clean.sh >/dev/null 2>&1

With this schedule, the script runs at 2 AM, Monday through Friday, and deletes all batch folders older than 7 days.
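The clean.sh script itself is not reproduced here. As a rough sketch of the retention rule it applies (batch folders older than 7 days), the following lists the folders that would qualify, without deleting anything. The old_batch_folders function and the use of the folder modification time are assumptions for illustration.

```python
import time
from pathlib import Path


def old_batch_folders(root, max_age_days=7, now=None):
    """List sub-folders of root whose modification time is older than max_age_days.

    Hypothetical helper: it only reports candidates and never removes them.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    return sorted(
        path
        for path in Path(root).iterdir()
        if path.is_dir() and path.stat().st_mtime < cutoff
    )
```

Running such a dry-run listing before scheduling the real clean-up can help confirm that the batch path and retention period are the ones you expect.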

Update Open-Capture For MEM Courrier

The update process is very simple. But first, you need to modify the file and change line 54 to set the user and group you want instead of the default (edissyum):

cd /opt/edissyum/opencaptureformem/install/
chmod u+x update.sh
sudo ./update.sh

Open-Capture MailCollect Forms Module

If you have a mailbox that receives only forms, this module is for you. In src/config/forms/forms_identifier.json you choose:

- The name of the process "Formulaire_1" in the default JSON file
- keyword_subject --> The keyword we can find in the mail subject to detect the right process
- model_id --> MEM Courrier model identifier
- status --> Override the status set in mail.ini (optional)
- destination --> Override the destination set in mail.ini (optional)
- doctype --> Override the doctype set in mail.ini (optional)
- priority --> Override the priority set in mail.ini (optional)
- json_file --> Name of the JSON file containing all the information about the form

And here is what you can do in the json_file (you can use the default one, src/config/forms/default_form.json):

- In FIELDS -> CONTACTS you'll have the default fields. You just have to modify the REGEX if it doesn't match your form
- In FIELDS -> LETTERBOX you can add your specific data
    - column --> use a column of the res_letterbox table. If you want to use custom_fields data, put custom in it
    - regex --> regex used to find the data you want
    - mapping --> If column is equal to custom, or if you want to split one line into multiple columns, you have to fill this in (you need as many mapping blocks as the columns you want):
        - isCustom --> if the data needs to go in the custom_fields column
        - isAddress --> If true, the bracket value needs to be "LATITUDE,LONGITUDE" and the rest, the complete address
        - column --> put the id of the custom_fields entry (e.g. "3") or a column of the res_letterbox table

If you want specific data, you can use [] in your line. For example, check example_form.json and example_form.txt to see the settings.

Information

QRCode separation

MEM Courrier allows you to create separators with a QR code containing the ID of an entity ("DGS", for example). If enabled in config.ini, separation splits a PDF file containing QR codes and creates PDFs whose filenames are prefixed with the entity ID, e.g. "DGS_XXXX.pdf". Since version 20.03, the separator contains the entity ID instead of the entity short label, but this causes no issue.

WARNING: In the MEM Courrier parameters, set QRCodePrefix to 1 instead of 0.

It is now possible to send attachments with QR code separation. If you have a resume and a cover letter, start with the MEM Courrier entity separation QR code, then the resume. Add PJ_SEPARATOR.pdf and then the cover letter. In MEM Courrier you'll have the resume as the main document and the cover letter as an attachment.

Apache modifications

If large files are going to be sent, you have to increase the post_max_size parameter in the following file:

/etc/php/7.X/apache2/php.ini

By default it is recommended to replace 8M by 20M or more if needed

Use AI

Open-Capture for MEM Courrier uses AI to detect some information automatically. For now, it can retrieve the MEM Courrier destination and type_id.

We can't provide an AI model because it is specific to each company. But we can help you create yours: contact us.

API

Open-Capture for MEM Courrier integrates an API that allows you to send documents directly to MEM Courrier.

Configuration of the API

For the API to work, you need to set a robust secret_key in the config.ini file (it is generated automatically during the install process). This key is used to authenticate requests.

[API]
# Token expiration time in hours
token_expiration_time       = 1
secret_key                  = YOUR_ROBUST_SECRET_KEY

You can easily generate or regenerate a secret key by running the following script:

cd /opt/edissyum/opencaptureformem/scripts/
chmod u+x regenerate_secret_key.sh
./regenerate_secret_key.sh

You also need to specify the custom_id and the config_file_path in the custom.json file.

[
  {
    "custom_id": "opencaptureformem",
    "config_file_path": "/opt/edissyum/opencaptureformem/src/config/config.ini"
  }
]

Usage of the API

Get a token

You first need to get a token by calling the API with your secret_key and custom_id :

Curl:

```bash
curl \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"secret_key": "YOUR_SECRET_KEY", "custom_id":"YOUR_CUSTOM_ID"}' \
  http://YOUR_SERVER_URL/opencaptureformem/get_token
```

Python:

```python
import requests

url = "http://YOUR_SERVER_URL/opencaptureformem/get_token"
data = {"secret_key": "YOUR_SECRET_KEY", "custom_id": "YOUR_CUSTOM_ID"}
headers = {"Content-Type": "application/json"}

response = requests.post(url, json=data, headers=headers)
print(response.json() if response.status_code == 200 else f"Error: {response.status_code} - {response.text}")
```

Then you'll get a token that you'll have to use in the next request.

Here are some possible responses :

| Status | Response |
|--------|----------|
| 200 | `{ "token":"XXXXXXXX-XXXXXXXXX-XXXXXXXXX-XXXXXXXXX" }` |
| 400 | `{ "message":"Invalid secret key" }` |
| 500 | Internal Server Error |
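As a small illustration of how the token can be carried into the next request, here is a sketch that turns the JSON body returned by get_token into the headers used by the other endpoints. The bearer_headers function is a hypothetical helper, not part of the project; the "token" and "message" keys are the ones shown in the responses above.

```python
def bearer_headers(token_response):
    """Build request headers from the parsed JSON body of get_token.

    Hypothetical helper: raises ValueError when the body carries no token,
    reusing the API's "message" field when one is present.
    """
    token = token_response.get("token")
    if not token:
        raise ValueError(token_response.get("message", "no token in response"))
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }


headers = bearer_headers({"token": "XXXXXXXX-XXXXXXXXX-XXXXXXXXX-XXXXXXXXX"})
print(headers["Authorization"])
```

Since the token expires after token_expiration_time hours, a client may need to request a new one when the API answers with "Invalid or expired token".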

Upload files

A request to the API to upload files will look like this :

Curl:

```bash
curl \
  -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer GENERATED_TOKEN" \
  -d '{
        "files": [{"file_content": "BASE_64_FILE_CONTENT", "file_name": "FILE_NAME"}],
        "custom_id": "YOUR_CUSTOM_ID",
        "process_name": "YOUR_PROCESS_NAME"
      }' \
  http://YOUR_SERVER_URL/opencaptureformem/upload
```

Python:

```python
import requests

url = "http://YOUR_SERVER_URL/opencaptureformem/upload"
data = {
    "files": [{"file_content": "BASE_64_FILE_CONTENT", "file_name": "FILE_NAME"}],
    "custom_id": "YOUR_CUSTOM_ID",
    "process_name": "YOUR_PROCESS_NAME"
}
headers = {
    "Authorization": "Bearer GENERATED_TOKEN",
    "Content-Type": "application/json"
}

response = requests.post(url, json=data, headers=headers)
print(response.json() if response.status_code == 200 else f"Error: {response.status_code} - {response.text}")
```

Here are some possible responses :

| Status | Response |
|--------|----------|
| 200 | `{ "message":"All files processed successfully" }` |
| 400 | `{ "message":"custom_id XXXX not found in custom.json" }` |
| 400 | `{ "message":"Each file must have a 'file_name' and 'file_content' key" }` |
| 500 | Internal Server Error |
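The upload request expects each file as base64 text in file_content. Here is a minimal sketch of building that payload from files on disk using only the standard library. The build_upload_payload function is a hypothetical helper; the field names are the ones used in the request above.

```python
import base64
from pathlib import Path


def build_upload_payload(paths, custom_id, process_name):
    """Build the JSON body for the upload endpoint from local files.

    Hypothetical helper: each file is read as bytes and base64-encoded
    into the "file_content" field, with its base name as "file_name".
    """
    files = []
    for path in map(Path, paths):
        encoded = base64.b64encode(path.read_bytes()).decode("ascii")
        files.append({"file_name": path.name, "file_content": encoded})
    return {
        "files": files,
        "custom_id": custom_id,
        "process_name": process_name,
    }
```

The returned dictionary can be passed as the json argument of requests.post, in place of the hand-written data dictionary.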

Get process list

A request to the API to get the list of available processes will look like this :

Curl:

```bash
curl \
  -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer GENERATED_TOKEN" \
  -d '{ "custom_id": "YOUR_CUSTOM_ID" }' \
  http://YOUR_SERVER_URL/opencaptureformem/get_process_list
```

Python:

```python
import requests

url = "http://YOUR_SERVER_URL/opencaptureformem/get_process_list"
data = {
    "custom_id": "YOUR_CUSTOM_ID"
}
headers = {
    "Authorization": "Bearer GENERATED_TOKEN",
    "Content-Type": "application/json"
}

response = requests.post(url, json=data, headers=headers)
print(response.json() if response.status_code == 200 else f"Error: {response.status_code} - {response.text}")
```

Here are some possible responses :

| Status | Response |
|--------|----------|
| 200 | `{ "processes":["incoming","reconciliation_default","reconciliation_found"] }` |
| 400 | `{ "message":"Invalid or expired token" }` |
| 500 | Internal Server Error |

LICENSE

Open-Capture for MEM Courrier is released under the GPL v3.