
SLRIA

Systematic Literature Review assisted with AI (MIT License)

Install

Before installing the project, it is recommended to create or use a virtual environment with venv or conda. Then run the following commands:

chmod +x install.sh
chmod +x main.py

./install.sh
pip install -r requirements.txt

If you want to install it globally and have a CLI, you can use the following commands:

sudo mkdir /usr/bin/slria
sudo cp *.py /usr/bin/slria/
sudo cp empty.conf /usr/bin/slria
echo 'alias slria="python3 /usr/bin/slria/main.py"' >> ~/.zshrc

To use with Docker

sudo docker build -t "slria" .
mkdir TUTO_DOCKER_SLRIA
sudo docker run -dit --name slria-tuto -v ./TUTO_DOCKER_SLRIA/:/home/ slria
sudo docker exec -it slria-tuto /bin/bash

Once inside the container, cd into /home; you can then start using the slria command line.

HOW TO USE

First, define your search requests in the configuration file:

[Request]
r0 = "Personal Identification with PPG"
r1 = "Personal recognition with PPG"
r2 = "Signature with PPG"
r3 = "biometric identification with photoplethysmography"
r4 = "Personal Identification with photoplethysmography"

Then provide the current round number, the number of papers you want per request, and a name for each database. If you use the same database multiple times with different time spans, it is recommended to add the time span at the end of the name.

[SLRIA]
round = 1
papers_by_request = 10
databases = scholar_17_20,scholar,pubmed,pubmed_17_20,web_of_science,web_of_science_17_20

The Filter section helps to filter and classify the papers. It also helps to merge duplicate references. To merge references, we use a comparison score with a threshold. For each reference, we extract as many fields as possible, but due to differences in reference formats, some fields are sometimes missing. This is why we made a matching algorithm based on all the fields: we give 40 points if the titles are the same, 10 points per identical author, and 1 point for every other identical field. To match the titles of two references, we use the Damerau-Levenshtein distance with a threshold below 5. This flexibility makes the matching resilient to invisible characters and errors introduced during reference extraction.
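As an illustration, here is a minimal Python sketch of this scoring. The dictionary layout, the field names, and the use of the jellyfish library are assumptions made for the example; the actual logic lives in reference.py and may differ.

from jellyfish import damerau_levenshtein_distance

TITLE_POINTS = 40    # awarded when the titles match
AUTHOR_POINTS = 10   # awarded per identical author
FIELD_POINTS = 1     # awarded per other identical field

def comparison_score(ref_a, ref_b):
    """Compute the similarity score between two reference dicts."""
    score = 0

    # Titles are matched with the Damerau-Levenshtein distance so that
    # invisible characters or extraction errors do not break the match.
    title_a, title_b = ref_a.get("title", ""), ref_b.get("title", "")
    if title_a and title_b and damerau_levenshtein_distance(title_a.lower(), title_b.lower()) < 5:
        score += TITLE_POINTS

    # Each author present in both references adds 10 points.
    score += AUTHOR_POINTS * len(set(ref_a.get("authors", [])) & set(ref_b.get("authors", [])))

    # Every other field that is present and identical adds 1 point.
    for field in ("year", "doi", "journal", "volume", "pages"):
        if ref_a.get(field) and ref_a.get(field) == ref_b.get(field):
            score += FIELD_POINTS

    return score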

These values can easily be modified in the file reference.py.

To merge duplicate references, we compare all references pairwise and compute the comparison score. If the score is higher than the one given in the Filter section, the two references are merged. Be careful: this system is not perfect and requires a manual check.
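As a sketch of that pairwise pass (reusing the comparison_score() sketch above; the merge rule and the default threshold of 35 are only an example mirroring the Filter section below):

def merge_duplicates(references, threshold=35):
    """Merge references whose pairwise comparison score exceeds the threshold."""
    merged = []
    for ref in references:
        for kept in merged:
            if comparison_score(ref, kept) > threshold:
                # Fill the kept reference's missing fields with those of the duplicate.
                kept.update({k: v for k, v in ref.items() if v and not kept.get(k)})
                break
        else:
            merged.append(ref)
    return merged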

The kwords are used to determine the level of interest of a paper: the number of keywords found in the title determines the level of interest, so list as many words as possible in this field. Moreover, the nkwords field automatically drops every paper whose title contains fewer keywords than this value. A minimal sketch of this filtering is given after the configuration example below.

[Filter]
comparison = 35
appearance = 
kwords = "biometric,identification,signal,personal,body,healthcare,photoplethysmographic,diagnostic,interindividual,qrs,authentication,physiological,signature,recognition,authentication,analysis,study,PPG,representation,photoplethysmography,identity,study,identifier,verification,novel"
nkwords = 1
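For illustration, a minimal sketch of how kwords and nkwords could drive this filtering (the exact matching rules in SLRIA may differ, and the paper structure below is only an assumption):

def keyword_interest(title, kwords):
    """Count how many keywords appear in the title (case-insensitive)."""
    title_lower = title.lower()
    return sum(1 for kw in kwords if kw.lower() in title_lower)

kwords = "biometric,identification,signal,personal".split(",")   # truncated example list
papers = [{"title": "Personal Identification with PPG signals"},
          {"title": "A survey of wearable devices"}]

nkwords = 1
kept = [p for p in papers if keyword_interest(p["title"], kwords) >= nkwords]
# Only the first paper is kept; the second title contains no keyword.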