Systematic Literrature Review assisted with IA
Before installing the project, it is recomended to create or use a virtual env using venv or conda virtual environment. run the following commands :
chmod +x install.sh
chmod +x main.py
./install.sh
pip install -r requirements.txt
If you want to install it globally and have a cli you can use the following commands :
sudo mkdir /usr/bin/slria
sudo cp *.py /usr/bin/slria/
sudo cp empty.conf /usr/bin/slria
echo 'alias slria="python3 /usr/bin/slria/main.py"' >> ~/.zshrc
To use with Docker
sudo docker build -t "slria" .
mkdir TUTO_DOCKER_SLRIA
sudo docker run -dit --name slria-tuto -v ./TUTO_DOCKER_SLRIA/:/home/ slria
sudo docker exec -it slria-tuto /bin/bash
Once in the docker cd in /home then you can start to use the slria command line.
slria init ./SLRIA_TUTO
to create a folder called SLRIA_TUTO. This folder will contains one config faile SLR.conf and two folders : csv and final bib.
Fill the SLR.conf file. Add all the requests that will be made to each search engine. Provide one request per ligne such as : [Request]
r0 = "Personal Identification with PPG"
r1 = "Personal recognition with PPG"
r2 = "Signature with PPG"
r3 = "biometric identification with photoplethysmography"
r4 = "Personal Identification with photoplethysmography"
Then provide the value for the actual round, the number of paper you want per request and a name for each database. If you use the same database multiples times with differents times laps, it recommended to add the time laps at the end of the names.
[SLRIA]
round = 1
papers_by_request = 10
databases = scholar_17_20,scholar,pubmed,pubmed_17_20,web_of_science,web_of_science_17_20
The Filter section help to filter and classify the papers. It also help to merge duplicate references. To merge the references, we use a comparison score with a threshold. For each references, we extracted a maximum of fields. But due to differences in references format, sometimes fields are missings. This is why we made a matching algorithm based on all the fields. We give 40 points if the title is the same, and 10 pts per identical authors and 1 point for all other fields. To match the titles between two references, we use the damerau-levenshtein distance with a score below to 5. We add this flexibility to be resilient to the invisible characters and errors due to reference extract.
This values can be easly modified in the file reference.py.
To merge duplicate references, we compare all the references two by two and compute the comparison score. If the score is higher than the one given in the filter section, the two references are merged. Be carreful, this system is not perfect and requier a manual check.
The kwords are used to determine the level of interest of a paper. The number of keywords in the title determine the level of interest in the paper, so give a maximum of words in this section. Moreover, the nkwords fields allow to automatically drop all the papers where the title which contains less key wordsthan this value.
[Filter]
comparison = 35
appearance =
kwords = "biometric,identification,signal,personal,body,healthcare,photoplethysmographic,diagnostic,interindividual,qrs,authentication,physiological,signature,recognition,authentication,analysis,study,PPG,representation,photoplethysmography,identity,study,identifier,verification,novel"
nkwords = 1
slria request ./
to create a folder for each request with their meta.conf fileslria start ./
to merges all the papers, deletes duplicates paper and generate statistics for the 1st roundslria extract ./
to extract all the references from all the papers in their corresponding foldersslria merge ./
to merges all the references and produces statistics for the snowball