fossology / atarashi

Atarashi scans for license statements in open source software, focusing on text statistics. Designed to work stand-alone and with FOSSology.
http://fossology.github.io/atarashi
GNU General Public License v2.0

perf(evaluator.py): Reduced evaluation time by using multiprocessing #76

Closed Aman-Codes closed 3 years ago

Aman-Codes commented 3 years ago

Description

Reduced the evaluation time of evaluator.py by using multiprocessing.
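For readers who have not seen the diff, the general idea can be sketched as below. This is only a minimal illustration, not the code from this PR: `score_file`, `evaluate`, and the `workers` parameter are hypothetical names, and the "agent" is a trivial substring check standing in for a real Atarashi agent.

```python
from multiprocessing import Pool


def score_file(item):
    """Hypothetical stand-in for running one agent on one test file.

    `item` is a (text, expected_license) pair. The check below is a
    placeholder, not one of Atarashi's real agents.
    """
    text, expected = item
    return expected.lower() in text.lower()


def evaluate(items, workers=1):
    """Return the fraction of items the stand-in agent labels correctly."""
    if not items:
        return 0.0
    if workers <= 1:
        # Serial path: one file after another in the current process.
        results = [score_file(item) for item in items]
    else:
        # Parallel path: the pool splits the items across worker processes.
        # Pool.map() returns the results in the same order as the input.
        with Pool(processes=workers) as pool:
            results = pool.map(score_file, items)
    return sum(results) / len(items)
```

When calling `evaluate(items, workers=4)` from a script, the call should sit under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. Starting processes and passing data between them has a cost of its own, so the gain depends on how expensive each item is to score.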

Performance Report *

| Agent Name | Time elapsed without multiprocessing (in seconds) | Time elapsed with multiprocessing (in seconds) |
| --- | --- | --- |
| Word Frequency Similarity | 94.93 | 44.15 |
| tfidf | 682.89 | 757.59 |
| Ngram | 711.2 | 115.71 |
| DLD | 1021.62 | 526.53 |

* The time taken also depends on the hardware and may vary between computers.

Complete Report can be found here

Closes #64

GMishx commented 3 years ago

The changes seem to break the tqdm progress bar. Can you please try to fix it? Otherwise, it can be removed.

Aman-Codes commented 3 years ago

> The changes seem to break the tqdm progress bar. Can you please try to fix it? Otherwise, it can be removed.

Fixed the progress bar.
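The thread does not say how the progress bar was fixed. One common pattern, sketched here under that caveat, is to wrap `Pool.imap()` in `tqdm` so the bar advances as each result arrives. `square` and `run_with_progress` are hypothetical names used only for illustration, and the fallback import exists only so the sketch still runs where tqdm is not installed.

```python
from multiprocessing import Pool

try:
    from tqdm import tqdm
except ImportError:
    # Assumption for this sketch: fall back to a no-op wrapper without tqdm.
    def tqdm(iterable, **kwargs):
        return iterable


def square(n):
    """Hypothetical stand-in for the per-file evaluation work."""
    return n * n


def run_with_progress(items, workers=1):
    """Process items, advancing the progress bar once per finished item."""
    items = list(items)
    if workers <= 1:
        # Serial path: tqdm wraps the input list directly.
        return [square(n) for n in tqdm(items, total=len(items))]
    with Pool(processes=workers) as pool:
        # imap() yields results lazily and in input order, so tqdm can tick
        # as each one arrives. map() would block until all work is done,
        # leaving the bar frozen until the end.
        return list(tqdm(pool.imap(square, items), total=len(items)))
```

Passing `total=len(items)` matters because the iterator returned by `imap()` has no length, so tqdm cannot otherwise show a percentage.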