Rensvandeschoot / software-overview-machine-learning-for-screening-text

The repository aims to create an overview and comparison of software used for systematically screening large amounts of textual data using machine learning.
Creative Commons Attribution 4.0 International

EPPI-Reviewer: missing information #21

Open gimoAI opened 2 years ago

gimoAI commented 2 years ago

There are a couple of features/properties of EPPI-Reviewer that I was unable to find in the literature and/or documentation:

A lot of documentation can be found in the EPPI-Reviewer v4.8 manual.

Tanja19zpid commented 1 year ago

Regarding task 4: In the documentation on machine learning in EPPI (https://eppi.ioe.ac.uk/CMS/Portals/35/machine_learning_in_eppi-reviewer_v_7_web_version.pdf) it says: "The algorithm we use is a support vector machine as implemented in the Scikit-Learn Python machine library." The SVM classifiers in this package seem to be able to handle imbalanced data (https://scikit-learn.org/stable/modules/generated/sklearn.metrics.balanced_accuracy_score.html), which might also be helpful for task 1.

Rensvandeschoot commented 1 year ago

Good point! An SVM can handle imbalanced data by assigning different weights to the classes or by using other cost-sensitive learning techniques. In scikit-learn's SVM implementation this is controlled by the "class_weight" parameter: setting it to "balanced" makes the algorithm weight each class inversely proportional to its frequency in the training data. But do we know whether EPPI-Reviewer actually uses this setting?
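
The "balanced" weighting mentioned above can be sketched in plain Python. This is only a minimal illustration, assuming the formula given in the scikit-learn documentation, n_samples / (n_classes * count of that class). The helper name `balanced_class_weights` and the label counts are made up for the example, and nothing here says how EPPI-Reviewer is actually configured.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: n_samples / (n_classes * class_count)} for the given labels.

    Hypothetical helper that mirrors the formula documented for
    class_weight="balanced" in scikit-learn; it does not call scikit-learn.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Made-up screening labels: 1 = relevant record, 0 = irrelevant record.
# The 5% prevalence is an assumption chosen only to show the effect.
labels = [1] * 5 + [0] * 95

weights = balanced_class_weights(labels)
print(weights)  # the rare class (1) receives the larger weight
```

A dictionary like this could be passed as class_weight to sklearn.svm.SVC, which accepts either a dict or the string "balanced". Whether EPPI-Reviewer sets this parameter at all is exactly the open question.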