xmshi-trio / MSL


NeuralBERTClassifier for Medical Slot Filling

Introduction

NeuralBERTClassifier is designed for quick implementation of neural models for a multi-label classification problem: Medical Slot Filling (MSF). A salient feature is that NeuralBERTClassifier currently provides a variety of text encoders, such as FastText, TextCNN, TextRNN, RCNN, VDCNN, DPCNN, DRNN, AttentiveConvNet, the Transformer encoder, and BERT. It also supports other text classification scenarios, including binary-class and multi-class classification. It is built on PyTorch. The corresponding paper, Understanding Medical Conversations with Scattered Keyword Attention and Weak Supervision from Responses, was accepted at AAAI 2020.

Notice

According to Tencent's regulations, the dataset can only be used for research purposes.

Supported tasks

Binary-class text classification

Multi-class text classification

Multi-label text classification (Medical Slot Filling)

Supported text encoders

FastText, TextCNN, TextRNN, RCNN, VDCNN, DPCNN, DRNN, AttentiveConvNet, Transformer encoder, BERT

Requirements

PyTorch

Usage

Training

python train.py conf/train.json

For detailed configurations and explanations, see Configuration.

The training info will be output to standard output and to log.logger_file.
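The exact schema of conf/train.json is defined in the Configuration document, but a minimal hypothetical fragment covering the two settings referenced in this README (log.logger_file and eval.dir; the values below are placeholders, not the repository's defaults) might look like:

```json
{
    "log": {
        "logger_file": "log_train"
    },
    "eval": {
        "dir": "eval_dir"
    }
}
```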

Evaluation

python eval.py conf/train.json

The evaluation info will be output to eval.dir.

Input Data Format

JSON example:

{
    "doc_label": ["Computer--MachineLearning--DeepLearning", "Neuro--ComputationalNeuro"],
    "doc_token": ["I", "love", "deep", "learning"],
    "doc_keyword": ["deep learning"],
    "doc_topic": ["AI", "Machine learning"]
}

"doc_keyword" and "doc_topic" are optional.

Update