The Multi-Filter Residual Convolutional Neural Network (MultiResCNN) builds on TextCNN, residual networks, and CAML. It can serve as a strong baseline model for text classification. This repo can be used to reproduce the results in the paper:
@inproceedings{li2020multirescnn,
title={ICD Coding from Clinical Text Using Multi-Filter Residual Convolutional Neural Network},
author={Li, Fei and Yu, Hong},
booktitle={Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence},
year={2020}
}
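At a high level, the model applies several 1-D convolution filter sizes in parallel to the word embeddings (as in TextCNN), gives each branch a residual connection (as in ResNet), concatenates the branch outputs, and scores ICD codes with CAML-style per-label attention. Below is a minimal PyTorch sketch of the multi-filter residual idea only, with hypothetical class names and the attention head omitted; it is an illustration, not the repo's actual code.

# Sketch of the multi-filter residual idea (hypothetical names, not the repo's code).
import torch
import torch.nn as nn

class ResidualConvBlock(nn.Module):
    """One 1-D convolution with a skip connection; sequence length is preserved."""
    def __init__(self, channels, kernel_size):
        super().__init__()
        self.conv = nn.Conv1d(channels, channels, kernel_size,
                              padding=kernel_size // 2)
        self.norm = nn.BatchNorm1d(channels)

    def forward(self, x):  # x: (batch, channels, seq_len)
        return torch.relu(self.norm(self.conv(x)) + x)

class MultiFilterEncoder(nn.Module):
    """Parallel branches, one per filter size, concatenated channel-wise."""
    def __init__(self, embed_dim, filter_sizes=(3, 5, 9)):
        super().__init__()
        self.branches = nn.ModuleList(
            ResidualConvBlock(embed_dim, k) for k in filter_sizes)

    def forward(self, x):  # x: (batch, embed_dim, seq_len)
        # Per-label attention over this output (as in CAML) would follow.
        return torch.cat([branch(x) for branch in self.branches], dim=1)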
The packages this repo requires are listed in requirements.txt.
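They can be installed with pip, for example:
pip install -r requirements.txt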
Our data preparation follows CAML with slight modifications. Put the MIMIC-III and MIMIC-II files into the 'data' directory as shown below:
data/
├── D_ICD_DIAGNOSES.csv
├── D_ICD_PROCEDURES.csv
├── mimic2/
│   ├── MIMIC_RAW_DSUMS
│   ├── MIMIC_ICD9_mapping
│   ├── training_indices.data
│   └── testing_indices.data
└── mimic3/
    ├── NOTEEVENTS.csv
    ├── DIAGNOSES_ICD.csv
    ├── PROCEDURES_ICD.csv
    └── *_hadm_ids.csv (get from CAML)
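Before running the preprocessing scripts, a quick check that the expected inputs are in place can save a failed run. This is a hypothetical helper, not part of the repo:

# Hypothetical sanity check: verify the expected input files exist under ./data.
import os

expected = [
    'data/D_ICD_DIAGNOSES.csv',
    'data/D_ICD_PROCEDURES.csv',
    'data/mimic3/NOTEEVENTS.csv',
    'data/mimic3/DIAGNOSES_ICD.csv',
    'data/mimic3/PROCEDURES_ICD.csv',
]
for path in expected:
    print(path, 'OK' if os.path.exists(path) else 'MISSING')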
Then run:
python preprocess_mimic3.py
python preprocess_mimic2.py
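Preprocessing generates the files referenced by the training commands below (e.g. train_full.csv, train_50.csv, vocab.csv, and processed_full.embed under data/mimic3, and similarly for mimic2). A quick look at one of them, purely as an illustrative check since the exact columns depend on the preprocessing scripts:

# Illustrative check only: inspect a generated file's shape and columns.
import pandas as pd

df = pd.read_csv('data/mimic3/train_full.csv')
print(df.shape)
print(df.columns.tolist())
print(df.head(2))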
Train and test using full MIMIC-III data
python main.py -data_path ./data/mimic3/train_full.csv -vocab ./data/mimic3/vocab.csv -Y full -model MultiResCNN -embed_file ./data/mimic3/processed_full.embed -criterion prec_at_8 -gpu 0 -tune_wordemb
Train and test using top-50 MIMIC-III data
python main.py -data_path ./data/mimic3/train_50.csv -vocab ./data/mimic3/vocab.csv -Y 50 -model MultiResCNN -embed_file ./data/mimic3/processed_full.embed -criterion prec_at_5 -gpu 0 -tune_wordemb
Train and test using full MIMIC-II data
python main.py -data_path ./data/mimic2/train.csv -vocab ./data/mimic2/vocab.csv -Y full -version mimic2 -model MultiResCNN -embed_file ./data/mimic2/processed_full.embed -criterion prec_at_8 -gpu 0 -tune_wordemb
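The -criterion flag selects the validation metric used to pick the best model; prec_at_8 and prec_at_5 denote precision at k, i.e. the fraction of the k highest-scoring codes per document that are true codes. Here is a minimal NumPy sketch of the metric, as an illustration rather than the repo's implementation:

# Precision@k: of the k highest-scoring labels per document, the fraction
# that are true labels, averaged over documents.
import numpy as np

def precision_at_k(scores, targets, k):
    """scores, targets: (num_docs, num_labels) arrays; targets is 0/1."""
    topk = np.argsort(scores, axis=1)[:, -k:]          # indices of the k largest scores
    hits = np.take_along_axis(targets, topk, axis=1)   # 1 where a top-k label is true
    return hits.sum(axis=1).mean() / k

# Example: 2 documents, 4 labels, k = 2
scores  = np.array([[0.9, 0.1, 0.8, 0.2], [0.3, 0.7, 0.6, 0.1]])
targets = np.array([[1,   0,   1,   0  ], [0,   1,   0,   0  ]])
print(precision_at_k(scores, targets, k=2))  # 0.75 = (2/2 + 1/2) / 2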
If you want to use ELMo, add -use_elmo to the above commands.
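For example:
python main.py -data_path ./data/mimic3/train_50.csv -vocab ./data/mimic3/vocab.csv -Y 50 -model MultiResCNN -embed_file ./data/mimic3/processed_full.embed -criterion prec_at_5 -gpu 0 -tune_wordemb -use_elmo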
Train and test using top-50 MIMIC-III data and BERT
python main.py -data_path ./data/mimic3/train_50.csv -vocab ./data/mimic3/vocab.csv -Y 50 -model bert_seq_cls -criterion prec_at_5 -gpu 0 -MAX_LENGTH 512 -bert_dir <your bert dir>
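-MAX_LENGTH 512 matches the 512-token input limit of BERT-style models, so longer clinical notes must be truncated (or otherwise split). A rough illustration of that limit using the HuggingFace transformers tokenizer; this assumes a recent transformers version and is not necessarily what main.py does internally:

# Illustration only: BERT caps input at 512 tokens, so longer notes are truncated.
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')  # stand-in for <your bert dir>
text = 'patient admitted with chest pain ...'  # a clinical note
encoded = tokenizer(text, truncation=True, max_length=512)
print(len(encoded['input_ids']))  # at most 512, including [CLS] and [SEP]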
We thank all the people who provided their code to help us complete this project.