This is the official implementation of the paper "Large Language Model Distilling Medication Recommendation Model".
You can implement our model according to the following steps:
Put the LLaMA-7B weights under resources/llama-7b/. Put the raw MIMIC-III and MIMIC-IV data under data/mimic3/raw/ and data/mimic4/, respectively. Then, run the construction.ipynb notebooks under data/mimic3/ and data/mimic4/ to preprocess the data. The preprocessed data will be saved under mimic3/handled/ and mimic4/handled/.
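Before running the notebooks, it can help to confirm that the directories named above exist. The following is a minimal sketch and not part of the repository: the helper name `missing_dirs` is hypothetical, and the directory list simply mirrors the paths mentioned in this README, assuming they are relative to the repository root.

```python
from pathlib import Path

# Directories mentioned in this README (assumed relative to the repository root).
EXPECTED_DIRS = [
    "resources/llama-7b",
    "data/mimic3/raw",
    "data/mimic4",
]


def missing_dirs(root=".", expected=EXPECTED_DIRS):
    """Return the expected directories that do not exist under `root`."""
    root = Path(root)
    return [d for d in expected if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs()
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("All expected directories are present.")
```

Running the script from the repository root lists whichever of these directories still need to be created or populated.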
Besides, the file for converting ATC codes to drug names, "WHO ATC-DDD 2021-12-03.csv", is available from this link. Other auxiliary files, such as "drug-DDI.csv", can be obtained from the GAMENet and SafeDrug repositories.
Install the necessary packages. Run the command:
pip install -r requirements.txt
First, train the large language model for medication recommendation via the command:
bash experiments/llm_cls.bash
Then, you can run the knowledge distillation via the following command:
bash experiments/mimic3/online_distill.bash
bash experiments/mimic4/online_distill.bash
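The two stages above can also be driven from Python, for example to run them in sequence for one dataset. This is a sketch under the assumption that the scripts are invoked from the repository root exactly as shown. `online_pipeline` and `run` are hypothetical helpers, not something the repository provides, and whether llm_cls.bash needs per-dataset configuration is not covered here.

```python
import subprocess

DATASETS = ("mimic3", "mimic4")


def online_pipeline(dataset):
    """Return the commands for LLM training followed by online distillation."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    return [
        ["bash", "experiments/llm_cls.bash"],
        ["bash", f"experiments/{dataset}/online_distill.bash"],
    ]


def run(commands, dry_run=True):
    """Print each command; also execute it when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing stage.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run(online_pipeline("mimic3"))  # dry run: only prints the commands
```

Passing `dry_run=False` executes the stages, and `check=True` prevents distillation from starting if LLM training fails.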
Because distillation takes a long time to run, the hidden states from the LLM can be saved beforehand. Run the test on the training file with llm_cls.bash, and the hidden states will be saved in the results automatically. Then, put the results file into mimic3/handled/ or mimic4/handled/, after which the KD can run within two hours:
bash experiments/mimic3/offline_distill.bash
bash experiments/mimic4/offline_distill.bash
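The idea behind the offline variant is that the teacher's hidden states are computed once and then read back during student training, so the LLM does not have to run inside the distillation loop. The sketch below illustrates that idea only. The JSON file format, the sample keys, and the mean-squared-error objective are assumptions made for illustration, not the repository's actual storage format or loss.

```python
import json
import os
import tempfile


def save_hidden_states(path, hidden_by_id):
    """Write teacher hidden states (sample id -> list of floats) to disk once."""
    with open(path, "w") as f:
        json.dump(hidden_by_id, f)


def load_hidden_states(path):
    """Read the cached teacher hidden states back for student training."""
    with open(path) as f:
        return json.load(f)


def mse(student_vec, teacher_vec):
    """Mean squared error between a student and a teacher representation."""
    if len(student_vec) != len(teacher_vec):
        raise ValueError("representations must have the same dimension")
    total = sum((s - t) ** 2 for s, t in zip(student_vec, teacher_vec))
    return total / len(teacher_vec)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "teacher_hidden.json")
        # Toy values standing in for real LLM hidden states.
        save_hidden_states(path, {"visit_0": [0.5, -1.0], "visit_1": [0.0, 2.0]})
        cache = load_hidden_states(path)
        # Student training would compare its own representation with the cache.
        print(mse([0.5, 0.0], cache["visit_0"]))  # (0 + 1) / 2 = 0.5
```

Because the cache is keyed by sample, the student can look up the matching teacher representation for each training example without loading the LLM.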
If the code and the paper are useful to you, we would appreciate a citation of our paper:
@article{liu2024large,
title={Large Language Model Distilling Medication Recommendation Model},
author={Liu, Qidong and Wu, Xian and Zhao, Xiangyu and Zhu, Yuanshao and Zhang, Zijian and Tian, Feng and Zheng, Yefeng},
journal={arXiv preprint arXiv:2402.02803},
year={2024}
}
The code draws on the MOELoRA-peft, GAMENet, and SafeDrug repositories.