Open wenwxt opened 3 weeks ago
Without access to the exact data used for training, I can only offer some high-level suggestions. First, when you refer to the “performance”, do you mean the accuracy of the clinician's action prediction (not reported in the paper; the training procedure is described in App. C3), or the baseline shown in the plots (gray dashed line)? In the latter case, please refer to Sec. 5.1 (baselines): we report the observed reward from the dataset, not the reward from the approximate clinician's policy; the learned policy is used only for WIS evaluation. Next, please make sure that the data has been pre-processed as described in the README file. Lastly, it might be useful to look at the paper “An Empirical Study of Representation Learning for Reinforcement Learning in Healthcare”, since the implementation of the clinician's policy was inspired by it.
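To make the distinction between the two quantities concrete, here is a minimal sketch, not the repository's actual code. It assumes the learned clinician policy supplies the behavior probabilities for WIS, and it uses the plain per-trajectory form of weighted importance sampling; the paper's implementation may differ (for example per-step weights or weight clipping). The function names and the trajectory fields `rewards`, `pi_e` and `pi_b` are hypothetical.

```python
import numpy as np


def discounted_return(rewards, gamma=1.0):
    """Discounted sum of rewards for a single trajectory."""
    rewards = np.asarray(rewards, dtype=float)
    discounts = gamma ** np.arange(len(rewards))
    return float(np.sum(discounts * rewards))


def observed_return_baseline(trajectories, gamma=1.0):
    """Mean observed return in the dataset; no learned policy is involved."""
    returns = [discounted_return(t["rewards"], gamma) for t in trajectories]
    return float(np.mean(returns))


def wis_estimate(trajectories, gamma=1.0, eps=1e-8):
    """Per-trajectory weighted importance sampling (WIS) estimate.

    Each trajectory is a dict (hypothetical layout) with:
      rewards: reward at each step
      pi_e:    probability the evaluated policy assigns to the logged action
      pi_b:    probability the estimated clinician (behavior) policy
               assigns to the logged action
    """
    weights, returns = [], []
    for t in trajectories:
        pi_e = np.asarray(t["pi_e"], dtype=float)
        # Clip behavior probabilities to avoid division by zero.
        pi_b = np.clip(np.asarray(t["pi_b"], dtype=float), eps, None)
        weights.append(np.prod(pi_e / pi_b))
        returns.append(discounted_return(t["rewards"], gamma))
    weights = np.asarray(weights)
    returns = np.asarray(returns)
    total = weights.sum()
    if total == 0.0:
        return float("nan")  # no trajectory has support under pi_e
    return float(np.sum(weights * returns) / total)


# Toy data, made up purely for illustration.
toy = [
    {"rewards": [0.0, 1.0], "pi_e": [0.5, 0.5], "pi_b": [0.5, 0.5]},
    {"rewards": [0.0, -1.0], "pi_e": [0.1, 0.5], "pi_b": [0.5, 0.5]},
]
baseline = observed_return_baseline(toy)
wis = wis_estimate(toy)
```

In this sketch the baseline depends only on the logged rewards, so a poorly fitted clinician model would not change it; the clinician model's quality only enters through `pi_b` in the WIS estimate.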
I’m trying to train the clinician model using the provided code, but the model seems to struggle to learn: its performance is significantly below the reported results, and it does not appear to improve during training.