Closed nicenoize closed 5 years ago
Hi Nicenoize,
Thanks for taking an interest in our work. First of all, if you uncommented line 66 and commented out line 67, you should be seeing diagnosis codes such as "D_401.9". Otherwise, you should be seeing codes such as "D_401".
Due to typos inherent to the raw data of MIMIC-III, there is a chance you might see both D_001 and D_01, or even D_1. This requires some extra data preprocessing.
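A minimal sketch of what that extra preprocessing could look like, assuming the short forms are simply codes that lost their leading zeros (so D_1 and D_01 both mean D_001). The function name `normalize_dx_code` is hypothetical and not part of the repository; V and E codes are left untouched.

```python
def normalize_dx_code(code: str, prefix: str = "D_", width: int = 3) -> str:
    """Zero-pad the numeric category of a diagnosis token, e.g. D_1 -> D_001.

    Assumption: a purely numeric category shorter than `width` digits is a
    typo that dropped leading zeros. Tokens without the prefix, and V/E
    codes, are returned unchanged.
    """
    if not code.startswith(prefix):
        return code
    body = code[len(prefix):]
    # Split off an optional decimal part such as the ".9" in "401.9".
    head, dot, tail = body.partition(".")
    if head.isdigit():
        head = head.zfill(width)
    return prefix + head + dot + tail
```

Applying this to every token before building the code-to-index mapping would merge D_1, D_01 and D_001 into a single column.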
I suggest you pick a model from an earlier epoch and compare it with the one from epoch 999. Theoretically, you want to choose the model from the epoch where the discriminator accuracy is closest to 0.5, since that is when the discriminator is most confused (i.e. the generator makes the most convincing synthetic samples).
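As a rough sketch of that selection rule, assuming you have already parsed the training log into a mapping from epoch to discriminator accuracy (the log format is not shown here, so the parsing step is left out and `pick_checkpoint` is a hypothetical helper):

```python
from typing import Mapping


def pick_checkpoint(acc_by_epoch: Mapping[int, float], target: float = 0.5) -> int:
    """Return the epoch whose discriminator accuracy is closest to `target`.

    Ties are broken in favour of the earlier epoch.
    """
    if not acc_by_epoch:
        raise ValueError("no epochs to choose from")
    return min(acc_by_epoch, key=lambda e: (abs(acc_by_epoch[e] - target), e))
```

The returned epoch number is then the checkpoint to load for generation instead of always using the last one.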
Best, Ed
Hello Ed, thank you for your quick response!
For the ICD9 codes, changing line 68 to dxStr = 'D_' + convert_to_icd9(tokens[4]) did the job for me.
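For anyone wondering what a conversion like this has to do: MIMIC-III stores ICD9 codes without the decimal point, so the full code needs the dot put back. Below is a sketch of that idea under the usual ICD9 layout (three characters before the dot, four for E codes). It is a hypothetical stand-in named `insert_icd9_dot`, not a copy of the repository's `convert_to_icd9`.

```python
def insert_icd9_dot(raw: str) -> str:
    """Re-insert the decimal point into a dot-less ICD9 diagnosis code.

    Assumption: E codes have a four-character category (e.g. E849),
    everything else a three-character one (e.g. 401, V10).
    """
    category_len = 4 if raw.startswith("E") else 3
    if len(raw) > category_len:
        return raw[:category_len] + "." + raw[category_len:]
    # Already a bare category; nothing to split off.
    return raw
```

With this, the raw value 4019 becomes 401.9, which gives the "D_401.9" style token once the 'D_' prefix is added.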
Choosing a model with accuracy closer to 0.5 led to significantly improved samples!
Thank you for your help!
Hello, after generating 10,000 patients I ran into two problems, and I hope somebody can help with them:
Also, the code D_1 appears, which should be the same as D_01 if I'm correct.
This data seems quite random. Do I need to alter the hyperparameters, or did I do something wrong with the training?
I am using the MIMIC-III dataset and followed the guide from the README. I used checkpoint -999 for generating the data.
Here's my training log: