baowaly / SynthEHR

Quantitative Evaluation #1

Open nicenoize opened 4 years ago

nicenoize commented 4 years ago

Hello, for an investigation of medGAN, I'm trying to reproduce your evaluation process, especially the K-S test, association rule mining, and logistic regression.

  1. For the K-S test, I iterate over the columns (disease codes) of the real and the synthetic dataset. For each column I get two values: statistic and pvalue. How do I combine those to arrive at the similarity?

  2. I also want to check how well the model learned interdimensional relationships. I think both approaches should be usable for this, with your approach being more detailed than logistic regression. Is there any publicly available code for this? I'd be thankful for any help!
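
For the logistic regression route in question 2, one common way to probe interdimensional relationships is dimension-wise prediction: each code is predicted from all the other codes, once with a classifier trained on real data and once with a classifier trained on synthetic data, and both are scored on held-out real data. The sketch below is only an illustration under assumptions, not the repository's code: it assumes binary (0/1) patient-by-code matrices and scikit-learn, and the function name `dimension_wise_f1` is hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score


def dimension_wise_f1(train, test, max_iter=1000):
    """For each column k, fit a classifier on `train` that predicts column k
    from all other columns, and return its F1 score on `test`.

    Hypothetical helper; assumes binary 0/1 matrices of shape (patients, codes).
    """
    train = np.asarray(train)
    test = np.asarray(test)
    scores = np.full(train.shape[1], np.nan)
    for k in range(train.shape[1]):
        y_train = train[:, k]
        # A code that is constant in the training data cannot be fit; leave NaN.
        if np.unique(y_train).size < 2:
            continue
        x_train = np.delete(train, k, axis=1)
        x_test = np.delete(test, k, axis=1)
        clf = LogisticRegression(max_iter=max_iter)
        clf.fit(x_train, y_train)
        scores[k] = f1_score(test[:, k], clf.predict(x_test), zero_division=0)
    return scores
```

Calling `dimension_wise_f1(real_train, real_test)` and `dimension_wise_f1(synthetic, real_test)` gives two score vectors. The closer they are per code (for example on a scatter plot), the better the synthetic data appears to preserve the relationships between codes.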

vampypandya commented 4 years ago

@nicenoize I am also looking into the same part. Could you please explain why we are comparing the columns of the real and synthetic data? The K-S test measures the level of similarity, or the degree to which we can determine whether the distributions originated from the same parent distribution. I am not sure the column comparison would be an appropriate measure.

Let me know if I am going wrong.

Harry-KIT commented 3 years ago

Hi brothers, do you perhaps know how the data preparation is done, "vampypandya", "nicenoize"?

matak07 commented 2 years ago

Could anyone please share the K-S test code? I am unable to reproduce the results reported in the paper.
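
Until the original code is shared, here is a minimal sketch of the column-wise comparison described in the first comment, using `scipy.stats.ks_2samp`. This thread does not say how the paper turns the per-column `statistic` and `pvalue` into a single similarity figure, so the two summaries below (the fraction of columns where the test does not reject at level `alpha`, and the mean statistic) are assumptions, as is the helper name `ks_by_column`.

```python
import numpy as np
from scipy.stats import ks_2samp


def ks_by_column(real, synth, alpha=0.05):
    """Run a two-sample K-S test on each column of two (patients x codes) arrays.

    Hypothetical helper: the two summary values are one possible way to
    aggregate the results, not necessarily what the paper reports.
    """
    real = np.asarray(real, dtype=float)
    synth = np.asarray(synth, dtype=float)
    if real.shape[1] != synth.shape[1]:
        raise ValueError("real and synth must have the same number of columns")
    stats, pvals = [], []
    for j in range(real.shape[1]):
        result = ks_2samp(real[:, j], synth[:, j])
        stats.append(result.statistic)
        pvals.append(result.pvalue)
    stats = np.array(stats)
    pvals = np.array(pvals)
    return {
        "statistic": stats,  # per-column maximum gap between the empirical CDFs
        "pvalue": pvals,     # per-column p-value
        # Assumed summary 1: share of columns where "same distribution" is not rejected.
        "not_rejected_fraction": float(np.mean(pvals > alpha)),
        # Assumed summary 2: average gap across columns (smaller means more similar).
        "mean_statistic": float(stats.mean()),
    }


if __name__ == "__main__":
    # Random stand-in data; replace with the real and synthetic matrices.
    rng = np.random.default_rng(0)
    real = rng.integers(0, 2, size=(500, 5))
    synth = rng.integers(0, 2, size=(500, 5))
    print(ks_by_column(real, synth))
```

For binary 0/1 columns, the K-S statistic reduces to the absolute difference in the code's prevalence between the two datasets, which may help when interpreting the per-column values.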