mp2893 / med2vec

Repository for Med2Vec project
BSD 3-Clause "New" or "Revised" License

high training cost #18

Open hsohn3 opened 5 years ago

hsohn3 commented 5 years ago

Hi Edward, while searching for a new research idea, I found your model. It is interesting in that it can learn code-level and visit-level representations from EHRs simultaneously.

Using your model, I'm trying to learn embeddings that can represent measurements other than medical codes. However, the training cost seems quite high (around 150 to 250), and it doesn't converge (or drop below 1) the way other models do. I've found that others report costs in the same range, but I wonder what the final cost was at the end of your training.

Is it normal for this model to have such a high cost at the end of training, or is something wrong with my settings? I've adjusted the model's parameters, but a cost of 170 was the best I could get.

I would appreciate your help.

mp2893 commented 5 years ago

Hi,

Thanks for taking an interest in med2vec. To be honest, I don't remember what the loss values looked like, since it's been almost 3 years. But when you say that you want to learn embeddings for measurements rather than medical codes, I'm not sure how you are training med2vec. Med2vec has a skip-gram loss, which is basically co-occurrence counting, and I'm not sure how you would do that with continuous measurements. Could you provide more info?

Best, Ed
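
For readers following the thread, here is a minimal sketch of what a skip-gram loss over discrete codes within a visit can look like, and why it amounts to co-occurrence counting. It is an illustration under stated assumptions, not med2vec's actual code: the names `visit_skipgram_loss`, `cooccurrence_counts`, `emb`, and `out` are hypothetical, and a full softmax over a tiny vocabulary stands in for whatever the repository really uses.

```python
import math
from itertools import permutations


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def visit_skipgram_loss(visit, emb, out):
    """Summed negative log-likelihood over ordered code pairs in one visit.

    visit: list of distinct integer code ids that occur in the same visit
    emb:   hypothetical input embeddings, one vector per code
    out:   hypothetical output embeddings, one vector per code
    """
    loss = 0.0
    for i, j in permutations(visit, 2):
        # Score every code in the vocabulary against the input code i.
        scores = [sum(a * b for a, b in zip(emb[i], o)) for o in out]
        loss -= math.log(softmax(scores)[j])
    return loss


def cooccurrence_counts(visits):
    """Count how often each ordered code pair shares a visit."""
    counts = {}
    for visit in visits:
        for pair in permutations(visit, 2):
            counts[pair] = counts.get(pair, 0) + 1
    return counts


# Toy usage: 5 codes, 2-dimensional embeddings initialised to zero.
vocab_size, dim = 5, 2
emb = [[0.0] * dim for _ in range(vocab_size)]
out = [[0.0] * dim for _ in range(vocab_size)]
loss = visit_skipgram_loss([0, 1, 2], emb, out)
counts = cooccurrence_counts([[0, 1, 2], [1, 2]])
```

The loss only ever looks at which discrete ids appear together, which is why it is unclear how it would apply to continuous measurements without first mapping them to discrete tokens. A summed loss of this kind also grows with the number of pairs, so its raw magnitude is not directly comparable to a per-example average; whether that accounts for the values reported above is not something this sketch can establish.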

hsohn3 commented 5 years ago

Thank you for your comment. I've sent you an email (mp2893@gatech.edu) with more details.