snuspl / cruise

Cruise: A Distributed Machine Learning Framework with Automatic System Configuration
Apache License 2.0

[CAY-1233] Offline model evaluation #1234

Closed by wynot12 7 years ago

wynot12 commented 7 years ago

Closes #1233

This PR implements the offline evaluation of a trained model.

At the end of every epoch, the master checkpoints the model table. After training finishes, all checkpoints are restored and evaluated.

Setting `-offline_model_eval` to true turns on this feature.
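
To illustrate the flow described above (checkpoint at the end of each epoch, then restore and evaluate every checkpoint once training is done), here is a minimal Python sketch. It is not Cruise's actual code or API: the function names, the dict-based "model table", and the in-memory deep copies standing in for real checkpoints are all assumptions made for illustration. Only the `offline_model_eval` switch mirrors the flag named in this PR.

```python
import copy


def train_with_offline_eval(model, epochs, train_epoch, evaluate,
                            offline_model_eval=True):
    """Train for `epochs` epochs; optionally evaluate every epoch's
    checkpoint after training has finished (hypothetical helper)."""
    checkpoints = []
    for epoch in range(epochs):
        train_epoch(model, epoch)
        if offline_model_eval:
            # Assumption: a deep copy stands in for a real model-table checkpoint.
            checkpoints.append(copy.deepcopy(model))

    # Offline evaluation happens only after training is complete.
    results = []
    for epoch, snapshot in enumerate(checkpoints):
        restored = copy.deepcopy(snapshot)  # "restore" the checkpoint
        results.append((epoch, evaluate(restored)))
    return results


# Toy stand-ins for training and evaluation (hypothetical).
def toy_train_epoch(model, epoch):
    model["w"] += 1.0  # pretend one epoch moves the weight by 1


def toy_evaluate(model):
    return abs(3.0 - model["w"])  # distance from a target weight of 3


if __name__ == "__main__":
    print(train_with_offline_eval({"w": 0.0}, 3, toy_train_epoch, toy_evaluate))
```

Because evaluation runs on restored snapshots rather than on the live model, it stays off the training path, which is presumably the point of doing it offline.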

wynot12 commented 7 years ago

The test will pass with cmssnu/elastic-tables#216.

wynot12 commented 7 years ago

!rebuild

wynot12 commented 7 years ago

!rebuild

wynot12 commented 7 years ago

It's the core part of our NSDI work. I'll test it in our optiplex cluster now.

wynot12 commented 7 years ago

I've confirmed that it works well in our optiplex cluster. Setting `-offline_model_eval` to true turns on this feature.

@yunseong would you please take a look? Thanks!

yunseong commented 7 years ago

Sure, I will take a look.

wynot12 commented 7 years ago

It looks like this PR includes the fix for the LDA model inconsistency. I'll send a commit to exclude it.

wynot12 commented 7 years ago

I've addressed your comments and made a few additional minor updates. Please take a look :) Thanks.