smt-HS / I-SEE

MIT License

best model #2

Open shunjiu opened 2 years ago

shunjiu commented 2 years ago

Hello, how do I evaluate the model? I used the test command from GDPL but got a low success rate.

The command and its output are below.

python main.py --test True --load model_rl/best > result.txt
DEBUG:root:Namespace(anneal=5000, batchsz=32, batchsz_traj=1024, clip=0.02, config='multiwoz', data_dir='data', ensemble_size=5, epoch=32, epsilon=0.2, gamma=0.99, load='model_rl/best', load_user='model/best', log_dir='log', lr_irl=0.001, lr_rl=0.0003, lr_simu=0.001, model_horizon=5, pretrain=False, pretrain_world=False, print_per_batch=400, process=16, save_dir='model', save_per_epoch=1, sim_ratio=1.0, simulator='agenda', tau=0.95, test=True, update_round=5)
INFO:root:Load processed data file
DEBUG:root:test
INFO:root:Loading goal model is done
INFO:root:Loading goal model is done
INFO:root:<> loaded checkpoint from file: model_rl/best_estimator.mdl
INFO:root:<<world model 0>> loaded checkpoint from file: model_rl/best_wm_0.pol.mdl
INFO:root:<<world model 0>> loaded checkpoint from file: model_rl/best_wm_0.ter.mdl
INFO:root:<<world model 1>> loaded checkpoint from file: model_rl/best_wm_1.pol.mdl
INFO:root:<<world model 1>> loaded checkpoint from file: model_rl/best_wm_1.ter.mdl
INFO:root:<<world model 2>> loaded checkpoint from file: model_rl/best_wm_2.pol.mdl
INFO:root:<<world model 2>> loaded checkpoint from file: model_rl/best_wm_2.ter.mdl
INFO:root:<<world model 3>> loaded checkpoint from file: model_rl/best_wm_3.pol.mdl
INFO:root:<<world model 3>> loaded checkpoint from file: model_rl/best_wm_3.ter.mdl
INFO:root:<<world model 4>> loaded checkpoint from file: model_rl/best_wm_4.pol.mdl
INFO:root:<<world model 4>> loaded checkpoint from file: model_rl/best_wm_4.ter.mdl
INFO:root:<> loaded checkpoint from file: model_rl/best_ppo.val.mdl
INFO:root:<> loaded checkpoint from file: model_rl/best_ppo.pol.mdl
INFO:root:reward 6.87433197752332
INFO:root:turn 4.173
INFO:root:match 0.19026246719160103
INFO:root:inform rec 0.7077494995710609, F1 0.23414218816517668
INFO:root:success 0.182

smt-HS commented 2 years ago

Hi,

We didn't use the best model selected by GDPL because we don't think that is the right way to select the best-performing model for our method. If you check your model_rl directory, you should see several models, each saved after one epoch of training. We evaluate all of them and report the best result.
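For anyone who wants to automate this, here is a rough sketch of how the per-epoch checkpoints could be evaluated one by one. It is not part of the repository. It assumes every checkpoint in model_rl has a policy file ending in `_ppo.pol.mdl` (as `model_rl/best_ppo.pol.mdl` does in the log above), that the prefix before that suffix is what `--load` expects, and that the test run logs a line of the form `INFO:root:success <value>`. The function names are made up for illustration.

```python
import re
import subprocess
import sys
from pathlib import Path

# Assumption: each checkpoint is a group of files sharing one prefix, and its
# policy file ends in "_ppo.pol.mdl" (as in model_rl/best_ppo.pol.mdl).
POLICY_SUFFIX = "_ppo.pol.mdl"
# Assumption: the test log contains "INFO:root:success <float>".
SUCCESS_RE = re.compile(r"INFO:root:success\s+(\d+(?:\.\d+)?)")


def checkpoint_prefixes(model_dir):
    """Return the --load prefixes found in model_dir, e.g. 'model_rl/best'."""
    paths = sorted(Path(model_dir).glob("*" + POLICY_SUFFIX))
    return [str(p)[: -len(POLICY_SUFFIX)] for p in paths]


def parse_success(log_text):
    """Extract the success rate from a test log, or None if it is absent."""
    match = SUCCESS_RE.search(log_text)
    return float(match.group(1)) if match else None


def run_test(prefix):
    """Run the test command from this issue for one checkpoint prefix."""
    proc = subprocess.run(
        [sys.executable, "main.py", "--test", "True", "--load", prefix],
        capture_output=True,
        text=True,
        check=False,
    )
    # The log may end up on stdout or stderr, so return both.
    return proc.stdout + proc.stderr


def evaluate_all(model_dir, runner=run_test):
    """Score every checkpoint; return (best_prefix, {prefix: success})."""
    scores = {}
    for prefix in checkpoint_prefixes(model_dir):
        success = parse_success(runner(prefix))
        if success is not None:
            scores[prefix] = success
    best = max(scores, key=scores.get) if scores else None
    return best, scores
```

Calling `evaluate_all("model_rl")` from the repository root would run the test once per checkpoint and return the prefix with the highest success rate together with all scores. The `runner` argument is only there so the selection logic can be tried without launching `main.py`.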

zzcccci commented 4 months ago

I have the same question: my evaluation also gives a low success rate.