baidu / Dialogue


Test result from running the pre-trained model on Douban data is far lower than the published result #28

Closed Sjzwind closed 5 years ago

Sjzwind commented 5 years ago

Hi, this is the result in output/douban/DAM/result (same as the published paper): 0.550061141847 0.600620002387 0.427067669173 0.25380833035 0.410119345984 0.756952500298

Below is the result I got using the pre-trained model output/douban/DAM/DAM.ckpt: 0.54090909090909 0.1348484848486 0.25 0.566666666667

I wonder why there is such a large difference between the two results, as they should be similar.

Thanks.

xyzhou-puck commented 5 years ago

Hi,

Sorry for the late reply; today has been crazy busy.

What do you mean by pre-trained model?

Xiangyang

xyzhou-puck commented 5 years ago

Hi,

Sorry, I was just a little bit tired today. You may need to change the code to fit the Douban corpus; you can see the details in our code.

Xiangyang

Sjzwind commented 5 years ago

Thanks for your reply. I just want to reproduce the published result using the published pre-trained model and the main.py script, but the result I got (similar to the result in #15 by @zysNLP) is far lower than the published result. Below is the config from my main.py script:

```python
conf = {
    "data_path": "./data/douban/data.pkl",
    "save_path": "./output/douban/DAM_test/",
    "word_emb_init": "./data/douban/word_embedding.pkl",
    "init_model": "./output/douban/DAM/DAM.ckpt",  # should be set for test

    "rand_seed": None,

    "drop_dense": None,
    "drop_attention": None,

    "is_mask": True,
    "is_layer_norm": True,
    "is_positional": False,

    "stack_num": 5,
    "attention_type": "dot",

    "learning_rate": 1e-3,
    "vocab_size": 172130,  # 172130 for douban data, 434512 for ubuntu data
    "emb_size": 200,
    "batch_size": 200,  # 200 for test, 256 for train

    "max_turn_num": 9,
    "max_turn_len": 50,

    "max_to_keep": 1,
    "num_scan_data": 2,
    "_EOS_": 1,  # 1 for douban data, 28270 for ubuntu data
    "final_n_class": 1,
}
```

I don't know the reason, as I have read the code and didn't change it. Maybe the config influences the result significantly. Could you share the config for the best published model?

Thanks.
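For reference, a conf like this is consumed by the repo's two entry points, as zysNLP's full script below also shows; a minimal test-only driver under that assumption:

```python
import models.net as net
import bin.test_and_evaluate as test

# build the graph from the conf above; "init_model" must point at the
# checkpoint to score (here ./output/douban/DAM/DAM.ckpt)
model = net.Net(conf)
test.test(conf, model)
```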

xyzhou-puck commented 5 years ago

Hi,

The configuration is in the code; you may need to search for the keyword "douban" to find which parts should be changed.

Xiangyang
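To save the search, the Douban-specific values that surface in the configs quoted in this thread can be summarized as follows (`douban_overrides` is a hypothetical name for illustration; Ubuntu values are in the comments):

```python
# Douban-specific settings collected from the configs quoted in this thread:
douban_overrides = {
    "data_path": "./data/douban/data.pkl",
    "word_emb_init": "./data/douban/word_embedding.pkl",
    "vocab_size": 172130,  # 434512 for ubuntu data
    "_EOS_": 1,            # 28270 for ubuntu data
}
# plus the scorer: import utils.douban_evaluation as eva (instead of utils.evaluation)
```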

Sjzwind commented 5 years ago

Thanks sincerely for your reply!

I used the wrong evaluation script, evaluation instead of douban_evaluation, and that is where the mistake was.
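For context on why the numbers differed so much: the repo ships two scorers, utils/evaluation.py (Ubuntu-style, four recall values, matching the four numbers in the first post) and utils/douban_evaluation.py (six values, matching the published line, since Douban contexts can have several correct responses). Below is a minimal sketch of a Douban-style MAP/MRR computation; the `score \t label` line format, the group size of 10, and the helper name are assumptions, not the repo's exact code:

```python
# Sketch of a Douban-style scorer (assumptions: each line of the score file is
# "<score>\t<label>", and candidates come in fixed-size groups of 10 per context).
def evaluate_douban(score_file, candidates_per_context=10):
    """Return (MAP, MRR) over groups of scored response candidates."""
    with open(score_file) as f:
        pairs = [line.split() for line in f if line.strip()]
    sessions = [pairs[i:i + candidates_per_context]
                for i in range(0, len(pairs), candidates_per_context)]

    ap_sum, rr_sum, n = 0.0, 0.0, 0
    for session in sessions:
        ranked = sorted(session, key=lambda p: float(p[0]), reverse=True)
        labels = [int(float(p[1])) for p in ranked]
        if sum(labels) == 0:
            # Douban sessions may contain no correct candidate; skip them
            continue
        n += 1
        # average precision: Douban contexts can have several positives
        hits, precisions = 0, []
        for rank, label in enumerate(labels, start=1):
            if label == 1:
                hits += 1
                precisions.append(hits / float(rank))
        ap_sum += sum(precisions) / len(precisions)
        # reciprocal rank of the first positive
        rr_sum += 1.0 / (labels.index(1) + 1)
    return ap_sum / n, rr_sum / n
```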

zysNLP commented 5 years ago

Hi! My latest result in the file "result.test" is: 0.508324761497 0.553556510323 0.362406015038 0.205111588495 0.370409356725 0.734894975534. Here is my configuration:

```python
import cPickle as pickle
import tensorflow as tf
import numpy as np

import utils.reader as reader
import models.net as net

import utils.evaluation as eva
# for douban
import utils.douban_evaluation as eva

import bin.train_and_evaluate as train
import bin.test_and_evaluate as test

# configure
conf = {
    "data_path": "./data/douban/data.pkl",
    "save_path": "./output/douban/DAM_train/",
    "word_emb_init": "./data/douban/word_embedding.pkl",

    # "init_model": None,  # should be set for test
    "init_model": "./output/douban/DAM_train/model.ckpt",  # should be None for train

    "rand_seed": None,

    "drop_dense": None,
    "drop_attention": None,

    "is_mask": True,
    "is_layer_norm": True,
    "is_positional": False,

    "stack_num": 5,
    "attention_type": "dot",

    "learning_rate": 1e-3,
    "vocab_size": 172130,  # 434512 for ubuntu data
    "emb_size": 200,
    "batch_size": 32,  # 32 for train
    # "batch_size": 200,  # 200 for test

    "max_turn_num": 9,
    "max_turn_len": 50,

    "max_to_keep": 1,
    "num_scan_data": 2,
    "_EOS_": 1,  # 28270 for ubuntu data
    "final_n_class": 1,
}

model = net.Net(conf)

train.train(conf, model)

# test and evaluation; init_model in conf should be set
test.test(conf, model)
```

Here, "DAM_train/model.ckpt" was trained by myself. I suggest you train the model again to get a new checkpoint and then use it for testing. Besides, when testing I use `import utils.douban_evaluation as eva`; also pay attention to the issue I raised the other day ("There is something wrong"), where you should modify test_and_evaluate.py accordingly. Hope this is helpful for you!
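In other words (a sketch, not the repo's exact diff), the gist of the test_and_evaluate.py change is to swap the scorer import so the Douban metrics get computed:

```python
# in bin/test_and_evaluate.py (sketch; the exact line may differ in the repo):
# import utils.evaluation as eva        # Ubuntu-style scorer, prints four recalls
import utils.douban_evaluation as eva   # Douban scorer: MAP, MRR, P@1 and recalls
```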

Sjzwind commented 5 years ago

Thanks, your reply helps a lot! I had wrongly used evaluation instead of douban_evaluation, and I think that is where the mistake was.

It seems that your newly trained results are about 5 percent lower. Did you change the code or make a PyTorch version? And did you train the word embeddings yourself?

Thanks again.

zysNLP commented 5 years ago

No, I didn't change the code or make a PyTorch version. If you want to make one, add my QQ: 648634000 and we can communicate further.

Sjzwind commented 5 years ago

okay, I'll add you, my friend, haha.