kenchan0226 / keyphrase-generation-rl

Code for the ACL 19 paper "Neural Keyphrase Generation via Reinforcement Learning with Adaptive Rewards"
https://arxiv.org/abs/1906.04106
MIT License

ndcg_array = dcg_array / dcg_max_array #10

Open SaidaSaad opened 4 years ago

SaidaSaad commented 4 years ago

Hello, I would like to know: after we run the command for computing the evaluation scores on the prediction files, is there any explanation for this? I got these errors:

```
RuntimeWarning: invalid value encountered in true_divide
  ndcg_array = dcg_array / dcg_max_array
RuntimeWarning: invalid value encountered in true_divide
  alpha_ndcg_array = alpha_dcg_array / alpha_dcg_max_array
henlo henlo henlo
```

Thanks

kenchan0226 commented 4 years ago

It sounds like some of the values in dcg_max_array are zero or NaN. Can you please wrap the ndcg_array = dcg_array / dcg_max_array line in a try/except block and print out the values of dcg_max_array? If some of the values are zero, you could do the division element-wise and set the result to zero wherever dcg_max_array[i] == 0.
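For example, something like this (a minimal sketch, not code from the repo; safe_divide is a hypothetical helper):

```python
import numpy as np

def safe_divide(num, denom):
    """Element-wise division that returns 0 where the denominator is 0 or NaN."""
    denom = np.nan_to_num(denom)           # treat NaN denominators as 0
    out = np.zeros_like(num, dtype=float)  # default result is 0
    mask = denom != 0
    out[mask] = num[mask] / denom[mask]
    return out

# usage, following the variable names in this thread:
# ndcg_array = safe_divide(dcg_array, dcg_max_array)
```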

SaidaSaad commented 4 years ago

Yes, I printed dcg_max_array and alpha_dcg_max_array and some of the values are zeros. So I edited the line alpha_ndcg_array = alpha_dcg_array / alpha_dcg_max_array to be:

```python
alpha_ndcg_array = np.zeros(shape=(2,))
for i in range(2):
    # guard against zero or NaN denominators
    if alpha_dcg_max_array[i] == 0 or np.isnan(alpha_dcg_max_array[i]):
        alpha_ndcg_array[i] = 0
    else:
        alpha_ndcg_array[i] = alpha_dcg_array[i] / alpha_dcg_max_array[i]
alpha_ndcg_array = np.nan_to_num(alpha_ndcg_array)
```

I also did the same for ndcg_max_array. I got no error, but it just prints

henlo henlo henlo

and nothing else. Did I do something wrong?

Thanks

SaidaSaad commented 4 years ago

This is what I got: [screenshot]

kenchan0226 commented 4 years ago

Hi, I double-checked the code. The script works even if some of the values are zeros; it will just print a warning, not an error. The np.nan_to_num(ndcg_array) call in the original code already handles the division by zero. I think you can simply use the original evaluate_prediction.py from this GitHub repo and try it. It should not print "henlo", since I cannot find any "henlo" in the entire project.
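To see why the warning is harmless, here is a small numpy illustration (a sketch, not code from the repo):

```python
import numpy as np

dcg_array = np.array([1.0, 0.0])
dcg_max_array = np.array([2.0, 0.0])

# 0.0 / 0.0 emits "RuntimeWarning: invalid value encountered in true_divide",
# but execution continues; the offending entry just becomes NaN.
ndcg_array = dcg_array / dcg_max_array   # -> [0.5, nan]

# np.nan_to_num then replaces the NaN with 0, so the final score is well-defined.
ndcg_array = np.nan_to_num(ndcg_array)   # -> [0.5, 0.0]
print(ndcg_array)
```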

SaidaSaad commented 4 years ago

Yes, the original code gives me the first error I mentioned in the first question, as you can see here: [screenshot]

SaidaSaad commented 4 years ago

I fixed that with the change I described in my last comment. Please let me know if I did something wrong.

I would also like to know where I can find the evaluation scores.

Thanks

kenchan0226 commented 4 years ago

Yes, I know, but my point is that the division by zero is just a warning, not an error: it will not cause the script to terminate, and the NaN values will be corrected by np.nan_to_num(ndcg_array). So you don't need to fix the division by zero yourself. Can you use the original code and use the -exp_path option to specify a path? Run the script, and it will create a results_log_*.txt file in the directory specified by -exp_path. I am sorry that I did not make this clear in the readme. If you simply use the default arguments from our readme, it will create a results_log_*.txt file in the exp/kp20k.[timestamp] folder. The value of the -exp_path argument printed on your screen shows you the path that stores the results file.
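For example (a hypothetical invocation; the bracketed paths are placeholders, and the flags are the ones used elsewhere in this thread):

```
python3 evaluate_prediction.py -pred_file_path [path_to_predictions.txt] -trg_file_path [path_to_targets.txt] -src_file_path [path_to_sources.txt] -exp kp20k -exp_path exp/my_eval
```

This writes a results_log_*.txt file under exp/my_eval/.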

SaidaSaad commented 4 years ago

Thank you very much. Yes, you are right, those were just warnings. Now I got the results file. One last question :)

I am computing the scores for the word_inspec test set predictions.

I used this command to compute the evaluation:

```
-pred_file_path pred/predict.kp20k.one2many.cat.copy.bi-directional.20191212-151234/predictions.txt -trg_file_path data/cross_domain_sorted/word_inspec_testing_allkeywords.txt -src_file_path data/cross_domain_sorted/word_inspec_testing_context.txt -exp kp20k -export_filtered_pred -disable_extra_one_word_filter -invalidate_unk -all_ks 5 M -present_ks 5 M -absent_ks 5 M
```

This produced the results file results_log_5_M_5_M_5_M.txt.

Can you please explain it? I see it is divided into all, present, absent, and MAE stats. Can you explain, for example, the present section: where exactly are the final F1@5, F1@M, and alpha-nDCG@5 for present keyphrases? And what do Micro and Macro mean?

Thanks I appreciated your help

kenchan0226 commented 4 years ago

Hi, the results under ====all==== are the F1 scores for predictions including both present and absent keyphrases. The results under ====present==== are the F1 scores for present predicted keyphrases only. The results under ====absent==== are the F1 scores for absent predicted keyphrases only. The results under ====MAE==== measure the ability of the model to predict the correct number of keyphrases.

SaidaSaad commented 4 years ago

Yes, I understood that, but what I did not understand is that, for example, for present keyphrases I found two F1@5 values: F1@5=0.16388 (Micro) and F1@5=0.18448 (Macro). What is the difference, and which one did you use for evaluation?

```
==================================present====================================
#predictions after filtering: 2375  #predictions after filtering per src: 4.750
#unique targets: 3602  #unique targets per src: 7.204
Begin===============classification metrics present@5===============Begin
#target: 2500, #predictions: 3602, #corrects: 500
Micro: P@5=0.2 R@5=0.13881 F1@5=0.16388
Macro: P@5=0.2 R@5=0.1712 F1@5=0.18448
Begin===============classification metrics present@M===============Begin
#target: 2375, #predictions: 3602, #corrects: 511
Micro: P@M=0.21516 R@M=0.14187 F1@M=0.17099
Macro: P@M=0.27203 R@M=0.17369 F1@M=0.21201
Begin==================Ranking metrics present@5==================Begin
MAP@5=0.13014 NDCG@5=0.49478 AlphaNDCG@5=0.7968
Begin==================Ranking metrics present@M==================Begin
MAP@M=0.13103 NDCG@M=0.49963 AlphaNDCG@M=0.82678
```

kenchan0226 commented 4 years ago

Hi, we report the macro F1 scores in our paper, since they are the ones used in the previous keyphrase generation literature. Macro F1 and micro F1 are two different ways to aggregate the F1 scores of the individual test samples into a single score. You can check the following two URLs for details.

http://rushdishams.blogspot.com/2011/08/micro-and-macro-average-of-precision.html
https://datascience.stackexchange.com/questions/15989/micro-average-vs-macro-average-performance-in-a-multiclass-classification-settin
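As a quick illustration of the difference (a toy sketch, not code from this repo):

```python
# Each tuple is (num_correct, num_predicted, num_targets) for one document.
docs = [(2, 5, 4), (0, 5, 10)]

def f1(p, r):
    return 2 * p * r / (p + r) if p + r > 0 else 0.0

# Micro: pool the counts over all documents, then compute a single P/R/F1.
correct = sum(c for c, _, _ in docs)
predicted = sum(p for _, p, _ in docs)
targets = sum(t for _, _, t in docs)
micro_f1 = f1(correct / predicted, correct / targets)  # ~0.167

# Macro: compute F1 per document, then average the per-document scores.
macro_f1 = sum(f1(c / p, c / t) for c, p, t in docs) / len(docs)  # ~0.222
```

A document with zero correct predictions drags the macro average down by a full per-document score, while in the micro average it only adds to the pooled denominators, which is why the two numbers can differ noticeably.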

SaidaSaad commented 4 years ago

Hello Kenchan

I have another question. I would like to ask about running the prediction using this command:

catSeq on the Inspec dataset:

```
python3 interactive_predict.py -vocab data/kp20k_sorted/ -src_file data/cross_domain_sorted/word_inspec_testing_context.txt -pred_path pred/%s.%s -copy_attention -one2many -one2many_mode 1 -model [path_to_model] -max_length 60 -remove_title_eos -n_best 1 -max_eos_per_output_seq 1 -beam_size 1 -batch_size 20 -replace_unk
```

Why do we have to pass the dataset itself (-vocab data/kp20k_sorted/) in that command? I think it should be enough to pass the model and the test dataset only, so could you please explain why it is necessary?

kenchan0226 commented 4 years ago

Our source code saves the word2idx and idx2word dictionaries separately in vocab.pt; they are not inside our saved model, so we still need to load them.
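Conceptually, the predictor does something like this (a sketch only; the exact layout of vocab.pt is an assumption here, and is defined by the repo's preprocessing script):

```python
import torch

# Assumption for illustration: vocab.pt stores the mappings as a list/tuple.
saved = torch.load("data/kp20k_sorted/vocab.pt")
word2idx, idx2word = saved[0], saved[1]

print(len(word2idx), "words in the vocabulary")
```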

SaidaSaad commented 4 years ago

Thanks for your reply. I would like to know whether your source code saves word2idx and idx2word during model training. Another question: if I have a model that was trained on only part of the dataset, can I still use the same command to get the predictions? Thank you :)

kenchan0226 commented 4 years ago

word2idx and idx2word are saved by the preprocessing scripts. I think you can still use the same command to get the predictions.

Struggle-lsl commented 1 year ago

predict.txt [attached] — can you run it?