Open shenassa opened 2 years ago
Hello, what testing command did you use, and do you see any ROUGE scores in your log file?
My testing command:
python -W ignore z_train.py \
-task abs \
-mode test \
-batch_size 1 \
-test_batch_size 1 \
-bert_data_path $BERT_DATA_PATH \
-log_file logs/test.logs \
-test_from $MODEL_PATH \
-sep_optim true \
-use_interval true \
-visible_gpus 0 \
-max_pos 512 \
-max_length 18 \
-alpha 0.95 \
-min_length 3 \
-result_path $RESULT_PATH
I get a ROUGE score of 0, and the following error appears at the end of the test run:
File "z_train.py", line 138, in <module>
test_abs(args, device_id, cp, step)
File "/home/jovyan/shenasa/guided_summarization/bert/z_train_abstractive.py", line 234, in test_abs
predictor.translate(test_iter, step)
File "/home/jovyan/shenasa/guided_summarization/bert/models/predictor.py", line 193, in translate
rouges = self._report_rouge(gold_path, can_path)
File "/home/jovyan/shenasa/guided_summarization/bert/models/predictor.py", line 202, in _report_rouge
results_dict = test_rouge(self.args.temp_dir, can_path, gold_path)
File "/home/jovyan/shenasa/guided_summarization/bert/others/utils.py", line 84, in test_rouge
rouge_results = r.convert_and_evaluate()
File "/home/jovyan/shenasa/guided_summarization/bert/others/pyrouge.py", line 398, in convert_and_evaluate
rouge_output = self.evaluate(system_id, rouge_args)
File "/home/jovyan/shenasa/guided_summarization/bert/others/pyrouge.py", line 368, in evaluate
self.write_config(system_id=system_id)
File "/home/jovyan/shenasa/guided_summarization/bert/others/pyrouge.py", line 349, in write_config
Rouge155.write_config_static(
This presumably happens because the result file is empty.
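One way to confirm the empty-file suspicion before ROUGE runs is to count the non-empty lines in the candidate and gold files. The following is only a minimal sketch: `count_nonempty_lines` is a hypothetical helper that is not part of the repository, and the paths you pass to it are whatever your run produced as `can_path` and `gold_path`.

```python
import tempfile
from pathlib import Path


def count_nonempty_lines(path):
    """Return the number of lines containing non-whitespace text.

    A missing file is treated like an empty one and yields 0.
    Hypothetical helper for debugging, not part of the repository.
    """
    p = Path(path)
    if not p.exists():
        return 0
    with p.open(encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())


if __name__ == "__main__":
    # Demo on throwaway files; substitute your own candidate/gold paths.
    with tempfile.TemporaryDirectory() as tmp:
        candidate = Path(tmp) / "demo.candidate"
        candidate.write_text("", encoding="utf-8")  # simulates an empty result
        gold = Path(tmp) / "demo.gold"
        gold.write_text("a reference summary\n", encoding="utf-8")
        print("candidate lines:", count_nonempty_lines(candidate))
        print("gold lines:", count_nonempty_lines(gold))
```

If the candidate count is 0 while the gold count is not, the problem is in decoding rather than in the ROUGE setup.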
Could you set batch_size and test_batch_size to 3000 and 1500, as in the provided script, and see if it works? Note that 'batch_size' here is not the number of instances (see here for details).
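To illustrate how a batch size can be something other than an instance count, here is a minimal sketch of budget-based batching. It assumes the budget bounds the padded size of a batch (number of examples times the longest example), which is an assumption made for illustration and not taken from the repository's data loader; `batch_by_budget` is a hypothetical name.

```python
def batch_by_budget(examples, budget):
    """Yield batches whose padded size stays within `budget`.

    Padded size = number of examples * length of the longest example.
    A batch always holds at least one example, even if that single
    example already exceeds the budget.
    """
    batch, longest = [], 0
    for ex in examples:
        new_longest = max(longest, len(ex))
        if batch and (len(batch) + 1) * new_longest > budget:
            # Adding this example would overflow the budget: flush first.
            yield batch
            batch, new_longest = [], len(ex)
        batch.append(ex)
        longest = new_longest
    if batch:
        yield batch


if __name__ == "__main__":
    # Five toy "documents" of 3, 5, 2, 8 and 4 tokens.
    docs = [[0] * n for n in (3, 5, 2, 8, 4)]
    print([len(b) for b in batch_by_budget(docs, budget=10)])
    print([len(b) for b in batch_by_budget(docs, budget=3000)])
```

Under this scheme the number of instances per batch varies with document length, so a value such as 1 does not mean "one document per batch".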
Also, you may need larger min_length and max_length values, as in the provided script, because length here is measured in subwords rather than words. PreSumm sets them to 20 and 100 for XSum, and to 50 and 200 for CNNDM.
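The gap between word count and subword count can be seen with a toy greedy longest-match splitter in the style of WordPiece. This is a sketch under stated assumptions: the vocabulary below is invented for the example, and the real BERT tokenizer uses a far larger vocabulary plus extra normalization steps.

```python
def wordpiece(word, vocab):
    """Split one word into subwords by greedy longest-match-first lookup.

    Non-initial pieces carry the '##' continuation prefix.
    Returns ['[UNK]'] if some part of the word cannot be matched.
    """
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return ["[UNK]"]
        pieces.append(piece)
        start = end
    return pieces


if __name__ == "__main__":
    # Invented toy vocabulary, for illustration only.
    vocab = {"guid", "##ed", "summar", "##ization", "is", "hard"}
    words = "guided summarization is hard".split()
    subwords = [p for w in words for p in wordpiece(w, vocab)]
    print(len(words), "words ->", len(subwords), "subwords:", subwords)
```

Because one word can expand into several subwords, a max_length of 18 subwords may cover noticeably fewer than 18 words.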
First of all, thank you for the code you provided. I downloaded the original test and train data from the PreSumm repo and then added the corresponding guidance signal using ‘highligted_sentence_data.py’ from the bert folder. Training goes well, but at test time I get empty results. What is wrong here?