nlpyang / PreSumm

code for EMNLP 2019 paper Text Summarization with Pretrained Encoders
MIT License

I got less records from test mode #206

Open cheop-byeon opened 3 years ago

cheop-byeon commented 3 years ago

I am using test mode to get results on my own data, which I preprocessed following the described steps. I then ran the command below:

python train.py -task ext -mode test -batch_size 3000 -test_batch_size 500 -bert_data_path ../bert_data/my_data -log_file ../logs/test_ext_bert_my-data -sep_optim true -use_interval true -max_pos 512 -max_length 200 -min_length 50 -result_path ../results/ext_bert_my_data -test_from ../models/bertext_cnndm_transformer.pt

For 15798 input records, I get only 15730 result records, roughly 99% of my original data. Can you tell me why this happens and how to get exactly the same number of records?

[2021-02-17 23:25:39,791 INFO] Loading test dataset from ../bert_data/my_data.test.0.bert.pt, number of examples: 2001
[2021-02-17 23:39:16,370 INFO] Loading test dataset from ../bert_data/my_data.test.1.bert.pt, number of examples: 2001
[2021-02-18 00:05:30,122 INFO] Loading test dataset from ../bert_data/my_data.test.3.bert.pt, number of examples: 2001
[2021-02-17 23:52:27,877 INFO] Loading test dataset from ../bert_data/my_data.test.2.bert.pt, number of examples: 2001
[2021-02-18 00:18:53,545 INFO] Loading test dataset from ../bert_data/my_data.test.4.bert.pt, number of examples: 2001
[2021-02-18 00:32:11,453 INFO] Loading test dataset from ../bert_data/my_data.test.5.bert.pt, number of examples: 2001
[2021-02-18 00:45:35,855 INFO] Loading test dataset from ../bert_data/my_data.test.6.bert.pt, number of examples: 2001
[2021-02-18 00:58:21,746 INFO] Loading test dataset from ../bert_data/my_data.test.7.bert.pt, number of examples: 1723
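
One quick check is to add up the per-shard counts in the log above and compare the total with the expected 15798 records. The sketch below does only that; the helper name `count_loaded_examples` is made up for illustration, and the log text is copied from this post. If the shard total already equals the number of results, the missing records were probably dropped before test mode ran, for example while the `.bert.pt` shards were being built. That is an inference from the numbers, not something confirmed in this thread.

```python
import re

# Log lines copied from the post above (one "Loading test dataset" line per shard).
LOG = """\
[2021-02-17 23:25:39,791 INFO] Loading test dataset from ../bert_data/my_data.test.0.bert.pt, number of examples: 2001
[2021-02-17 23:39:16,370 INFO] Loading test dataset from ../bert_data/my_data.test.1.bert.pt, number of examples: 2001
[2021-02-17 23:52:27,877 INFO] Loading test dataset from ../bert_data/my_data.test.2.bert.pt, number of examples: 2001
[2021-02-18 00:05:30,122 INFO] Loading test dataset from ../bert_data/my_data.test.3.bert.pt, number of examples: 2001
[2021-02-18 00:18:53,545 INFO] Loading test dataset from ../bert_data/my_data.test.4.bert.pt, number of examples: 2001
[2021-02-18 00:32:11,453 INFO] Loading test dataset from ../bert_data/my_data.test.5.bert.pt, number of examples: 2001
[2021-02-18 00:45:35,855 INFO] Loading test dataset from ../bert_data/my_data.test.6.bert.pt, number of examples: 2001
[2021-02-18 00:58:21,746 INFO] Loading test dataset from ../bert_data/my_data.test.7.bert.pt, number of examples: 1723
"""

EXPECTED_RECORDS = 15798  # number of input records reported in the post


def count_loaded_examples(log_text):
    """Return the per-shard example counts found in the log text (hypothetical helper)."""
    return [int(n) for n in re.findall(r"number of examples: (\d+)", log_text)]


counts = count_loaded_examples(LOG)
total = sum(counts)
missing = EXPECTED_RECORDS - total

print("shards:", len(counts))
print("examples loaded in test mode:", total)
print("missing relative to the raw data:", missing)
```

Here the shards sum to 15730, the same as the number of results, so test mode itself does not appear to skip any loaded example; the 68 missing records are already absent from the shard files.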