Can't reproduce Pk on Choi's dataset

AleksTk commented 6 years ago

Hi,

I can't reproduce the reported Pk of 26.26 on Choi's dataset with the pre-trained model. When I run the default evaluation script:

python test_accuracy.py --cuda --model model_gpu.t7

I get a Pk of 0.3667 instead. What could be wrong here?

Running with threshold: 0.4
Loading word2vec ellapsed: 55.1196949482 seconds
running on Choi ...
2018-10-29 13:41:24,821 - INFO - Finished testing.
2018-10-29 13:41:24,821 - INFO - Average loss: 0.0
2018-10-29 13:41:24,821 - INFO - Average accuracy: 0.8987517337031901
2018-10-29 13:41:24,821 - INFO - Pk: 0.3667.
2018-10-29 13:41:24,821 - INFO - F1: 0.3511.
Seconds to execute to whole flow: 93.9438998699

koomri commented 6 years ago

You should run the test_accuracy_choi.py script, which performs cross-validation (CV) over the dataset. For that you will need a version of the Choi dataset that is split into 5 sections (one per CV fold). I can supply you with that.

Another, easier option is to run the test_accuracy.py script with the threshold adjusted for the Choi dataset: 0.1
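To illustrate why the threshold matters so much for Pk, here is a minimal sketch of how per-sentence boundary probabilities could be thresholded and scored. It is not the code from test_accuracy.py: the function names, the "probability above threshold means boundary" rule, and the toy numbers are all assumptions made for illustration. Pk is an error rate, so lower is better.

```python
def boundaries_from_probs(probs, threshold):
    """Turn per-gap boundary probabilities into 0/1 decisions.

    probs[i] is the (hypothetical) probability that a segment ends
    between sentence i and sentence i + 1.
    """
    return [1 if p > threshold else 0 for p in probs]


def pk(reference, hypothesis, k=None):
    """Pk error (Beeferman et al.) for two 0/1 gap sequences of equal length.

    A sequence of length m describes m + 1 sentences. Lower is better.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")
    n_sentences = len(reference) + 1
    if k is None:
        # Conventional choice: half the mean reference segment length.
        n_segments = sum(reference) + 1
        k = max(1, int(round(n_sentences / (2.0 * n_segments))))
    n_windows = n_sentences - k
    if n_windows <= 0:
        raise ValueError("document is too short for window size k")
    errors = 0
    for i in range(n_windows):
        # Sentences i and i + k share a segment iff no boundary lies between them.
        ref_same = sum(reference[i:i + k]) == 0
        hyp_same = sum(hypothesis[i:i + k]) == 0
        if ref_same != hyp_same:
            errors += 1
    return errors / n_windows


# Toy data (made up, not model output): 10 sentences, true boundaries
# after sentences 3 and 6, with low predicted probabilities at those gaps.
reference = [0, 0, 1, 0, 0, 1, 0, 0, 0]
probs = [0.02, 0.03, 0.35, 0.01, 0.04, 0.22, 0.05, 0.02, 0.03]

for threshold in (0.4, 0.1):
    hypothesis = boundaries_from_probs(probs, threshold)
    print(threshold, pk(reference, hypothesis))
# prints:
# 0.4 0.5
# 0.1 0.0
```

In this toy case the 0.4 threshold misses both boundaries, while 0.1 recovers them. That is the same kind of effect a dataset-specific threshold can have on the reported score.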


AleksTk commented 6 years ago

Another, easier option is to run the test_accuracy.py script with the threshold adjusted for the Choi dataset: 0.1

Thank you. It worked.

koomri / text-segmentation

Can't reproduce Pk on Choi's dataset #6