matt-gardner / pra

122 stars 42 forks source link

Bug in ExperimentRunner #5

Closed lkq1992yeah closed 9 years ago

lkq1992yeah commented 9 years ago

Hi Matt,

I found a bug when I'm running ExperimentRunner with the experiment specification under _examples/experiment_specs/nell/final_emnlp2015/sfe_bfs_praanyrel.json. Below is a snapshot of the output score file of relation riverflowsthroughcity:

concept:river:mfolozi_river concept:river:mfolozi_river -3.203326821678852

concept:river:yangtze concept:river:yangtze 3.1706911223570553 concept:river:yangtze concept:river:yangtze 3.1706911223570553

concept:river:rio_grande concept:river:rio_grande 0.9799970081086755 concept:river:rio_grande concept:river:rio_grande 0.9799970081086755 concept:river:rio_grande concept:river:rio_grande 0.9799970081086755

There has 1823 rows, which is the same as the size of testing file. However, in each row, the source always equals to target, which I think might be a bug. Then I searched the code, and find the function lineToInstance in Dataset.scala, which seems to be used for reading training data and testing data. And I found this:

val source = graph.getNodeIndex(fields(0))
val target = graph.getNodeIndex(fields(0))      <----- I think it should be fields(1)

Is it a bug, or maybe I downloaded some old version of the code? Thanks!

matt-gardner commented 9 years ago

That is definitely a bug, and it got me worried about the results in my paper. But, I checked my results, and they do not have this problem. Looking at git blame, that bug was introduced on July 17, after all of the experiments for the recent EMNLP paper were run, and I've been using the lineToInstanceAndGraph method (which doesn't have this bug) on other experiments since that time, so I didn't notice the bug. Thanks for finding it.

Feel free to change that line in your version of the code; I'm going to write a test to catch this and upload a fix to the repository, probably tomorrow.

lkq1992yeah commented 9 years ago

That's OK, and don't worry about your results, since I've successfully reproduced your experiment. It shows that MRR=0.9333 (identical to the paper's), and MAP=0.5253, which is slightly lower than 0.528 written in paper, but totally acceptable :)

matt-gardner commented 9 years ago

Well, it's encouraging that you were actually able to reproduce the result. Did you end up having to change any paths, or are there any other instructions that I should put on the paper website?

Thanks again for your help with finding a couple of bugs. It's nice to have someone else who knows what they're doing try to run the code.

lkq1992yeah commented 9 years ago

In fact I didn't change any paths either in the codes or in the specifications. I just ran sfe_bfs_pra_anyrel.json, and I think the other experiments in final_emnlp2015/ would work as well.

In terms of instructions, maybe you can write sbt "run SOME_BASE_DIR sfe_bfs_pra_anyrel", which can help users who just want to quickly reproduce the state-of-the-art result. And also you can add some instructions about how to change build.sbt for jvm configurations.

matt-gardner commented 9 years ago

Thanks!