allenai / ruletaker

Apache License 2.0

Changes to label generation output format, metrics reported and more #14

Closed (sbhaktha closed this 3 years ago)

sbhaktha commented 3 years ago

@bhavanad, @aimichal: here are the main changes:

  1. In theory_label_generator, the output JSON object did not have the right structure. The input JSON object has a collection of question objects associated with each example (theory), and each question is supposed to have its own label, but the output JSON object placed the generated label outside the question collection. This is fixed.
  2. Exception handling is simplified so that no label (None) is returned if an exception is raised.
  3. Metrics reporting is simplified/modified to report more pertinent information.
  4. In theory_label_generator, added a flag and a corresponding command-line argument to control whether examples with mismatched labels (gold label in the input versus the label generated by the theorem prover) are included in the output jsonl file. The flag defaults to including them. When generating the new format of the original RuleTaker dataset, we set this flag to False, so examples whose problog labels do not match the original labels, and examples that caused failures (problog exceptions), are excluded from the new dataset.
  5. The TheoryAssertionRepresentationWithLabel structure has a new placeholder field for the exception, if one is thrown while running the theorem prover on an example.
  6. Fixes to README.
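
To make points 2 and 4 concrete, here is a minimal sketch of the behavior described, not the code in this PR. Every name in it is hypothetical: `safe_label`, `label_examples`, the `--exclude-mismatched-labels` argument, and the `theory`, `questions`, `question`, `gold_label` and `generated_label` fields are placeholders, and `prove` stands in for the actual problog call. It also assumes filtering happens per question, which the description above does not spell out.

```python
import argparse
from typing import Any, Callable, Dict, Iterable, Iterator, Optional

# Hypothetical prover signature: (theory, question) -> bool. Stands in for problog.
Prover = Callable[[str, str], bool]


def safe_label(prove: Prover, theory: str, question: str) -> Optional[bool]:
    """Return the prover's label, or None if the prover raises (point 2)."""
    try:
        return prove(theory, question)
    except Exception:
        return None


def label_examples(
    examples: Iterable[Dict[str, Any]],
    prove: Prover,
    include_mismatched: bool = True,
) -> Iterator[Dict[str, Any]]:
    """Attach a generated label to each question; optionally drop mismatches (point 4)."""
    for example in examples:
        kept = []
        for q in example["questions"]:
            label = safe_label(prove, example["theory"], q["question"])
            # A failed run (None) never counts as a match.
            matches = label is not None and label == q["gold_label"]
            if include_mismatched or matches:
                kept.append({**q, "generated_label": label})
        if kept:
            yield {**example, "questions": kept}


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI: mismatches are included unless the flag is passed."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--exclude-mismatched-labels",
        dest="include_mismatched",
        action="store_false",
    )
    return parser
```

With `include_mismatched=False`, a question whose prover run raised an exception is dropped along with the mismatches, which mirrors how the regenerated dataset is described above.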

sbhaktha commented 3 years ago

@aimichal: I made some changes so that the output JSON structure from theory_label_generator.py matches our new Example format, which is the same format that theory_generator.py generates. theory_label_generator.py accepts two input formats, current and legacy. Earlier, the output structure changed depending on the input structure; now it is standardized. This also lets us generate a version of our existing RuleTaker datasets in the new format, so that the dataset format shipped with this code release matches the latest format documented in this codebase. So, when you review this PR, please disregard points 1 and 5 above.
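
As a rough illustration of what standardizing the output means here, the sketch below maps two different input shapes onto one output shape. This thread does not show the actual field layout of the current format, the legacy format, or the Example format, so both input shapes and the output keys are invented for illustration, and `to_standard_example` is a hypothetical helper.

```python
from typing import Any, Dict


def to_standard_example(record: Dict[str, Any]) -> Dict[str, Any]:
    """Map either hypothetical input shape onto a single output shape."""
    raw_questions = record.get("questions", [])
    if isinstance(raw_questions, dict):
        # Hypothetical "legacy" shape: questions keyed by question id.
        questions = [{"id": qid, **q} for qid, q in raw_questions.items()]
    else:
        # Hypothetical "current" shape: questions are already a list.
        questions = [dict(q) for q in raw_questions]
    return {
        "id": record.get("id"),
        "theory": record.get("theory"),
        "questions": questions,
    }
```

The point of the design is that downstream code only ever sees one shape, regardless of which input format a file used.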

sbhaktha commented 3 years ago

Thanks for the review, Michal! I made some changes per your suggestions.