Streamline grounding code

clulab / eidos

Machine reading system for World Modelers

Apache License 2.0

36 stars 24 forks source link

Streamline grounding code #1096

Closed kwalcock closed 2 years ago

kwalcock commented 2 years ago

These are the proposed changes to the grounding code. I'm still going to go over it one more time myself.

zupon commented 2 years ago

This looks gorgeous! Thanks for all the streamlining!

kwalcock commented 2 years ago

If these tests still pass (and after I fix a typo), then I'm probably done with it. No functionality is supposed to have changed. The grounding with predicates was divided again into functions for exact and inexact matching, just for length. That might mess up some documentation :-( Some names have been changed. I hope it didn't get out of hand.

kwalcock commented 2 years ago

That will teach me not to fix typos. I'm looking into this.

[info] - should process "The brewing conflict had already had a serious impact in disrupting farming which had led to higher prices ." effect themeProperty correctly *** FAILED ***
[info]   "[]" was not equal to "[wm/property/price_or_cost]" (TestGrounding.scala:40)
[info]   Analysis:
[info]   "[]" -> "[wm/property/price_or_cost]"

kwalcock commented 2 years ago

There seems to be a problem. I had changed the order of

      val predicateGroundings = inexactPredicateGroundings :+ exactPredicateGrounding

      val predicateGroundings = exactPredicateGrounding +: inexactPredicateGroundings

before the line

      val sortedPredicateGroundings = predicateGroundings.sortBy(-_.score)

figuring that the exact match will more often have a higher score, so it is better to put it at the front of the list and not have to sort it all the way there.

In this case they are all tied at zero, though. The exact one is empty, with a predicateTuple of (None, None None, None). That's an obvious score of zero. The other two candidates are (Some, None, None, None), but they also have scores of zero. I'm looking into it.

kwalcock commented 2 years ago

I should have written (None, Some, None, None). It is the themeProperties that are filled in. If that is a valid grounding, and it is something that can be produced as a result, then I think that the themeProperties and themeProcessProperties need to be taken into account in PredicateTuple.score. Right now only the theme and themeProcess are used. Should the properties by discounted by half (or something), or are they just as significant?

kwalcock commented 2 years ago

The change to PredicateTuple.score was made the tests pass locally, so I will push it. However, it should be looked at critically. Perhaps some of the tests can be rerun to see if it has made any difference elsewhere.

zupon commented 2 years ago

Thanks for looking into this in depth! This sounds like a legacy of when we had to have a theme (so it was required in calculating the PredicateTuple.score), but now that a theme is optional, I think it's okay to change PredicateTuple.score to account for the various properties.

zupon commented 2 years ago

Now that everything is passing again, is this okay to merge into the comp_grounding_unit_tests branch?

kwalcock commented 2 years ago

Sounds good, especially if I don't merge my own code.