clulab / eidos

Machine reading system for World Modelers
Apache License 2.0
36 stars 24 forks source link

Add grounding slot filter #1072

Closed zupon closed 3 years ago

zupon commented 3 years ago

Adds a filter to the groundings when crafting a PredicateTuple to make sure theme groundings are in the concept branch, process groundings are in process branch, and themeProperty and processProperty groundings are in the property branch of the compositional ontology.

zupon commented 3 years ago

@kwalcock This could probably done more neatly and systematically, but it seems to work. All my current unit tests checking for grounding slots going to the correct branch now pass!

kwalcock commented 3 years ago

It looks to me like the filtering could be moved to the constructor of PredicateTuple and then happen systematically, as you mention. At least any time something is not filtered, it seems to be empty and extra filtering won't hurt. The slot you are filtering for might already be known as the branch (or maybe better, branchOpt), and if so, one could check that rather directly rather than resorting to contains(). It would be faster and then we also don't have to avoid having a slot name further down the string.

Let me know if you want that comp_grounding_unit_tests merged so that you don't need a PR on a PR. It seemed like this is all a work in progress that might be merged once or twice close to the end, but if that's a problem, just say.

kwalcock commented 3 years ago

P.S. Some of those strings seem to be recorded in SRLCompositionalGrounder.PROCESS, etc.

MihaiSurdeanu commented 3 years ago

Yes, I think @kwalcock is right: I think this filter is better inside the c'tor for PredicateTuple. More modular, less repeated code.

zupon commented 3 years ago

I'll work on it!

kwalcock commented 3 years ago

So I was wrong about the branch being available. Until that gets updated somehow, your version with the / on each side was better. Those strings are used elsewhere in the ontology and might be picked up by mistake. One day I will go back over the types there and clean them up and hopefully fix that.

As you noticed, it's difficult to change the constructor of a case class. I think that the standard way to do it is to leave the constructor unchanged and instead add an apply method to the companion object:

object PredicateTuple {
  def apply(
    theme: OntologyGrounding,
    themeProperties: OntologyGrounding,
    themeProcess: OntologyGrounding,
    themeProcessProperties: OntologyGrounding,
    predicates: Set[Int]
  ): PredicateTuple = new PredicateTuple(
    theme.filterSlots(SRLCompositionalGrounder.CONCEPT),
    themeProperties.filterSlots(SRLCompositionalGrounder.PROPERTY),
    themeProcess.filterSlots(SRLCompositionalGrounder.PROCESS),
    themeProcessProperties.filterSlots(SRLCompositionalGrounder.PROPERTY),
    predicates
  )
}

This will avoid the vars. The compiler seems to be OK with that and the constructor gets called with the modified values. One just needs to be sure not to accidentally use new PredicateTuple() rather than simply PredicateTuple(), because the filtering would be skipped.

kwalcock commented 3 years ago

Drat. I had only checked that with Scala 2.12, but it looks like 2.11 doesn't like it. I'm working on it.

kwalcock commented 3 years ago

There's a PR with a patch. If it works, I'll merge it into this one. Thanks for your patience.

kwalcock commented 3 years ago

A patch was merged. Not all the tests were passing before, so I'm not immediately sure whether something went wrong or not. Someone else will certainly notice, though.

kwalcock commented 3 years ago

I did notice that when the filters moved to the constructor, that not all the other code returned to its previous state. The top PredicateTuple in createPredicateTupleFromPath changed. There was a fixme there, so it was probably intentional, but I mention it just in case.

zupon commented 3 years ago

Thanks for the patch! I don't think that fixme is a problem now. Are the checks failing now because some failing tests are marked as passing still? There should still be failing tests (maybe actually set to failingTest, but on my end at least all the ones that are "ground to proper branch" appear to be passing. If that's the issue, we can merge this into the comp_grounding_unit_tests branch, I think.

kwalcock commented 3 years ago

Tests are failing from these files, in case you want to look into them without running the whole slow. Merging between non-master branches is fine, even without my commentary. You are welcome to that button.

[error] Failed tests:
[error]     org.clulab.wm.eidos.text.englishGrounding.TestSRLGrounder
[error]     org.clulab.wm.eidos.text.englishGrounding.TestGrounding
[error]     org.clulab.wm.eidos.serialization.jsonld.TestJLDDeserializer
zupon commented 3 years ago

TestSRLGrounder is failing because it wants to ground a property to the process slot, which my filter now prohibits. Similarly, TestJLDDeserializer fails because it appears the output has a concept grounding in the process or process properties slot, which is again prohibited. I will still merge this into the other testing branch, and I can adjust these failing tests there before we merge it all into master down the road.

kwalcock commented 3 years ago

Sounds good!