w3c / cogai

for work by the Cognitive AI community group

How to match lists of atomic values #30

Closed tidoust closed 3 years ago

tidoust commented 3 years ago

The matching chunk algorithm describes conditions that must hold true for a chunk to match a condition/action chunk.

Pending PR #29 adds context matching and rewords the algorithm to make it clearer and easier to understand. It also adds an editor's note that the algorithm needs to be updated to explain how to match lists of atomic values. There are two main possibilities for handling a list of atomic values:

  1. "All of": All atomic values in the condition/action need to exist in the chunk being considered. The match can be strict (the chunk being considered must not have any other atomic value) or loose (the chunk being considered may have atomic values that are not in the list to match).
  2. "One of": At least one atomic value in the condition/action needs to match in the chunk being considered.
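
To make the options concrete, here is a minimal sketch of the three matching modes. The function names are hypothetical and are not taken from chunks.js. The sketch also assumes that the order of items is ignored, which the options above do not specify.

```javascript
// Hypothetical helpers for illustration; not part of chunks.js.
// `wanted` is the list in the condition/action chunk,
// `actual` is the list in the chunk being considered.

// "All of", loose: every wanted value occurs in actual; extras are allowed.
function matchAllLoose(wanted, actual) {
  return wanted.every((value) => actual.includes(value));
}

// "All of", strict: as above, and actual holds no value outside wanted.
// Assumption: order is ignored in this sketch.
function matchAllStrict(wanted, actual) {
  return matchAllLoose(wanted, actual) &&
    actual.every((value) => wanted.includes(value));
}

// "One of": at least one wanted value occurs in actual.
function matchOneOf(wanted, actual) {
  return wanted.some((value) => actual.includes(value));
}

const actual = ["a", "b", "c"];
console.log(matchAllLoose(["a", "b"], actual));  // true
console.log(matchAllStrict(["a", "b"], actual)); // false
console.log(matchOneOf(["a", "z"], actual));     // true
```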

Which one should we apply?

draggett commented 3 years ago

The matching of a buffer chunk with the condition chunk is performed with graph.test_constraints in chunks.js. The code for dealing with arrays is at line 703. It clearly requires strict matching as per your "all of" option.

I used the sandbox to see what the current implementation does.

fact test {x a, b, c} with rule test {x ?x, ?y, ?z} => console {@do log; message ?x}

It failed to match, and closer inspection reveals that I hadn't yet implemented support for variables in lists. In the example, ?x should bind to a, ?y to b and ?z to c. I will fix this along with adding support for sub-symbolic processing.

That leaves open the question of whether the rule language needs to cater for more flexible list matching as a native feature. It already includes support for iterating over lists, and applications may implement their own graph algorithms as new actions.

One idea is to introduce a new built-in action, @do subset, that changes the matching algorithm for this condition chunk so that for each property in the condition chunk, all of the items in that property's value must be present in the list in the corresponding buffer value, but the buffer value may contain other items as well. A further possibility would be to introduce a new pseudo-property @subset that names the properties the adjusted matching algorithm applies to. That feels overly complicated to me. What do you think?
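
As a rough illustration of the subset idea, the sketch below models chunks as plain JavaScript objects and takes a hypothetical `subsetProps` argument naming the properties that use subset matching. All other properties fall back to position-by-position comparison. None of these names exist in chunks.js; they are assumptions made for the example.

```javascript
// Hypothetical sketch; not the chunks.js implementation.
// Property values are either one atomic value or an array of atomic values.
function toList(value) {
  return Array.isArray(value) ? value : [value];
}

// `subsetProps` lists the property names that use subset matching.
// Every other property must match item by item, in order.
function matchesCondition(condition, buffer, subsetProps = []) {
  return Object.keys(condition).every((name) => {
    if (!Object.prototype.hasOwnProperty.call(buffer, name)) return false;
    const wanted = toList(condition[name]);
    const available = toList(buffer[name]);
    if (subsetProps.includes(name)) {
      // Subset matching: the buffer may contain other items as well.
      return wanted.every((item) => available.includes(item));
    }
    // Default: same length and the same item at each position.
    return wanted.length === available.length &&
      wanted.every((item, i) => item === available[i]);
  });
}

const buffer = { x: ["a", "b", "c"], y: "d" };
console.log(matchesCondition({ x: ["a", "c"] }, buffer));        // false
console.log(matchesCondition({ x: ["a", "c"] }, buffer, ["x"])); // true
console.log(matchesCondition({ x: ["a", "e"] }, buffer, ["x"])); // false
```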

draggett commented 3 years ago

I have now implemented support for variables and ! for lists in condition chunks. See the test suite for some examples.

tidoust commented 3 years ago

> It failed to match, and closer inspection reveals that I hadn't yet implemented support for variables in lists. In the example, ?x should bind to a, ?y to b and ?z to c. I will fix this along with adding support for sub-symbolic processing.

This makes me realize that the Variables section also needs to be updated to describe the behavior. Playing with the sandbox, given the fact test {x a, b, c}, there seem to be three cases for matching variables:

  1. The condition uses as many variables as there are items in the list, as in your example test {x ?x, ?y, ?z}. Here, we expect ?x to be bound to a, ?y to b and ?z to c.
  2. The condition uses only one variable, e.g. test {x ?x} => console {@do log; message ?x}. Here, we'd expect ?x to be bound to the whole list a, b, c.
  3. The condition uses a different number of variables, e.g. test {x ?x, ?y} => console {@do log; message ?x}. Here, the condition simply does not match the fact.
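
The three cases can be sketched as follows. This is only an illustration of the behavior observed in the sandbox, not the chunks.js code: `bindList` and `isVariable` are hypothetical names, and the handling of a repeated variable is an assumption.

```javascript
// Hypothetical sketch of the three cases; not the chunks.js implementation.
// Variables are strings starting with "?", as in the rule language.
function isVariable(term) {
  return typeof term === "string" && term.startsWith("?");
}

// Returns a bindings object on success, or null when the condition fails.
function bindList(conditionTerms, factValues) {
  // Case 2: a single variable binds to the whole value (list or single atom).
  if (conditionTerms.length === 1 && isVariable(conditionTerms[0])) {
    const whole = factValues.length === 1 ? factValues[0] : factValues.slice();
    return { [conditionTerms[0]]: whole };
  }
  // Case 3: a different number of terms never matches.
  if (conditionTerms.length !== factValues.length) return null;
  // Case 1: bind position by position.
  const bindings = {};
  for (let i = 0; i < conditionTerms.length; i++) {
    const term = conditionTerms[i];
    const value = factValues[i];
    if (isVariable(term)) {
      // Assumption: a repeated variable must see the same value each time.
      if (term in bindings && bindings[term] !== value) return null;
      bindings[term] = value;
    } else if (term !== value) {
      return null; // a literal item must equal the fact's item
    }
  }
  return bindings;
}

const fact = ["a", "b", "c"]; // the x property of test {x a, b, c}
console.log(bindList(["?x", "?y", "?z"], fact)); // { '?x': 'a', '?y': 'b', '?z': 'c' }
console.log(bindList(["?x"], fact));             // { '?x': [ 'a', 'b', 'c' ] }
console.log(bindList(["?x", "?y"], fact));       // null
```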

Is 2. intended? This means that, with a rule such as test {x ?x} => console {@do log; message ?x}, one cannot prevent the condition from matching chunks where ?x is a list of atomic values. That may be fine.

> One idea is to introduce a new built-in action, @do subset, that changes the matching algorithm for this condition chunk so that for each property in the condition chunk, all of the items in that property's value must be present in the list in the corresponding buffer value, but the buffer value may contain other items as well. A further possibility would be to introduce a new pseudo-property @subset that names the properties the adjusted matching algorithm applies to. That feels overly complicated to me. What do you think?

I think I would introduce these additional semantics only when the need arises.

tidoust commented 3 years ago

> Is 2. intended?

Replying to myself following oral discussion with @draggett: yes, this is intended.