goodmami / penman

PENMAN notation (e.g. AMR) in Python
https://penman.readthedocs.io/
MIT License
135 stars 26 forks source link

Add containment for trees and graphs #118

Open BramVanroy opened 11 months ago

BramVanroy commented 11 months ago

Added simple containment __contains__ for graphs and trees. Tests are added, too.

For graphs:

For trees:

I get the terminal nodes in the tree like this:

terminals = [target for _, (role, target) in self.walk() if role == "/" and is_atomic(target)]

which feels elaborate so perhaps there is a simpler way? Happy to change it if there is!

closes #10 (partially, at least, since subgraph testing is not included in this PR)

BramVanroy commented 11 months ago

Failing lint: which command should I run? I see that flake8 is in the setup.py dependencies but I am not sure which arguments to run it with for your preferences.

Failing tests: seems unrelated to this PR (an error about py36 not available).

BramVanroy commented 8 months ago

@goodmami I have considered your comments and I like the suggestion of just allowing for triple matching in the graphs only, and then also allowing users to specify "None" in a triple to ignore that field. Two questions/concerns:

What do you think?

PS: linter fails on files that I have not touched, so not sure how I can help that.

goodmami commented 8 months ago
  • Is it ensured that None is never a valid value in a triple? Because if it is, we cannot use this approach (because users might want to explicitly match on None)

None is a valid value, where it indicates something is missing (and Penman sometimes logs a warning in these cases):

>>> import penman
>>> penman.decode('()').triples
[(None, ':instance', None)]
>>> penman.decode('(a)').triples
[('a', ':instance', None)]
>>> penman.decode('(a :ARG0)').triples
Missing target: (a :ARG0)
[('a', ':instance', None), ('a', ':ARG0', None)]

I don't think roles are ever None, though, now that they are canonically written with the ::

>>> penman.decode('(a :)').triples
Missing target: (a :)
[('a', ':instance', None), ('a', ':', None)]

So, yes, it's true that you cannot use the graph methods like Graph.edges() to look for those that have a literal None value because it is overloaded in these methods to be a wildcard. The current workaround is to look through Graph.triples manually. We could use a different sentinel value for the wildcard, such as Ellipsis, but it would be a backward incompatible change to make None function as a literal value, so there would need to be a deprecation period. I think the workaround suffices for now.

  • In terms of typing, this would require a tiny additional change. Currently Triple only allows str as the Variable and Role and None is not valid. So for valid typing, this should change - or I change the signature of contains so that item can only be a Tuple[Variable|None, Constant, Role|None].

I think you're right. Let's hold off on changing Role to allow for None since I don't think that's possible (correct me if I'm wrong). But for the triple source, penman.types.BasicTriple also needs to be similarly changed.