snowblink14 / smatch

Smatch tool: evaluation of AMR semantic structures
MIT License

`:mod` relation ignored #44

Open jheinecke opened 7 months ago

jheinecke commented 7 months ago

I came across a difference between AMR graphs that smatch does not detect. Comparing these two AMR graphs yields an R/P/F of 1.00/1.00/1.00 from smatch. (I am aware that :mod expressive is not valid AMR, but it is what my AMR parser produced, and smatch should detect it.)

(d / do-02
      :ARG0 (ii / i)
      :ARG1 (a / about
            :op1 (d2 / disease
                  :name (n / name
                        :op1 "OCD")))
      :location (p / psychology)
      :time (t / today))

(d / do-02
      :ARG0 (ii / i)
      :ARG1 (a / about
            :op1 (d2 / disease
                  :name (n / name
                        :op1 "OCD")))
      :location (p / psychology)
      :time (t / today)
      :mod expressive)

I think this is due to the special treatment of :mod/:domain; any other relation is detected by the current version of smatch. For instance, changing :mod expressive into :op1 expressive or even :toto expressive makes smatch detect the error and output an F-score of 0.9677. My first guess is that :mod is replaced by :domain (with start and end points reversed) in amr.py.

There is another AMR difference not detected by smatch:

# ::id ENG_NA_020001_20161020_G0023FSVB_0001.4
# ::snt we are dyin of thirst in MARTISAN 25BIS its***medicine without frontier they brought food an water for us..
(m / multi-sentence
      :snt1 (d / die-01
            :ARG1 (w / we)
            :ARG1-of (c / cause-01
                  :ARG0 (t / thirst-01
                        :ARG0 w))
            :location (s / street-address-91
                  :ARG1 "25 bis"
                  :ARG2 (r / road
                        :name (n / name
                              :op1 "Martisan"))))
      :snt2 (b / bring-01
            :ARG0 (o / organization
                  :name (n2 / name
                        :op1 "Doctors"
                        :op2 "without"
                        :op3 "Frontiers"))
            :ARG1 (a2 / and
                  :op1 (f / food)
                  :op2 (w2 / water))
            :ARG2 (w3 / we)))

and (note the :ARG1 "25", which is :ARG1 "25 bis" in the graph above):

(m / multi-sentence
      :snt1 (d / die-01
            :ARG1 (w / we)
            :ARG1-of (c / cause-01
                  :ARG0 (t / thirst-01
                        :ARG0 w))
            :location (s / street-address-91
                  :ARG1 "25 bis"
                  :ARG2 (r / road
                        :name (n / name
                              :op1 "Martisan"))))
      :snt2 (b / bring-01
            :ARG0 (o / organization
                  :name (n2 / name
                        :op1 "Doctors"
                        :op2 "without"
                        :op3 "Frontiers"))
            :ARG1 (a2 / and
                  :op1 (f / food)
                  :op2 (w2 / water))
            :ARG2 (w3 / we)))

goodmami commented 7 months ago

@jheinecke it looks like this is the same issue as #26 and #39: Smatch does not handle triples whose source is a constant. This situation arises when inverted roles point to a constant, and since #16, :mod is considered to be inverted. The other roles you tried are not known special cases of inverted roles and do not end in -of, so they behave as regular roles, which may have constant targets. I suspect you'll also see a score of 1.0 if you make them :toto-of, etc.
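The mechanism described here can be sketched in a few lines. This is only an illustration, not the code in smatch's amr.py: the function names and the (source, role, target) triple layout are hypothetical, the role handling is simplified, and dropping constant-source triples is an assumption standing in for "does not handle".

```python
# Hypothetical illustration; NOT the actual code in smatch's amr.py.
# Triples are (source, role, target); `nodes` holds the node identifiers.

def naive_normalize(triples):
    """Flip :mod into :domain and strip -of, swapping the ends unconditionally."""
    out = []
    for source, role, target in triples:
        if role == ":mod":
            out.append((target, ":domain", source))
        elif role.endswith("-of"):
            out.append((target, role[:-3], source))
        else:
            out.append((source, role, target))
    return out


def drop_constant_sources(triples, nodes):
    """Assumed behavior: triples whose source is not a node are silently lost."""
    return [t for t in triples if t[0] in nodes]


nodes = {"d", "t"}
triples = [("d", ":time", "t"), ("d", ":mod", "expressive")]

flipped = naive_normalize(triples)
kept = drop_constant_sources(flipped, nodes)
print(flipped)  # the :mod triple now has the constant "expressive" as its source
print(kept)     # only the :time triple survives
```

Under these assumptions the extra :mod triple never reaches the comparison, which would explain the score of 1.00.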

Also for your second set of examples, the PENMAN strings look identical to me, aside from the metadata comments above the first one. Did you mean to paste something different for the second?

flipz357 commented 7 months ago

@jheinecke Can you try with SMATCH++?

When I feed the first two AMRs into SMATCH++ I get:

python -m smatchpp -a a1.txt -b a2.txt

F1: 96.77    Precision: 100.0    Recall: 93.75

which is correct (first graph has 15 triples, second has 16 triples, first graph is a subgraph of the second, i.e., recall = 15/16 = 0.9375).
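The arithmetic behind these numbers can be checked with a short sketch. The triple counts (15 and 16) come from the comment above; the helper name `prf` and its parameters are made up for illustration and are not part of SMATCH++.

```python
# Hypothetical helper for checking the reported scores; not part of SMATCH++.

def prf(matched, n_first, n_second):
    """Precision, recall and F1 from the number of matched triples
    and the triple counts of the first and second graph."""
    precision = matched / n_first
    recall = matched / n_second
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# The first graph (15 triples) is a subgraph of the second (16 triples),
# so all 15 of its triples are matched.
p, r, f = prf(matched=15, n_first=15, n_second=16)
print(f"F1: {100 * f:.2f}    Precision: {100 * p:.1f}    Recall: {100 * r:.2f}")
```

The harmonic mean of 1.0 and 0.9375 is 1.875/1.9375, or about 0.9677, which also matches the F-score that smatch itself reports when :mod is replaced by :op1.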

jheinecke commented 7 months ago

Hi @flipz357 and @goodmami, I get the same results as you with smatchpp on the first example (so it detects the difference). I'm sorry for the bad copy-and-paste in the second example. The second graph should read:

(m / multi-sentence
      :snt1 (d / die-01
            :ARG1 (w / we)
            :ARG1-of (c / cause-01
                  :ARG0 (t / thirst-01
                        :ARG0 w))
            :location (s / street-address-91
                  :ARG1 "25"
                  :ARG2 (r / road
                        :name (n / name
                              :op1 "Martisan"))))
      :snt2 (b / bring-01
            :ARG0 (o / organization
                  :name (n2 / name
                        :op1 "Doctors"
                        :op2 "without"
                        :op3 "Frontiers"))
            :ARG1 (a2 / and
                  :op1 (f / food)
                  :op2 (w2 / water))
            :ARG2 (w3 / we)))

The difference is :ARG1 "25" instead of :ARG1 "25 bis" in line 8. Again, smatchpp sees the difference (F1: 97.22 Precision: 97.22 Recall: 97.22).

@goodmami I think there could be a check in amr.py parse_AMR_line(), just before line 397, that all nodes in node_relation_dict1 and node_relation_dict2 exist in node_name_list, adding any that do not to the attribute_list instead?

goodmami commented 7 months ago

@jheinecke I'm not very familiar with amr.py, but what you describe sounds similar to what the Penman library does. Basically, whenever you encounter (x / ..., x is a node identifier, and for any :role y where y is not a node identifier, the triple is considered an attribute, even if the role appears inverted (:role-of).
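A minimal sketch of that rule, assuming triples are given as (source, role, target) tuples alongside the set of node identifiers. The function `classify` is hypothetical and is neither Penman's nor smatch's implementation; treating :mod as the inverse of :domain follows the smatch convention mentioned earlier in this thread.

```python
# Hypothetical sketch of the rule described above; not Penman's or smatch's code.

def classify(triples, nodes):
    """Split (source, role, target) triples into edges and attributes.

    A triple is an edge only when its target is a node identifier.
    Inverted roles are de-inverted for edges only; a constant target
    always gives an attribute, with the role left untouched.
    """
    edges, attributes = [], []
    for source, role, target in triples:
        if target not in nodes:
            attributes.append((source, role, target))
        elif role == ":mod":
            edges.append((target, ":domain", source))
        elif role.endswith("-of"):
            edges.append((target, role[:-3], source))
        else:
            edges.append((source, role, target))
    return edges, attributes


nodes = {"d", "c", "t"}
triples = [
    ("d", ":ARG1-of", "c"),        # inverted role, node target: de-inverted
    ("d", ":time", "t"),           # regular edge
    ("d", ":mod", "expressive"),   # constant target: stays an attribute
    ("d", ":toto-of", "x9"),       # looks inverted, but x9 is not a node
]
edges, attributes = classify(triples, nodes)
print(edges)
print(attributes)
```

Because the check on the target comes first, no triple can end up with a constant as its source.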

jheinecke commented 7 months ago

It does indeed have exactly the same function as the Penman library. I am working on a version of smatch that uses the Penman library instead of the local amr.py in order to get rid of this mod/domain problem (it must not replace x :mod y with y :domain x if y is a literal (string or number) and not an instance). The current amr.py does not check