coli-saar / utool

Utool is the Swiss Army Knife of Underspecification. It is a GUI and library written in Java for performing computations with dominance graphs and other formalisms, which are used to represent semantic ambiguities in natural language processing.
http://www.coli.uni-saarland.de/projects/chorus/utool/

Utools 3.4 output problem. #6

Open Eslam34 opened 2 years ago

Eslam34 commented 2 years ago

I downloaded the new version of utools (utool-3.4.jar), and h0 is absent from the output.

text: hunt with a jacklight; mrs: http://delph-in.github.io/delphin-viz/demo/#input=hunt%20with%20a%20jacklight;&count=5&grammar=erg2018-uw&mrs=true&dmrs=true

utools-3.4 output: [plug(h12 h14) plug(h13 h4) plug(h5 h7) plug(h6 h1)] # no h0

utools-3.1.1 output: [plug(h0 h11) plug(h12 h14) plug(h13 h4) plug(h5 h7) plug(h6 h1)]
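To see exactly which plugs differ between the two versions, the two outputs can be parsed and compared. The following is a minimal Python sketch, assuming the output format is exactly as pasted above; `parse_plugs` is a hypothetical helper written for this illustration and is not part of Utool.

```python
import re

def parse_plugs(line):
    """Parse '[plug(h12 h14) plug(h13 h4)]' into a set of (hole, label) pairs."""
    return set(re.findall(r"plug\((\w+) (\w+)\)", line))

out_34 = "[plug(h12 h14) plug(h13 h4) plug(h5 h7) plug(h6 h1)]"
out_311 = "[plug(h0 h11) plug(h12 h14) plug(h13 h4) plug(h5 h7) plug(h6 h1)]"

# Plugs reported by 3.1.1 that are missing from the 3.4 output.
missing = parse_plugs(out_311) - parse_plugs(out_34)
print(missing)  # {('h0', 'h11')}
```

The only difference is the plug of the top hole h0; all other plugs are identical.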

alexanderkoller commented 2 years ago

Hi, that is truly odd, thanks for letting me know. Could you post the exact MRS that you used?

arademaker commented 2 years ago

Hi @alexanderkoller, @Eslam34 is working with me! This is the MRS; do you have tools to convert it to XML or Prolog? Or do you want us to produce these for you?

% echo "hunt with a jacklight;" | ace -g erg-dict.dat -Tf1
SENT: hunt with a jacklight;
[ LTOP: h0
INDEX: e2 [ e SF: comm TENSE: pres MOOD: indicative PROG: - PERF: - ]
RELS: < [ pronoun_q<0:22> LBL: h4 ARG0: x3 [ x PERS: 2 PT: zero ] RSTR: h5 BODY: h6 ]
 [ pron<0:22> LBL: h7 ARG0: x3 ]
 [ _hunt_v_1<0:4> LBL: h1 ARG0: e2 ARG1: x3 ARG2: i8 ]
 [ _with_p<5:9> LBL: h1 ARG0: e9 [ e SF: prop TENSE: untensed MOOD: indicative PROG: - PERF: - ] ARG1: e2 ARG2: x10 [ x PERS: 3 NUM: sg IND: + ] ]
 [ _a_q<10:11> LBL: h11 ARG0: x10 RSTR: h12 BODY: h13 ]
 [ _jacklight/NN_u_unknown<12:21> LBL: h14 ARG0: x10 ] >
HCONS: < h0 qeq h1 h5 qeq h7 h12 qeq h14 >
ICONS: < > ]
NOTE: 1 readings, added 1309 / 128 edges to chart (78 fully instantiated, 61 actives used, 46 passives used)    RAM: 2915k
NOTE: parsed 1 / 1 sentences, avg 2915k, time 0.45035s

I would use http://pydelphin.readthedocs.io for that.

alexanderkoller commented 2 years ago

> Hi @alexanderkoller, @Eslam34 is working with me! This is the MRS; do you have tools to convert it to XML or Prolog? Or do you want us to produce these for you?

Thanks! Could you convert it to the exact format that you're feeding to Utool? That would help me pin it down.

The error is weird because I'm sure I didn't change anything that would cause this in the update from 3.3 to 3.4. But I suppose you were using the ten-year-old 3.1.1 from the old website before. It may take me a while to figure this one out.

Eslam34 commented 2 years ago

We are submitting the following mrs_prolog to the server:

psoa(h0,e2,
  [rel('pronoun_q',h4,
       [attrval('ARG0',x3),
        attrval('RSTR',h5),
        attrval('BODY',h6)]),
   rel('pron',h7,
       [attrval('ARG0',x3)]),
   rel('_hunt_v_1',h1,
       [attrval('ARG0',e2),
        attrval('ARG1',x3),
        attrval('ARG2',i8)]),
   rel('_with_p',h1,
       [attrval('ARG0',e9),
        attrval('ARG1',e2),
        attrval('ARG2',x10)]),
   rel('_a_q',h11,
       [attrval('ARG0',x10),
        attrval('RSTR',h12),
        attrval('BODY',h13)]),
   rel('_jacklight/nn_u_unknown',h14,
       [attrval('ARG0',x10)])],
  hcons([qeq(h0,h1),qeq(h5,h7),qeq(h12,h14)]))

alexanderkoller commented 2 years ago

Hi,

I had a look at this, and the MRS input codec deliberately removes the top fragment (h0) if it contains no labeled nodes (in your example, it doesn't). I can't remember why we made this change (as I said, this was ten years ago), but I'm sure we had a reason, and I don't want to undo the change now.

Do you absolutely need the h0, or could you work around it? This depends on what you want to do with the scoping downstream. If it's really necessary, I could add an option to the MRS codec that allows you to suppress the removal of the empty top fragment. I'd prefer not to do that, though, because I can't really judge the consequences.
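The condition described above can be illustrated on the "hunt with a jacklight" MRS. This is only a sketch of the condition as stated in this comment, reading "labeled node" as a handle that some relation carries as its LBL (the reading given later in this thread). It is not the codec's actual code, and `top_fragment_is_empty` is a hypothetical name.

```python
top = "h0"

# LBL values of the relations in the "hunt with a jacklight" MRS.
labels = {"h4", "h7", "h1", "h11", "h14"}

# HCONS as (hole, label) qeq pairs.
hcons = [("h0", "h1"), ("h5", "h7"), ("h12", "h14")]

def top_fragment_is_empty(top, labels):
    """True if no relation carries `top` as its LBL."""
    return top not in labels

print(top_fragment_is_empty(top, labels))  # True
# The only place h0 occurs is in a qeq constraint.
print([q for q in hcons if top in q])      # [('h0', 'h1')]
```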

alexanderkoller commented 2 years ago

For reference, this is the line where it happens, and this is the commit in which we changed it. The commit is from 2007. :)

Eslam34 commented 2 years ago

Thanks, I think @arademaker would have a better response. But let me ask this: is it normal for h11 ('_a_q') to be absent from the resolved scoping (utools_3.4)? Excuse me, I don't really know much about this.

alexanderkoller commented 2 years ago

> is it normal for h11 ('_a_q') to be absent from the resolved scoping (utools_3.4)?

Yes, because h11 is at the root of this scoping and doesn't need to be plugged into anything. In the other scoping, there will be an h11 (but no h4, because that's the root).
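To make this concrete, the scope tree for the utools-3.4 solution can be rebuilt by starting at h11 and following the plugs. The Python sketch below assumes a hand-written table of the relations from the MRS above; `rels`, `plugs`, and `render` are illustrative names, not Utool API.

```python
# Relations from the MRS above: label -> list of (predicate, scopal holes).
# Several relations may share one label (here h1).
rels = {
    "h4": [("pronoun_q", ["h5", "h6"])],
    "h7": [("pron", [])],
    "h1": [("_hunt_v_1", []), ("_with_p", [])],
    "h11": [("_a_q", ["h12", "h13"])],
    "h14": [("_jacklight/NN_u_unknown", [])],
}

# utools-3.4 output for this scoping: hole -> label that plugs it.
plugs = {"h12": "h14", "h13": "h4", "h5": "h7", "h6": "h1"}

def render(label):
    """Render the subtree rooted at `label` as a nested term."""
    parts = []
    for pred, holes in rels[label]:
        if holes:
            args = ", ".join(render(plugs[hole]) for hole in holes)
            parts.append(f"{pred}({args})")
        else:
            parts.append(pred)
    return " & ".join(parts)

# h11 is the root, so rendering starts there.
print(render("h11"))
# _a_q(_jacklight/NN_u_unknown, pronoun_q(pron, _hunt_v_1 & _with_p))
```

Every label other than h11 is reached through some plug, which is why h11 itself never shows up as the second element of a plug.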

I think this will become easier to understand if you open the GUI (java -jar utool-3.4.jar display), load your MRS, and have Utool show you the solutions. See attached.

[Image: Screenshot 2021-10-14 at 11 15 00]

alexanderkoller commented 2 years ago

Where are we with this? Can I close the issue?

arademaker commented 2 years ago

> I had a look at this, and the MRS input codec deliberately removes the top fragment (h0) if it contains no labeled nodes (in your example, it doesn't)

What do you mean by 'labeled nodes'? In "I bought a book to Ann", we have h0 as the top handle, and the first HCONS says that it must outscope h1.

[ TOP: h0
  INDEX: e2
  RELS: < [ pron<0:1> LBL: h4 ARG0: x3 ]
          [ pronoun_q<0:1> LBL: h5 ARG0: x3 RSTR: h6 BODY: h7 ]
          [ _buy_v_1<2:8> LBL: h1 ARG0: e2 ARG1: x3 ARG2: x8 ]
          [ _a_q<9:10> LBL: h9 ARG0: x8 RSTR: h10 BODY: h11 ]
          [ _book_n_of<11:15> LBL: h12 ARG0: x8 ARG1: i13 ]
          [ _to_p<16:18> LBL: h12 ARG0: e14 ARG1: x8 ARG2: x15 ]
          [ proper_q<19:23> LBL: h16 ARG0: x15 RSTR: h17 BODY: h18 ]
          [ named<19:22> LBL: h19 ARG0: x15 CARG: "Ann" ] >
  HCONS: < h0 qeq h1 h6 qeq h4 h10 qeq h12 h17 qeq h19 > ]

One of the solutions from Utools is

[('h17', 'h19'), ('h18', 'h9'), ('h10', 'h12'), ('h11', 'h5'), ('h6', 'h4'), ('h7', 'h1')]

So to obtain the final scope tree, I need somehow to identify h16 as the root node for this solution...

alexanderkoller commented 2 years ago

By "labeled node" I mean a handle that appears in your RELS as the label (LBL) of some relation. h0 does not: it is not assigned as a label via LBL and only participates in qeq constraints.

You can tell that h16 is the root for this solution because it is the only handle that does not plug a hole. For instance, h4 appears as the second element in the pair (h6, h4). This means that it plugs the hole h6 (of the handle h5).
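The root test described above can be sketched in a few lines of Python. The label set is copied by hand from the LBL values of the MRS above, and `find_roots` is a hypothetical helper, not something Utool provides.

```python
# LBL values of the MRS for "I bought a book to Ann".
labels = {"h4", "h5", "h1", "h9", "h12", "h16", "h19"}

# One Utool solution as (hole, plugging label) pairs.
solution = [("h17", "h19"), ("h18", "h9"), ("h10", "h12"),
            ("h11", "h5"), ("h6", "h4"), ("h7", "h1")]

def find_roots(labels, solution):
    """Return the labels that do not plug any hole in the solution."""
    plugged = {label for _hole, label in solution}
    return sorted(labels - plugged)

print(find_roots(labels, solution))  # ['h16']
```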

Alternatively, you could probably add a fake handle to your h0 (e.g. [ foo LBL: h_root BODY: h0 ]), and then h0 will appear explicitly in your results list. I think - I haven't tried this myself, and it's been a few years since I've last touched the code.

arademaker commented 2 years ago

Thank you, indeed, I can work with the current output. We can close this issue.

arademaker commented 1 year ago

Hi @alexanderkoller, I found another tricky situation. Above, we ended up with a possible way to identify the root of a Utool solution (plugs): it is the handle that does not plug a hole.

Question: it always needs to be unique, right?

In the MRS below, the TOP is h0, and h25 is a labeled node but does not appear in one of the Utool solutions (plugs). I understand that this is a special situation, since h25 is also used as a hole in the ARG1 of nominalization. h25 can't be the root of the final tree, right? I am thinking about the best approach to avoid it being considered a possible root: it is a handle that does not plug a hole (it does not appear in the plugs), but it is also a hole itself.

A man in a black jacket is doing tricks on a motorbike

[ TOP: h0
  INDEX: e2
  RELS: < [ _a_q<0:1> LBL: h4 ARG0: x3 RSTR: h5 BODY: h6 ]
          [ _man_n_1<2:5> LBL: h7 ARG0: x3 ]
          [ _in_p_loc<6:8> LBL: h7 ARG0: e8 ARG1: x3 ARG2: x9 ]
          [ _a_q<9:10> LBL: h10 ARG0: x9 RSTR: h11 BODY: h12 ]
          [ _black_a_1<11:16> LBL: h13 ARG0: e14 ARG1: x9 ]
          [ _jacket_n_1<17:23> LBL: h13 ARG0: x9 ]
          [ _be_v_id<24:26> LBL: h1 ARG0: e2 ARG1: x3 ARG2: x15 ]
          [ udef_q<27:54> LBL: h16 ARG0: x15 RSTR: h17 BODY: h18 ]
          [ compound<27:39> LBL: h19 ARG0: e20 ARG1: x15 ARG2: x21 ]
          [ udef_q<27:32> LBL: h22 ARG0: x21 RSTR: h23 BODY: h24 ]
          [ _do_v_1<27:32> LBL: h25 ARG0: e26 ARG1: i27 ARG2: i28 ]
          [ nominalization<27:32> LBL: h29 ARG0: x21 ARG1: h25 ]
          [ _trick_n_1<33:39> LBL: h19 ARG0: x15 ]
          [ _on_p_loc<40:42> LBL: h19 ARG0: e30 ARG1: x15 ARG2: x31 ]
          [ _a_q<43:44> LBL: h32 ARG0: x31 RSTR: h33 BODY: h34 ]
          [ _motorbike_n_1<45:54> LBL: h35 ARG0: x31 ] >
  HCONS: < h0 qeq h1 h5 qeq h7 h11 qeq h13 h17 qeq h19 h23 qeq h29 h33 qeq h35 > ]

Utool plugs:

{'h33': 'h35', 'h34': 'h10', 'h11': 'h13', 'h12': 'h22', 'h23': 'h29', 'h24': 'h16', 'h17': 'h19', 'h18': 'h4', 'h5': 'h7', 'h6': 'h1'}

The UI shows the nominalization plugged:

[image]
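One possible way to keep h25 out of the root candidates is to also discard every label that is used directly as a handle-valued argument of another relation. This is an untested assumption, not something confirmed in this thread: the idea is that such a label is already attached to its parent without going through a plug. The sets below are copied by hand from the MRS and the plugs above.

```python
# LBL values of the MRS above.
labels = {"h4", "h7", "h10", "h13", "h1", "h16", "h19",
          "h22", "h25", "h29", "h32", "h35"}

# Handles that occur as argument values (RSTR, BODY, and ARG1 of nominalization).
arg_handles = {"h5", "h6", "h11", "h12", "h17", "h18",
               "h23", "h24", "h33", "h34", "h25"}

# Utool plugs: hole -> label that plugs it.
plugs = {"h33": "h35", "h34": "h10", "h11": "h13", "h12": "h22",
         "h23": "h29", "h24": "h16", "h17": "h19", "h18": "h4",
         "h5": "h7", "h6": "h1"}

# Labels that plug no hole: the naive root candidates.
unplugged = labels - set(plugs.values())
# Drop labels that are themselves an argument of some relation.
roots = unplugged - arg_handles

print(sorted(unplugged))  # ['h25', 'h32']
print(sorted(roots))      # ['h32']
```

Under this assumption the root is unique again (h32), and h25 is treated as a child of the nominalization at h29.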