ablodge / leamr

A structurally comprehensive dataset of AMR-to-text alignments for coverage of a larger variety of linguistic phenomena, for research related to AMR parsing, generation, and evaluation.

alignment errors #13

Open xiulinyang opened 8 months ago

xiulinyang commented 8 months ago

Hi Austin,

When I ran align_with_pretrained_model.py, I got two errors. The first seems to be caused by postprocess_graph (https://github.com/ablodge/leamr/blob/e1c2f8e4e46f22519e8ae4a9329359565dd5cbb2/models/subgraph_model.py#L178), so I wrapped the call in a try/except that prints the alignments for which postprocess_graph fails (they appear in the output below). The second is raised from get_alignment; the error message is below. I have no clue why this happens, because all the AMRs can be read by the penman package. Thanks!

 python align_with_pretrained_model.py -t  out_ewt/post_orc.txt --subgraph-model ldc+little_prince.subgraph_params.pkl --relation-model ldc+little_prince.relation_params.pkl --reentrancy-model ldc+little_prince.reentrancy_params.pkl
[amr] Loading AMRs from file: out_ewt/post_orc.txt
Loading model: ldc+little_prince.subgraph_params.pkl
Apply Rules = True
Preprocessing: 199 / 340
<AMR_Alignment: subgraph>: tokens [7] nodes ['1.1.1'] edges [] (subgraph : see => see-01)
Preprocessing: 276 / 340
<AMR_Alignment: subgraph>: tokens [9] nodes ['1.1.2'] edges [] (subgraph : experience => experience-01)
<AMR_Alignment: subgraph>: tokens [12] nodes ['1.1.2.2'] edges [] (subgraph : forget => forget-01)
Preprocessing coverage: 74.94%
 80%|████████████████████████████████▊        | 272/340 [00:08<00:01, 43.43it/s]0
0
0
0
100%|█████████████████████████████████████████| 340/340 [00:11<00:00, 28.35it/s]
Writing subgraph alignments to: out_ewt/post_orc.subgraph_alignments.json
Loading model: ldc+little_prince.relation_params.pkl
Preprocessing coverage: 86.34%
100%|████████████████████████████████████████| 340/340 [00:00<00:00, 777.55it/s]
Writing relation alignments to: out_ewt/post_orc.relation_alignments.json
199
276
Loading model: ldc+little_prince.reentrancy_params.pkl
276 / 340 preprocessed
Traceback (most recent call last):
  File "align_with_pretrained_model.py", line 64, in <module>
    main()
  File "align_with_pretrained_model.py", line 57, in main
    reent_alignments = reent_model.align_all(eval_amrs)
  File "/local/xiulyang/leamr/models/base_model.py", line 92, in align_all
    alignments = self.get_initial_alignments(amrs, preprocess)
  File "/local/xiulyang/leamr/models/reentrancy_model.py", line 201, in get_initial_alignments
    self.align_primary_edges(amr, reentrancy_alignments)
  File "/local/xiulyang/leamr/models/reentrancy_model.py", line 174, in align_primary_edges
    rel_align = amr.get_alignment(self.relation_alignments, token_id=talign.tokens[0])
IndexError: list index out of range

nschneid commented 8 months ago

@xiulinyang Could you please provide the sentence/AMR for which it fails? (Is it from one of the AMR corpora or a parser output?)

xiulinyang commented 7 months ago

Here is one example, from parser output, that triggers an error. Thanks!!

  File "align_with_pretrained_model.py", line 64, in <module>
    main()
  File "align_with_pretrained_model.py", line 46, in main
    rel_alignments = rel_model.align_all(eval_amrs)
  File "/local/xiulyang/leamr/models/relation_model.py", line 309, in align_all
    alignments = super().align_all(amrs, alignments, preprocess, debug)
  File "/local/xiulyang/leamr/models/base_model.py", line 102, in align_all
    aligns, scores = self.align(amr, alignments, n, unaligned, return_all=True)
  File "/local/xiulyang/leamr/models/relation_model.py", line 248, in align
    candidate_spans = [span for span in candidate_spans if (parent.tokens[0]<span[0]<child.tokens[0])
  File "/local/xiulyang/leamr/models/relation_model.py", line 249, in <listcomp>
    or (child.tokens[0]<span[0]<parent.tokens[0])
IndexError: list index out of range

# ::snt If you know someone who might fit the bill ask them to contact me and/or send me their resume .
(p0 / ask-02
    :mode imperative
    :ARG0 (p1 / you)
    :ARG1 (p2 / and-or
              :op1 (p3 / contact-01
                       :ARG0 p12
                       :ARG1 (p4 / i))
              :op2 (p5 / send-01
                       :ARG0 p12
                       :ARG1 (p6 / resume
                                 :poss p12)
                       :ARG2 p4))
    :ARG2 p12
    :condition (p7 / know-02
                   :ARG0 p1
                   :ARG1 (p8 / someone
                             :ARG1-of (p9 / fit-06
                                          :ARG2 (p10 / bill)
                                          :ARG1-of (p11 / possible-01)))))
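
One detail visible in the AMR above: the variable p12 is used as a role target four times but is never introduced with a concept (only p0 through p11 are). Whether this is what triggers the IndexError is not confirmed in this thread. The sketch below is a standard-library heuristic for spotting such dangling references in parser output. The function name `undefined_variables` is hypothetical, and the code assumes variables are lowercase letters followed by digits, as in this parser's output; it is not part of leamr or penman.

```python
import re

# The parser output from this thread, used as sample input.
SAMPLE = """# ::snt If you know someone who might fit the bill ask them to contact me and/or send me their resume .
(p0 / ask-02
    :mode imperative
    :ARG0 (p1 / you)
    :ARG1 (p2 / and-or
              :op1 (p3 / contact-01
                       :ARG0 p12
                       :ARG1 (p4 / i))
              :op2 (p5 / send-01
                       :ARG0 p12
                       :ARG1 (p6 / resume
                                 :poss p12)
                       :ARG2 p4))
    :ARG2 p12
    :condition (p7 / know-02
                   :ARG0 p1
                   :ARG1 (p8 / someone
                             :ARG1-of (p9 / fit-06
                                          :ARG2 (p10 / bill)
                                          :ARG1-of (p11 / possible-01)))))
"""

# Assumption: variables look like "p12" (letters + digits). Constants such as
# "imperative" or "-" do not match and are therefore never reported.
VAR_LIKE = re.compile(r"^[a-z]+\d+$")


def undefined_variables(amr_text):
    """Return variable-like role targets that are never defined as '(var / concept'."""
    # Drop metadata lines such as '# ::snt ...'.
    body = "\n".join(
        line for line in amr_text.splitlines() if not line.lstrip().startswith("#")
    )
    # Variables introduced with '(var / concept'.
    defined = set(re.findall(r"\(\s*([^\s/()]+)\s*/", body))
    # Bare role targets, i.e. ':role target' where target is not a nested node or string.
    targets = re.findall(r":[^\s()]+\s+([^\s()\"]+)", body)
    return sorted({t for t in targets if t not in defined and VAR_LIKE.match(t)})


print(undefined_variables(SAMPLE))  # ['p12']
```

Because the check is regex-based, it can miss variables that do not follow the letters-plus-digits pattern; it is meant as a quick filter over parser output, not a validator.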

nschneid commented 7 months ago

(BTW the sentence is from EWT, not the AMR dataset.)

ablodge commented 7 months ago

Sorry for the delay. It looks like some of the nodes aren't getting aligned. Can you please add the following lines before the line that produces the error (leamr/models/relation_model.py, line 249)? Then tell me what the output says.

if ' '.join(amr.tokens) == 'If you know someone who might fit the bill ask them to contact me and/or send me their resume .':
    print(parent.readable(amr))
    print(child.readable(amr))

xiulinyang commented 7 months ago

No problem! Here is the output:

subgraph : ask => ask-02
subgraph : you => you, imperative
parent+child
subgraph : ask => ask-02
subgraph : know => know-02
parent+child
subgraph : resume => resume
 => 
  0%|                                                                                                                                  | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "align_with_pretrained_model.py", line 64, in <module>
    main()
  File "align_with_pretrained_model.py", line 46, in main
    rel_alignments = rel_model.align_all(eval_amrs)
  File "/local/xiulyang/leamr/models/relation_model.py", line 312, in align_all
    alignments = super().align_all(amrs, alignments, preprocess, debug)
  File "/local/xiulyang/leamr/models/base_model.py", line 102, in align_all
    aligns, scores = self.align(amr, alignments, n, unaligned, return_all=True)
  File "/local/xiulyang/leamr/models/relation_model.py", line 251, in align
    candidate_spans = [span for span in candidate_spans if (parent.tokens[0]<span[0]<child.tokens[0])
  File "/local/xiulyang/leamr/models/relation_model.py", line 252, in <listcomp>
    or (child.tokens[0]<span[0]<parent.tokens[0])
IndexError: list index out of range

Thanks a lot!
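
The debug output above prints one alignment as " => ", i.e. with nothing on either side, immediately before the IndexError, which is consistent with `tokens[0]` being evaluated on an empty list. As a minimal sketch of a defensive version of the filter seen in the traceback, the function below skips the comparison when either alignment has no tokens. The name `spans_between` and its plain-list parameters are hypothetical and are not leamr's API; returning an empty list in that case is only one possible choice, and the right fallback is for the maintainers to decide.

```python
from typing import List, Sequence, Tuple

Span = Tuple[int, ...]


def spans_between(
    parent_tokens: Sequence[int],
    child_tokens: Sequence[int],
    candidate_spans: Sequence[Span],
) -> List[Span]:
    """Keep spans whose first token lies strictly between the first tokens
    of the parent and child alignments, in either order."""
    # Guard: an unaligned node has no tokens, so tokens[0] would raise IndexError.
    if not parent_tokens or not child_tokens:
        return []
    lo, hi = sorted((parent_tokens[0], child_tokens[0]))
    # Equivalent to (parent < span[0] < child) or (child < span[0] < parent).
    return [span for span in candidate_spans if span and lo < span[0] < hi]


# An unaligned child no longer raises; it simply yields no candidates.
print(spans_between([8], [], [(9,)]))
print(spans_between([8], [12], [(9,), (13,), (10, 11)]))
```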