IBM / transition-amr-parser

SoTA Abstract Meaning Representation (AMR) parsing with word-node alignments in PyTorch. Includes checkpoints and other tools such as statistical significance Smatch.
Apache License 2.0

Stochastic results with the pretrained parser #49

Closed Tverous closed 1 year ago

Tverous commented 1 year ago
from transition_amr_parser.parse import AMRParser
parser = AMRParser.from_checkpoint('DATA/amr2joint_ontowiki2_g2g/models/amr2joint_ontowiki2_g2g-structured-bart-large/seed44/checkpoint_wiki.smatch_top5-avg.pt')
# use parse_sentences() for a batch of sentences
tokens, positions = parser.tokenize('The coronavirus pandemic has upended the traditional runway format, and in its place a mix of virtual and, in some cases, physical shows with limited audience numbers has started to roll out.')
annotations, decoding_data = parser.parse_sentence(tokens)
# Print the graph in Penman notation
print(annotations)

Given the script above, the logs sometimes show "WARNING: disconnected graphs" and sometimes do not. As a result, the same sentence and the same pretrained weights produce inconsistent results (Penman notation below).

For example:

Running on batch size: 1
1
decoding: 100%|██████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  1.19it/s]
WARNING: disconnected graphs
# ::tok The coronavirus pandemic has upended the traditional runway format , and in its place a mix of virtual and , in some cases , physical shows with limited audience numbers has started to roll out .
(a / and~10
    :op1 (u / up-01~4
        :ARG0 (p / pandemic~2
            :mod (v2 / virus~1))
        :ARG1 (f / format~8
            :ARG1-of (s5 / stud-01~7)
            :mod (r3 / runway~7)
            :mod u))
    :op2 (s4 / start-01~31
        :ARG1 (r2 / roll-out-02~33
            :ARG1 (m / mix-01~15
                :ARG1 (s2 / show-04~25
                    :mod (v / virtual~17))
                :ARG2 (s / show-04~18
                    :mod (c / case-04~22
                        :quant (s3 / some~21))
                    :mod (p2 / physical~24)
                    :prep-with (n / number~29
                        :ARG1-of (l / limit-01~27)
                        :quant-of (a2 / audience~28)))))
        :ARG2-of (r / replace-01~13
            :ARG1 f))
    :rel (t / tradition~6))

and

Running on batch size: 1
1
decoding: 100%|██████████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00,  1.40s/it]
# ::tok The coronavirus pandemic has upended the traditional runway format , and in its place a mix of virtual and , in some cases , physical shows with limited audience numbers has started to roll out .
(a / and~10
    :op1 (u / upheaval-01~4
        :ARG0 (p / pandemic~2
            :mod (c2 / coronavirus~1))
        :ARG1 (f / format~8
            :mod (r3 / runway~7)
            :mod (t / tradition~6)))
    :op2 (s4 / start-01~31
        :ARG1 (r2 / roll-out-02~33
            :ARG1 (m / mix-01~15
                :ARG1 (s2 / show-04~25
                    :mod (v / virtual~17))
                :ARG2 (s / show-04~16
                    :mod (c / case-04~22
                        :quant (s3 / some~21))
                    :mod (p2 / physical~24)
                    :prep-with (n / number~29
                        :ARG1-of (l / limit-01~27)
                        :quant-of (a2 / audience~28)))))
        :ARG2-of (r / replace-01~13
            :ARG1 f)))
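
To pin down exactly where two runs diverge, the two Penman strings can be compared line by line. The following is a minimal sketch using only the standard library; the helper name changed_lines and the shortened graph excerpts are illustrative assumptions, not part of transition-amr-parser.

```python
import difflib


def changed_lines(first: str, second: str) -> list[str]:
    """Return only the lines that differ between two Penman strings.

    Lines prefixed with "- " appear only in `first`, lines prefixed
    with "+ " appear only in `second`. Hypothetical helper.
    """
    diff = difflib.ndiff(first.splitlines(), second.splitlines())
    # ndiff also emits unchanged lines ("  ") and hint lines ("? "); drop both.
    return [line for line in diff if line.startswith(("- ", "+ "))]


if __name__ == "__main__":
    # Shortened excerpts of the two outputs above, for illustration only.
    run_1 = "(a / and~10\n    :op1 (u / up-01~4\n        :ARG0 (p / pandemic~2"
    run_2 = "(a / and~10\n    :op1 (u / upheaval-01~4\n        :ARG0 (p / pandemic~2"
    for line in changed_lines(run_1, run_2):
        print(line)
```

Passing the full printed annotations from two runs would list every node or edge that changed, which is easier to scan than the two complete graphs.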

Is this kind of behavior expected?

Thank you,

ramon-astudillo commented 1 year ago

Disconnected graphs can happen when the parser cannot attach a sub-graph (this leads to the :rel relations).

The stochastic behavior should not happen. It may be a set() issue. We will have to check.
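
As general background on why a set() could cause this kind of symptom: in CPython, string hashes are randomized per interpreter launch unless PYTHONHASHSEED is fixed, so the iteration order of a set of strings can differ between runs. The sketch below illustrates only that generic Python behavior. It is not a diagnosis of the parser, and the names set_order and stable_order and the sample words are assumptions made for illustration.

```python
import os
import subprocess
import sys

# One-liner run in a fresh interpreter: print a set of strings in iteration order.
SNIPPET = "print(list({'pandemic', 'format', 'runway', 'tradition', 'show', 'mix'}))"


def set_order(hash_seed: str) -> str:
    """Return the printed set order from a new interpreter with the given PYTHONHASHSEED."""
    env = dict(os.environ, PYTHONHASHSEED=hash_seed)
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def stable_order(items) -> list:
    """Sorting before iterating makes the order independent of hashing."""
    return sorted(items)


if __name__ == "__main__":
    # Different seeds may (but need not) give different orders.
    for seed in ("1", "2", "3"):
        print(seed, set_order(seed))
    # The same seed reproduces the same order.
    print(set_order("1") == set_order("1"))
    print(stable_order({"pandemic", "format", "runway"}))
```

If code iterates over such a set while building a graph, fixing PYTHONHASHSEED or sorting the items first are two common ways to make the order reproducible.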

tingchihc commented 1 year ago

[Image attachment]

gxxu-ml commented 1 year ago

I ran 100 rounds using the same checkpoint, and all outputs are the same:

# ::tok The coronavirus pandemic has upended the traditional runway format , and in its place a mix of virtual and , in some cases , physical shows with limited audience numbers has started to roll out .
(a / and~10
    :op1 (u / upheaval-01~4
        :ARG0 (p / pandemic~2
            :mod (c2 / coronavirus~1))
        :ARG1 (f / format~8
            :mod (r3 / runway~7)
            :mod (t / tradition~6)))
    :op2 (s4 / start-01~31
        :ARG1 (r2 / roll-out-02~33
            :ARG1 (m / mix-01~15
                :ARG1 (s2 / show-04~25
                    :mod (v / virtual~17))
                :ARG2 (s / show-04~16
                    :mod (c / case-04~22
                        :quant (s3 / some~21))
                    :mod (p2 / physical~24)
                    :prep-with (n / number~29
                        :ARG1-of (l / limit-01~27)
                        :quant-of (a2 / audience~28)))))
        :ARG2-of (r / replace-01~13
            :ARG1 f)))
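
A repeated-run check like this one can be sketched as a small harness that counts distinct outputs. This is a minimal sketch under stated assumptions: count_distinct_outputs and fake_parse are hypothetical names, and fake_parse is a deterministic stand-in so the example runs without a checkpoint.

```python
from collections import Counter
from typing import Callable


def count_distinct_outputs(
    parse_fn: Callable[[str], str], sentence: str, rounds: int = 100
) -> Counter:
    """Call parse_fn on the same sentence `rounds` times and tally each distinct output."""
    return Counter(parse_fn(sentence) for _ in range(rounds))


def fake_parse(sentence: str) -> str:
    """Hypothetical stand-in for a real parser call; always returns the same string."""
    return f"(s / sentence :word-count {len(sentence.split())})"


if __name__ == "__main__":
    counts = count_distinct_outputs(fake_parse, "The pandemic has upended the format.")
    # A single key means every round produced the same output.
    print(len(counts), sum(counts.values()))
```

With the real parser, parse_fn could wrap the parser.tokenize and parser.parse_sentence calls from the script at the top of this issue. Note that repeating calls inside one Python process would not reveal differences that only appear between separate interpreter launches, so it may be worth running the whole script several times as well.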
ramon-astudillo commented 1 year ago

To clarify, @Tverous, this is for the upcoming v0.5.3, which will be released this week. We did not make any fix related to the stochastic behavior, though.

Tverous commented 1 year ago

I appreciate your assistance greatly.

Nonetheless, the results I'm obtaining remain inconsistent despite confirming that the versions of all installed packages align with those detailed in the README.

I'll give it another try following the upcoming release.

P.S. I am running the program in a Docker container.

ramon-astudillo commented 1 year ago

We just updated the version. Let us know if you still encounter the bug.

Tverous commented 1 year ago

The issue has been fixed with the updates.

I'm no longer receiving any inconsistent results.

Thank you so much! I greatly appreciate your help.

I will close the issue now.