jcyk / AMR-gs

AMR Parsing via Graph-Sequence Iterative Inference
MIT License
70 stars · 21 forks

Problem with recognizing unseen NEs #5

Open flipz357 opened 4 years ago

flipz357 commented 4 years ago

Hi Deng,

I am now able to parse arbitrary sentences with your pre-trained model, thanks to your valuable tips!

My pipeline looks as follows:

  1. annotate_features.sh directory
  2. preprocess_2.0.sh
  3. work.sh
  4. postprocess_2.0.sh ckpt.pt_test_out.pred

That writes a file ckpt.pt_test_out.pred.post, which I assume is the final result.

However, the quality of the parses for arbitrary sentences is not very good. The parser seems to struggle with unseen named entities, and this leads to errors. Named entities that appear in the training data are recognized perfectly, but new ones are not. Here is an example with a short sentence:

# ::id dasd
# ::snt In November 1581, Feodor's elder brother Ivan Ivanovich was killed by their father in a fit of rage.
# ::tokens ["In", "November", "1581", ",", "Feodor", "'s", "elder", "brother", "Ivan", "Ivanovich", "was", "killed", "by", "their", "father", "in", "a", "fit", "of", "rage", "."]
# ::lemmas ["in", "November", "1581", ",", "Feodor", "'s", "elder", "brother", "Ivan", "Ivanovich", "be", "kill", "by", "they", "father", "in", "a", "fit", "of", "rage", "."]
# ::pos_tags ["IN", "NNP", "CD", ",", "NNP", "POS", "JJR", "NN", "NNP", "NNP", "VBD", "VBN", "IN", "PRP$", "NN", "IN", "DT", "NN", "IN", "NN", "."]
# ::ner_tags ["O", "DATE", "DATE", "O", "PERSON", "O", "O", "O", "PERSON", "PERSON", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O"]
# ::abstract_map {}
(c0 / kill-01
      :ARG0 (c2 / person
            :ARG0-of (c7 / have-rel-role-91
                  :ARG1 (c11 / person)
                  :ARG2 (c13 / brother)))
      :ARG1 (c1 / person
            :ARG0-of (c6 / have-rel-role-91
                  :ARG0 c2
                  :ARG2 c11
                  :ARG2 (c12 / father))
            :mod (c5 / elder
                  :domain c2))
      :time (c3 / name
            :op1 "november"
            :op2 "ivan"
            :op3 "1581")
      :time (c4 / rage-02
            :ARG1 c1))

Apparently, the parser has struggled with the new named entities. Because of this, the parse also contains many other errors (e.g., two :ARG2 edges under c6). It has not even detected tsar "Feodor".

Here is an example of a longer sentence, this time from the US press.

# ::id sda sd
# ::snt The Grand Slam at Flushing Meadows is scheduled to begin on August 31, but with New York one of the cities hardest hit by coronavirus there are doubts over whether the tournament can take place.
# ::tokens ["The", "Grand", "Slam", "at", "Flushing", "Meadows", "is", "scheduled", "to", "begin", "on", "August", "31", ",", "but", "with", "New", "York", "one", "of", "the", "cities", "hardest", "hit", "by", "coronavirus", "there", "are", "doubts", "over", "whether", "the", "tournament", "can", "take", "place", "."]
# ::lemmas ["the", "Grand", "Slam", "at", "Flushing", "Meadows", "be", "schedule", "to", "begin", "on", "August", "31", ",", "but", "with", "New", "York", "one", "of", "the", "city", "hardest", "hit", "by", "coronavirus", "there", "be", "doubt", "over", "whether", "the", "tournament", "can", "take", "place", "."]
# ::pos_tags ["DT", "NNP", "NNP", "IN", "NNP", "NNP", "VBZ", "VBN", "TO", "VB", "IN", "NNP", "CD", ",", "CC", "IN", "NNP", "NNP", "CD", "IN", "DT", "NNS", "RBS", "VBN", "IN", "NN", "EX", "VBP", "NNS", "IN", "IN", "DT", "NN", "MD", "VB", "NN", "."]
# ::ner_tags ["O", "MISC", "MISC", "O", "LOCATION", "LOCATION", "O", "O", "O", "O", "O", "DATE", "DATE", "O", "O", "O", "STATE_OR_PROVINCE", "STATE_OR_PROVINCE", "NUMBER", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O"]
# ::abstract_map {}
(c0 / have-concession-91
      :ARG1 (c1 / schedule-01
            :ARG1 (c3 / begin-01)
            :ARG3 (c4 / date-entity
                  :mod 31
                  :mod (c9 / august)))
      :ARG2 (c2 / doubt-01
            :ARG1 (c5 / possible-01
                  :ARG1 (c10 / take-01
                        :ARG1 (c12 / grand)
                        :ARG1 (c13 / manufacture-01)))
            :ARG1-of (c6 / cause-01
                  :ARG0 (c11 / include-91
                        :ARG2 (c14 / city
                              :ARG1-of (c15 / hit-01
                                    :ARG0-of (c17 / near-02
                                          :degree (c18 / most))
                                    :ARG2 (c16 / coronavirus)))))
            :topic (c7 / slam-02
                  :ARG1 c12)))

Again, it has not properly recognized any named entity and, possibly because of this, has made many other errors (e.g., Grand Slam --> "grand manufacture").

One last example, from sports.


# ::id sada
# ::snt Zverev joked that he had been persuaded to play at the tournament by a threat from Djokovic that he would never let him win against him otherwise.
# ::tokens ["Zverev", "joked", "that", "he", "had", "been", "persuaded", "to", "play", "at", "the", "tournament", "by", "a", "threat", "from", "Djokovic", "that", "he", "would", "never", "let", "him", "win", "against", "him", "otherwise", "."]
# ::lemmas ["Zverev", "joke", "that", "he", "have", "be", "persuade", "to", "play", "at", "the", "tournament", "by", "a", "threat", "from", "Djokovic", "that", "he", "would", "never", "let", "he", "win", "against", "he", "otherwise", "."]
# ::pos_tags ["NNP", "VBD", "IN", "PRP", "VBD", "VBN", "VBN", "TO", "VB", "IN", "DT", "NN", "IN", "DT", "NN", "IN", "NNP", "IN", "PRP", "MD", "RB", "VB", "PRP", "VB", "IN", "PRP", "RB", "."]
# ::ner_tags ["PERSON", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "PERSON", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O"]
# ::abstract_map {}
(c0 / joke-01
      :ARG0 (c1 / person 
            :ARG0-of (c3 / have-org-role-91
                  :ARG2 (c8 / zverev))
            :ARG0-of (c4 / play-01
                  :ARG1 (c9 / date-entity))
            :ARG0-of (c5 / let-01
                  :ARG1 (c6 / win-01
                        :ARG2 c1)
                  :time (c7 / threaten-01
                        :ARG0 (c12 / company)
                        :ARG2 c1)
                  :time (c10 / otherwise
                        :op1 c6)
                  :time (c11 / ever)))
      :ARG2 (c2 / persuade-01
            :ARG0 c7
            :ARG1 c1))

Again, none of the named entities were recognized, and the parser has hallucinated new concepts (e.g., "(c12 / company)"). The famous tennis player Djokovic does not even occur in the parse.

These sentences were sampled at random; all my outputs look more or less like this. Do you have any idea where the problem could be? It doesn't seem to be in the post-processing: the NE errors are already present (mostly, as far as I can tell) in the parser output file ckpt.pt_test_out/ckpt.pt_test_out.pred.

Could it be because the # ::abstract_map {} is always empty?

jcyk commented 4 years ago

@fliegenpilz357 Yes, I think you are right: the problem may be the empty abstract_map. FYI, the preprocessing scripts for named entities are borrowed from https://github.com/sheng-z/stog. I will try to find some time to investigate this issue.

Let me know if you have any new findings. Thank you.

UPDATE: This should be related to pre-processing and post-processing. That is, if the abstract_map is empty, the entities are lost at the pre-processing stage, and neither the parser nor the post-processing can do anything about it.

flipz357 commented 4 years ago

I also think that it has something to do with the pre-processing from https://github.com/sheng-z/stog.

Predicting the LDC test sentences seems to work fine; for some reason, it is only new, arbitrary sentences that don't work well.

Again, thanks for the quick answer. I will let you know if I find something out about this.

mkartik commented 4 years ago

Hi Deng,

I also came across this issue of the abstract_map not getting populated for unseen named entities while working perfectly for those in the LDC dataset. While analyzing the pre-processing code, specifically the text_anonymizor.py file, I observed that named entities are compared against a pre-built dictionary in the 'text_anonymization_rules.json' file (part of the amr_2.0_utils folder), and the abstract_map is populated only if a match is found.
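For intuition, the gating behavior described above can be sketched roughly like this. This is an illustrative simplification, not the actual stog code; the function name, arguments, and `rules` structure are all made up for the sketch:

```python
def anonymize(tokens, ner_tags, rules):
    """Replace named-entity spans with placeholders like PERSON_1,
    but only when the span text is found in the rules dictionary."""
    abstract_map = {}
    out = []
    counts = {}
    i = 0
    while i < len(tokens):
        tag = ner_tags[i]
        if tag != "O":
            # collect the full contiguous span with the same NER tag
            j = i
            while j < len(tokens) and ner_tags[j] == tag:
                j += 1
            span = " ".join(tokens[i:j])
            if span in rules:  # only a dictionary hit gets anonymized
                counts[tag] = counts.get(tag, 0) + 1
                placeholder = f"{tag}_{counts[tag]}"
                abstract_map[placeholder] = {
                    "type": "named-entity", "span": span,
                    "ner": tag, "ops": span,
                }
                out.append(placeholder)
                i = j
                continue
        out.append(tokens[i])
        i += 1
    return out, abstract_map
```

Under this sketch, a name listed in the rules becomes `PERSON_1` with an abstract_map entry, while an unlisted name passes through untouched and the abstract_map stays empty, matching the symptom in this issue.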

A work-around would be to regenerate the amr_2.0_utils, using the script provided here: https://github.com/sheng-z/stog/issues/3#issuecomment-639067118

flipz357 commented 4 years ago

Thanks @mkartik

But that doesn't explain why it would work well for the test partition of LDC, does it?

Anyway, may I ask whether you have tried this workaround and were able to parse unseen sentences with better quality?

jcyk commented 4 years ago

@mkartik Thanks for pointing it out! However, does the regeneration require gold AMRs? If so, it cannot solve our problem here. Do you think it is possible to relax the "only if a match is found" condition by modifying the TextAnonymizor?

mkartik commented 4 years ago

@fliegenpilz357, the entities (namely named entities) in the test partition of LDC are similar to those in the train and dev partitions, which I suppose is why it works for the LDC test partition. I haven't yet tried the workaround for regenerating the amr_utils, but I will do so in the coming days.

@jcyk I guess regeneration would not require gold AMRs, and we would be able to regenerate using our own annotated training data. It might be possible to update the '_replace_span' function in the text_anonymizor.py file to relax the match constraint, but I haven't explored that option yet.
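One hypothetical way to relax the constraint, as discussed, would be to trust the NER tagger and anonymize every contiguous entity span whether or not it appears in the rules dictionary. A sketch under that assumption (not stog's actual API; `ENTITY_TAGS` and the function name are invented here):

```python
# Tags we choose to anonymize; TITLE etc. are deliberately left alone.
ENTITY_TAGS = {"PERSON", "LOCATION", "ORGANIZATION", "MISC"}

def anonymize_by_ner(tokens, ner_tags):
    """Anonymize every contiguous entity span based on NER tags alone,
    with no dictionary lookup, so unseen names are still abstracted."""
    abstract_map, out, counts = {}, [], {}
    i = 0
    while i < len(tokens):
        tag = ner_tags[i]
        if tag in ENTITY_TAGS:
            j = i
            while j < len(tokens) and ner_tags[j] == tag:
                j += 1
            span = " ".join(tokens[i:j])
            counts[tag] = counts.get(tag, 0) + 1
            placeholder = f"{tag}_{counts[tag]}"
            abstract_map[placeholder] = {
                "type": "named-entity", "span": span,
                "ner": tag, "ops": span,
            }
            out.append(placeholder)
            i = j
        else:
            out.append(tokens[i])
            i += 1
    return out, abstract_map
```

The trade-off is that the parser then inherits every NER-tagger mistake, whereas the dictionary gate at least guarantees precision on known names.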

flipz357 commented 4 years ago

Thanks a lot @mkartik for this helpful investigation!

Yes, unfortunately there is a significant overlap between train NEs and test NEs. But I think that still does not explain everything, since the overlap is not 100%. For example, I have just performed the following little experiment.

  1. I searched for an LDC test sentence that contains both NEs seen in training and NEs not seen in training. Here is such a sentence:

Guofang Shen , The foreign ministry spokesperson , announced at a news conference held this afternoon that President Gentzs Aerpade of the Hungary Republic , would pay a State visit to China from September 14th to the 17th at the invitation of president Zemin Jiang .

grep "Guofang Shen" data/AMR/amr_2.0/train.txt | wc -l

= 0 (likewise on dev, and likewise for "Gentzs Aerpade")

  2. I (very slightly) manipulated the sentence, changing only the NEs that are not seen in train or dev ("Guofang Shen" and "Gentzs Aerpade").

Expectation: this should not matter (much) for the parser/preprocessing.

Result: it matters a lot. The preprocessing/parser performs perfectly on the original sentence even though it contains entities not seen in train or dev, but it performs much worse when those unseen named entities are changed to other unseen named entities.

Example:

[EDIT: removed long complicated example, see simplified example in my next post below]

jcyk commented 4 years ago

@fliegenpilz357 Hi, I think this example is too complicated. Do you mean changing "Guofang Shen" to "Hua Chunying"? And the results show that "Guofang Shen" is recognized but "Hua Chunying" is not, even though they are both unseen entities?

flipz357 commented 4 years ago

Yes, that's what I mean. Sorry for the long, complicated example; I have now simplified it a lot, so I hope it is clearer.

If I am not mistaken, "Guofang Shen" and "Gentzs Aerpade" both appear only in the LDC test set, not in train/dev, so replacing them should not have any (major) effect.

# ::id ORIGINAL - shortened
# ::snt Guofang Shen calls President Gentzs Aerpade .
# ::tokens ["PERSON_1", "calls", "President", "PERSON_2", "."]
# ::lemmas ["PERSON_1", "call", "President", "PERSON_2", "."]
# ::pos_tags ["NNP", "VBZ", "NNP", "NNP", "."]
# ::ner_tags ["PERSON", "O", "TITLE", "PERSON", "O"]
# ::abstract_map {"PERSON_1": {"type": "named-entity", "span": "Guofang Shen", "ner": "PERSON", "ops": "Guofang Shen"}, "PERSON_2": {"type": "named-entity", "span": "Gentzs Aerpade", "ner": "PERSON", "ops": "Gentzs Aerpade"}}
(c0 / call-02
      :ARG0 (c2 / person
            :name (c5 / name
                  :op1 "Guofang"
                  :op2 "Shen")
            :wiki -)
      :ARG1 (c1 / person
            :ARG0-of (c4 / have-org-role-91
                  :ARG2 (c6 / president))
            :name (c3 / name
                  :op1 "Gentzs"
                  :op2 "Aerpade")
            :wiki -))

# ::id ORIGINAL - shortened NE changed
# ::snt Hua Chunying calls President Viktor Orban .
# ::tokens ["Hua", "Chunying", "calls", "President", "Viktor", "Orban", "."]
# ::lemmas ["Hua", "Chunying", "call", "President", "Viktor", "Orban", "."]
# ::pos_tags ["NNP", "NNP", "VBZ", "NNP", "NNP", "NNP", "."]
# ::ner_tags ["PERSON", "PERSON", "O", "TITLE", "PERSON", "PERSON", "O"]
# ::abstract_map {}
(c0 / call-02
      :ARG0 (c2 / person)
      :ARG1 (c1 / person
            :ARG0-of (c3 / hua)
            :ARG0-of (c4 / have-org-role-91)))

As can be seen, there is already some manipulation going on at the tokenization step (# ::tokens). In the end, this may come down to the empty abstract_map, which then propagates the error into the parser.

jcyk commented 4 years ago

@fliegenpilz357 That is very interesting! "Guofang Shen" is in the abstract_map, but "Hua Chunying" is not.

As pointed out by @mkartik, an entity can be recognized only if a match is found, so "Guofang Shen" must be included in the rules. I suspect the original author used the test set when building their utils (text_anonymization_rules.json). However, as long as building text_anonymization_rules.json does not rely on gold AMRs, that is acceptable.

By the way, since the problem lies in the pre-processing and post-processing, which are simply borrowed from the stog repo, I think it is better to discuss it with the original authors of stog.

flipz357 commented 4 years ago

Yes, it is very interesting, and I agree that the bug is not exactly in your parser but in the stog pre-processing. As I see now, some similar issues have already been reported over there. Unfortunately, Sheng seems very busy and does not respond (much). So, @jcyk, I greatly appreciate your help and quick answers! I don't expect you to fix this, but if by any chance you find out what is causing the pre-processing trouble, it would be awesome if you could let me know or update this issue.

jcyk commented 4 years ago

@fliegenpilz357 As suggested by @mkartik, one possible solution is to regenerate 'text_anonymization_rules.json'. Another possible solution is to modify the preprocessing rules (I suspect we had better remove this inconvenient dependency entirely). I do plan to investigate both options, but maybe not soon. So please do let me know if you find a good solution. Thanks!

Note that the parser is only ever exposed to anonymized entities; it cannot handle specific names. Therefore, if the preprocessing does not catch an entity, the parser can do nothing about it.
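To make that concrete: by the post-processing stage, the abstract_map entry is the only information available to restore a name behind a placeholder like PERSON_2; if the map is empty, there is nothing to recover. A hedged sketch of this de-anonymization step (assumed behavior and invented names, not the actual repo code):

```python
def expand_placeholder(placeholder, abstract_map, var="c1"):
    """Rebuild an AMR name fragment for a placeholder (e.g. PERSON_2)
    from its abstract_map entry; fall back to a bare concept if absent."""
    entry = abstract_map.get(placeholder)
    if entry is None:
        return f"({var} / thing)"  # empty map: the name is unrecoverable
    ops = " ".join(
        f':op{k} "{tok}"' for k, tok in enumerate(entry["ops"].split(), 1)
    )
    return f'({var} / person :name (n / name {ops}) :wiki -)'
```

With the populated abstract_map from the "Guofang Shen" example above, PERSON_2 would expand back to the full `:name` structure; with the empty map from the "Hua Chunying" example, the name is simply gone.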

Also please note that there is a no-graph-recategorization version of the parser in this repo, though with slightly worse accuracy in the official evaluation. Since it does not rely on such hard-coded rules, it will not have this problem (and may even generalize better?).

Best, Deng


flipz357 commented 4 years ago

@jcyk I can confirm that it works much better with the non-GR model. Thanks for your quick advice, and again, congratulations on your very awesome work! I actually like the non-GR approach a lot more than the one depending on stog's super-complicated anonymization, even though it is slightly worse in Smatch. I even suspect it generalizes better than the 80-Smatch model, but that is hard to tell without knowing the exact bug in the stog pre-processing or fully understanding its complicated pipeline.

Hence, I'd still leave this issue open. It might be interesting for other people too, and perhaps someone will find the bug in the pre-processing, so that the 80-Smatch model can also be used on unseen sentences/entities.