delph-in / zhong

The zhong [|] Chinese grammars

Variables in ICONS do not correspond to EPs in the MRS #2

Open goodmami opened 8 years ago

goodmami commented 8 years ago

There are some sentences where variables appearing in the ICONS do not appear as the intrinsic variable (ARG0) of any EP in the MRS. Likely there is some bug in the grammar during MRS construction, e.g. where the original EP gets dropped or something.
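For readers who want to test for this condition programmatically, here is a minimal sketch of the check described above. It does not use any DELPH-IN library; the MRS is represented with plain Python structures, and the function name, the dict/tuple layout, the predicate names, and the sample variables are all hypothetical, chosen only for illustration.

```python
def unbound_icons_variables(eps, icons):
    """Return ICONS variables that are not the ARG0 of any EP.

    eps   -- list of dicts, one per EP, each mapping role names to variables
    icons -- list of (left, relation, right) triples
    """
    # Collect every variable that serves as some EP's ARG0.
    intrinsic = {ep["ARG0"] for ep in eps if "ARG0" in ep}
    unbound = []
    for left, _relation, right in icons:
        for var in (left, right):
            # Keep first-seen order and avoid reporting a variable twice.
            if var not in intrinsic and var not in unbound:
                unbound.append(var)
    return unbound


# Hypothetical MRS fragment: x9 occurs in ICONS but no EP has it as ARG0.
eps = [
    {"predicate": "_cat_n_1", "ARG0": "x4"},
    {"predicate": "_bark_v_1", "ARG0": "e2", "ARG1": "x4"},
]
icons = [("e2", "non-focus", "x4"), ("e2", "focus", "x9")]

print(unbound_icons_variables(eps, icons))
```

A non-empty result would flag the MRS as one of the problematic cases; an empty list means every ICONS variable is anchored to some EP.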

Here are some example sentences from the mrs testsuite:

i-id  result-ids  i-input
24    4, 5, 6     一 只 猫 也 没 吠
29    1, 3, 6     张三 想 知道 是 哪 只 狗 吠 了
30    1, 2        张三 想 知道 李四 吠 没 吠

There are many more in the MRS testsuite. I don't know if they all are caused by the same grammar bug or relate to the same grammatical phenomenon.

These variables become evident because variable properties appear inside the ICONS list. It may be ok that these properties appear here, but they only do (with, say, the ACE parser) when the variable doesn't appear anywhere else. You can search for more examples in the testsuite with this regex: ICONS: <.*\[ (you may need to escape some things differently based on your regex processor; I used vim).
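The same search can be done outside vim. Below is a small sketch using Python's `re` module with the pattern from the comment above; it assumes each MRS is serialized on a single line, and the helper name and the two abbreviated sample strings are hypothetical, not taken from the testsuite.

```python
import re

# The pattern from the comment above, in Python syntax: an "ICONS: <" list
# followed somewhere later on the same line by a "[" (a property list).
ICONS_WITH_PROPS = re.compile(r"ICONS: <.*\[")


def find_suspect_lines(lines):
    """Return (line_number, text) pairs for lines matching the pattern."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if ICONS_WITH_PROPS.search(line)
    ]


# Hypothetical, abbreviated MRS strings for illustration only.
sample = [
    "[ LTOP: h0 INDEX: e2 HCONS: < h0 qeq h1 > ICONS: < e2 non-focus x4 > ]",
    "[ LTOP: h0 INDEX: e2 HCONS: < h0 qeq h1 > ICONS: < e2 focus x9 [ x PERS: 3 ] > ]",
]

for number, text in find_suspect_lines(sample):
    print(number, text)
```

To scan a real file, pass an open file object instead of `sample`, since iterating over it yields lines in the same way.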

sanghoun commented 8 years ago

A quick comment! 吠 is no longer used in the MRS testsuite for Zhong. I think that version is out of date, but I will look into the problem.

Sanghoun

goodmami commented 8 years ago

It is still used in the master branch: https://github.com/delph-in/zhong/blob/master/cmn/zhs/tsdb/gold/mrs/item#L24

Maybe this change is only in your local repository?

sanghoun commented 8 years ago

Oh! This problem!

I haven't updated the gold-mrs yet. That version was created several months ago, and the grammatical modules for the MRS testsuite have been changed frequently. Anyway, let me check the unbound variables.

Sanghoun

sanghoun commented 8 years ago

I found the problem. It comes from the combination of resultative compounds and head-opt-comp. 没 is normally an adverb (a negative operator), pronounced [méi]. But 没 is sometimes used as a verb, pronounced [mò]. So when 没 吠 is analysed as a VV-compound, the problem can occur. Let me fix this with Zhenzhen, because I think the revision requires some intuition about Chinese.

Sanghoun
