natalink / mwe_noske

specifically mark the last token of the MWE #10

Open Ansa211 opened 6 years ago

Ansa211 commented 6 years ago

If we had an mwe_last attribute (besides mwe_first), we could avoid some of the problems caused by queries such as [mwe_order="first"][mwe_order="cont"]{1,} (http://hdl.handle.net/11346/KONTEXT-PARSEME-XC8G), which match a three-token MWE twice.
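
To make the double match concrete, here is a minimal Python sketch that imitates the two queries on a list of mwe_order values. It is not how the corpus engine evaluates CQL. It assumes contiguous MWEs and that every admissible match length is reported, which is the behaviour described above. The function names and the `last` value are illustrative.

```python
def spans_first_cont(orders):
    """All (start, end) spans matching [mwe_order="first"][mwe_order="cont"]{1,}."""
    spans = []
    for start, value in enumerate(orders):
        if value != "first":
            continue
        end = start + 1
        # Every additional "cont" token yields one more admissible match.
        while end < len(orders) and orders[end] == "cont":
            end += 1
            spans.append((start, end))
    return spans


def spans_first_last(orders):
    """All spans matching [mwe_order="first"][mwe_order="cont"]*[mwe_order="last"]."""
    spans = []
    for start, value in enumerate(orders):
        if value != "first":
            continue
        end = start + 1
        while end < len(orders) and orders[end] == "cont":
            end += 1
        # Only a closing "last" token completes the match.
        if end < len(orders) and orders[end] == "last":
            spans.append((start, end + 1))
    return spans


# One three-token MWE, annotated the old way and with a hypothetical "last" value.
old = ["_", "first", "cont", "cont", "_"]
new = ["_", "first", "cont", "last", "_"]

print(spans_first_cont(old))  # the same MWE is reported at two lengths
print(spans_first_last(new))  # the MWE is reported once
```

With a distinct value on the final token, the query can be anchored at both ends, so each MWE yields a single hit.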

languagerecipes commented 6 years ago

Yes, I mentioned it in the paper.

Ansa211 commented 6 years ago

I have implemented this option, but the question is: should we change the values of mwe_order? Up to now there were three possible values: _ for items not in any MWE, and first and cont for items in an MWE. We can either change them to _, first, cont and last, or create a new attribute called mwe_order_new without changing the attribute that we mention in the paper. If it were not for the paper, changing the behaviour would definitely be better (not many people are used to the current behaviour yet). What do you think?
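
For illustration, moving from the three-value scheme to the four-value one could look roughly like the Python sketch below. It is not the implementation mentioned above. It assumes each token carries at most one MWE id, and the function name and the (mwe_id, mwe_order) pair layout are hypothetical.

```python
def mark_last(tokens):
    """Return a copy in which the final token of each multi-token MWE is "last".

    tokens: list of (mwe_id, mwe_order) pairs; mwe_id is None outside MWEs.
    """
    # Remember the position of the final token seen for every MWE id.
    last_index = {}
    for i, (mwe_id, _) in enumerate(tokens):
        if mwe_id is not None:
            last_index[mwe_id] = i

    relabelled = []
    for i, (mwe_id, order) in enumerate(tokens):
        # Only "cont" tokens are relabelled, so a one-token MWE keeps "first".
        if mwe_id is not None and order == "cont" and last_index[mwe_id] == i:
            order = "last"
        relabelled.append((mwe_id, order))
    return relabelled


sentence = [
    (None, "_"),
    (1, "first"),
    (1, "cont"),
    (1, "cont"),
    (None, "_"),
    (2, "first"),
]
print(mark_last(sentence))
```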

languagerecipes commented 6 years ago

It depends on which CQL structure you prefer. Since we have assigned ids to MWEs, there is no need for cont. That is, the intermediate tokens can be pulled out by expressing a condition over ids: a token that is assigned an MWE id is part of an MWE, and if it is neither first nor last, it is cont. So in general, the cont annotation can be inferred from the other annotations.

But if cont is present, the CQL queries can be shorter in some cases, e.g., when extracting all intermediate tokens of MWEs (if that is ever needed).
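
The inference described here can be sketched in a few lines of Python. This only illustrates the logic and is not part of the corpus tooling. The function name and the three parallel input lists (MWE id, first flag, last flag) are assumptions.

```python
def infer_order(mwe_ids, first_flags, last_flags):
    """Derive a single order value per token from ids plus first/last flags."""
    orders = []
    for mwe_id, is_first, is_last in zip(mwe_ids, first_flags, last_flags):
        if mwe_id is None:
            orders.append("_")      # token is outside any MWE
        elif is_first:
            orders.append("first")
        elif is_last:
            orders.append("last")
        else:
            orders.append("cont")   # inside an MWE, neither first nor last
    return orders


ids = [None, 1, 1, 1, None]
first = [False, True, False, False, False]
last = [False, False, False, True, False]
print(infer_order(ids, first, last))
```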

Ansa211 commented 6 years ago

Well, I was not thinking of dividing mwe_order into mwe_first and mwe_last, even though that would also be possible (with mwe_cont marked implicitly by neither of the two attributes having a value, as you suggest). A single attribute seems easier to use to me. My question is more along the lines of "can we change the possible values of mwe_order (just) after we have published a paper in which we describe the old behaviour?" The good thing is that we could warn about this change on the poster, which would save some people the inevitable confusion. Is that enough? Or would it be better not to touch mwe_order and rather add a new attribute?

languagerecipes commented 6 years ago

I think we can take the online page as the reference and the most up-to-date version.
