UniversalDependencies / UD_Danish-DDT

Creative Commons Attribution Share Alike 4.0 International

Clefts in Scandinavian #11

Open jnivre opened 8 years ago

jnivre commented 8 years ago

There seems to be an inconsistency in the treatment of clefts in Scandinavian treebanks.

det är X som/der ... VERB

Da: root(ROOT, X) nsubj(X, det) cop(X, är) acl:relcl(det, VERB)

Sv: root(ROOT, är) expl(är, det) dislocated(är, X) acl:relcl(X, VERB)

No: ???

liljao commented 8 years ago

The corresponding Norwegian analysis of clefts is:

root(ROOT, er) expl(er, det) nsubj(er, X) acl:relcl(X, VERB)

So, mostly like the Swedish analysis except for the relation of X. The analysis is described in the documentation with an example here.
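To make the comparison concrete, here is a minimal sketch that writes the three analyses quoted so far as sets of (relation, head, dependent) triples and diffs them. The triples are copied from the thread; the placeholder `COP` (standing for är/er) and the helper `diff` are only illustrative names, not part of any UD tool.

```python
# Each analysis as a set of (relation, head, dependent) triples, as quoted in the thread.
# COP is a placeholder for the copula (är / er); X is the clefted element.
ANALYSES = {
    "Da": {("root", "ROOT", "X"), ("nsubj", "X", "det"),
           ("cop", "X", "COP"), ("acl:relcl", "det", "VERB")},
    "Sv": {("root", "ROOT", "COP"), ("expl", "COP", "det"),
           ("dislocated", "COP", "X"), ("acl:relcl", "X", "VERB")},
    "No": {("root", "ROOT", "COP"), ("expl", "COP", "det"),
           ("nsubj", "COP", "X"), ("acl:relcl", "X", "VERB")},
}


def diff(a, b):
    """Return the triples found only in analysis a and only in analysis b."""
    return sorted(ANALYSES[a] - ANALYSES[b]), sorted(ANALYSES[b] - ANALYSES[a])


only_sv, only_no = diff("Sv", "No")   # differ only in the relation of X
only_da, only_sv2 = diff("Da", "Sv")  # share no triple at all
print(only_sv, only_no)
print(len(only_da), len(only_sv2))
```

On these triples, Swedish and Norwegian differ in a single edge (the relation of X), while the Danish analysis shares no edge with the Swedish one.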

jnivre commented 8 years ago

Thanks! Does the relation of X reflect its relation in the underlying unclefted sentence, so that it would be dobj in something like:

det var Pelle som jag såg

liljao commented 8 years ago

The original analysis in NDT has X as a PSUBJ (a "potential subject") and there is a corresponding POBJ relation which I guess would be used for these types of Xs, but I have not been able to dig up any example. Seeing it now, however, I think there is some loss of information in the conversion of the PSUBJ to nsubj, so a dislocated analysis might be a good choice to distinguish these from regular subjects. Does the Swedish analysis distinguish dislocated objects from subjects in any way?

A small aside: the original analysis actually distinguishes focus clefts from presentational clefts, and it is only in the latter case that the X is a PSUBJ. For the focus clefts (e.g. Det er et ubeskrivelig syn som møter ham) the X is a SPRED (subject predicative), which gives rise to a regular copula analysis. The consequence is that only the presentational clefts have the analysis outlined above. Is there a similar distinction between different types of clefts in other UD treebanks?

jnivre commented 8 years ago

Thanks, Lilja. I considered using language-specific subtypes "dislocated:nsubj" and "dislocated:dobj" but in the end decided against it, because I don't think this is what subtypes are for. Possibly, what we should do is just have "dislocated" in the basic dependencies but add an "nsubj" relation (from the VERB) in the enhanced dependencies. This is yet another issue where guidelines for basic dependencies are dependent on (future) decisions about the enhanced dependencies. I therefore think that v2 of the guidelines needs to have at least a rudimentary version of the enhanced dependencies too.
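As a rough illustration of the basic-plus-enhanced idea, the sketch below keeps the Swedish basic triples unchanged and adds one extra edge from the VERB to X for the enhanced layer. This is only one possible reading of the proposal: the helper `enhance` is hypothetical, and the role of X in the unclefted clause is passed in by hand rather than derived automatically.

```python
# Basic analysis for Swedish clefts, as quoted earlier in the thread.
BASIC_SV = {
    ("root", "ROOT", "är"),
    ("expl", "är", "det"),
    ("dislocated", "är", "X"),
    ("acl:relcl", "X", "VERB"),
}


def enhance(basic, x_role):
    """Return the basic triples plus one extra edge (x_role, VERB, X).

    x_role is the relation X would bear in the unclefted sentence
    (e.g. "nsubj" or "dobj"); it is supplied by hand (an assumption).
    """
    return set(basic) | {(x_role, "VERB", "X")}


enhanced_subj = enhance(BASIC_SV, "nsubj")  # det är X som VERB ...
enhanced_obj = enhance(BASIC_SV, "dobj")    # det var Pelle som jag såg
print(sorted(enhanced_subj - BASIC_SV))
print(sorted(enhanced_obj - BASIC_SV))
```

The basic layer stays identical for subject and object clefts; only the added enhanced edge tells them apart.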

I am not aware of any treebank that draws a distinction between presentational and focus clefts. I am not even sure that I am able to draw the distinction myself. :)

jnivre commented 8 years ago

Any more thoughts on this?

liljao commented 8 years ago

I will try to conform the Norwegian data to the analysis adopted for Swedish, i.e. changing nsubj to dislocated.
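A rough sketch of what such a relabeling could look like, assuming tokens are held as simple dicts with `id`, `head` and `deprel` fields. This is not the actual conversion code for the Norwegian treebank. The function name and the matching heuristic (an nsubj whose head also has an expl dependent and which itself governs an acl:relcl) are assumptions for illustration, and real data would need manual checking.

```python
def relabel_cleft_subjects(tokens):
    """Return a copy of tokens with cleft-like nsubj relabeled as dislocated.

    Heuristic (an assumption): a token is relabeled when it is an nsubj,
    its head also has an expl dependent, and it governs an acl:relcl.
    """
    heads_with_expl = {t["head"] for t in tokens if t["deprel"] == "expl"}
    heads_of_relcl = {t["head"] for t in tokens if t["deprel"] == "acl:relcl"}
    out = []
    for tok in tokens:
        tok = dict(tok)  # copy so the input is left untouched
        if (tok["deprel"] == "nsubj"
                and tok["head"] in heads_with_expl
                and tok["id"] in heads_of_relcl):
            tok["deprel"] = "dislocated"
        out.append(tok)
    return out


# The abstract Norwegian pattern from the thread: det er X (som) VERB
cleft = [
    {"id": 1, "form": "det", "head": 2, "deprel": "expl"},
    {"id": 2, "form": "er", "head": 0, "deprel": "root"},
    {"id": 3, "form": "X", "head": 2, "deprel": "nsubj"},
    {"id": 4, "form": "VERB", "head": 3, "deprel": "acl:relcl"},
]
converted = relabel_cleft_subjects(cleft)
print([t["deprel"] for t in converted])
```

An ordinary subject without the expl and acl:relcl context is left as nsubj by this heuristic.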

jnivre commented 8 years ago

I will close this issue and open a new issue to fix Danish for the next release.

jnivre commented 8 years ago

I am inclined to say that this is a bug in UD_Danish to be fixed for version 2. Any other ideas? I would be happy to assign this issue to someone from the UD_Danish team, but I don't know who.

KennethEnevoldsen commented 1 year ago

@jnivre I can take a look at it if you wish?

tlynn747 commented 1 year ago

A bit late to the discussion... but you're welcome to look at how we handle clefts in Irish.

https://universaldependencies.org/ga/dep/csubj-cleft.html

We don't use acl:relcl because the clause is not relativising the fronted element. In Irish we can front nouns, prepositional phrases, adverbial phrases, adjectives and verbal nouns.

The trees in our Irish examples need some improvements - I'll get around to it!

nschneid commented 1 year ago

The recently implemented policy in English is advcl:relcl: https://universaldependencies.org/en/dep/acl-relcl.html#clefts

Stormur commented 1 year ago

I can also point to the minidocumentation I wrote for Latin: https://universaldependencies.org/la/dep/csubj-cleft.html (we are sharing the relation with Irish).

The structure of all these clefts is really the same. I think it is also crucial to treat the copula as the functional element it is, and not as the root, which makes the structure intractable.

KennethEnevoldsen commented 11 months ago

related to #3