Closed amir-zeldes closed 5 months ago
I admit that when adding the validation rule, I assumed that it would spark discussion and possibly it would have to be loosened. Though the example you give is quite beyond my imagination :-) I have moved the issue to the docs repository because it is about precise interpretation of the guidelines (as most validation issues). The validator just tries to make sure that people can make assumptions about the data, if the assumptions follow from the documentation. The immediate impulse to introduce this rule was when I saw an orphan attached to another orphan (people probably forgot that the promoted orphan should not be attached via this relation).
In my understanding, the orphan relation was introduced solely for gapping and stripping. That is, the missing node is a predicate and the orphaned dependents are its arguments or adjuncts. It typically occurs in coordination, so the promoted argument is attached to the first predicate as conj; but the validator would also accept parataxis (which some people use instead of coordination), root (because the source can be in the previous sentence) and even advcl (because subordination can generate similar situations). It seems quite plausible that a similar pattern could occur also with reparandum, so I think I should add it. But I am not convinced that your particular example should involve the orphan relation. The guidelines assume that if the main verb is elided and an auxiliary remains, then the auxiliary is promoted and the other dependents are attached to it as if it was the main verb; which would result in the relation nsubj(have, they), which you reject.
I whitelisted reparandum in https://github.com/UniversalDependencies/tools/commit/e8178934276e8e452ebf6f2db71d09a48f316d57. Errors of this type are currently not reported in Coptic. I will leave this issue open for a while so that others can contribute to a better understanding of the orphan relation.
Thanks! The idea of promoting the auxiliary makes a lot of sense in a language like English, where the auxiliary is itself a verb. In the Coptic example, the element in question is really just a functional auxiliary, with no chance of being used as a verb, so it seems a little stranger to promote it, rather than a core argument. In terms of parallels elsewhere in the corpus, orphan is often a dependent of the subject, which gets first choice as the argument to promote. For that reason I would be inclined to keep the subject as the head and say that it governs all other dependents of the missing verb, in this case also the tense marker.
The point is not that the auxiliary can be used as a main verb but that it belongs to the same nucleus (in Tesnière's sense) as the main verb. As long as there is something left of the verb group, we prefer to let this represent the verb group so that other dependents can retain their true dependency relations (rather than "orphan"). It is for the same reason that we, for example, promote determiners to head elliptic noun phrases even though they can never head an ordinary noun phrase.
Both subject and auxiliary are dependents of the (missing) main verb - in my opinion the question is only which would we rather promote. Either way some information will be lost:
orphan)
reparandum relation), but we retain the information about ellipsis, since its dependent will be orphan, the same relation used in other cases where a subject stands in for a missing verb
One problem with 1. for Coptic is that auxiliaries aren't present in all tenses, so we would end up with situations in which we have ellipsis and a. we do have orphan, but aux is the head; b. we do have orphan, and nsubj is the head (since there is no aux) and c. where there is no orphan at all. Promoting the subject uniformly seems like a better choice for the data we have.
I agree that in languages where the auxiliary is a finite verb (like English) it is more intuitive to promote the auxiliary, but in this case it seems like more information would be discarded, and a very odd government pattern would result (AUX ->nsubj PRON, which is impossible in Coptic, and with no trace of an orphan)
@amir-zeldes : Just a note – orphan is not a means to mark ellipsis. Most instances of ellipsis are simply lost in UD (that is, they are hidden due to promotion). The purpose of orphan is to avoid certain relations that only occur in some instances of ellipsis and that would be "too strange". So if nsubj(AUX, PRON) is "too strange" in Coptic, then this is possibly the argument in favor of orphan. (But then we would have to document it. The aux relation is not listed in our obliqueness hierarchy, for example.)
To come back to the main topic of this issue, non-constituent conjuncts are quite common with reformulations. Examples:
it is a good a very good question
you said something about the about my question
I think that you that we must go
If we use reparandum for such cases, the reparandum phrase (in italic in the examples) certainly needs an orphan relation.
@sylvainkahane I think that the three examples you gave would be solved without orphan according to the UD guidelines, by simple promotion of one of the orphaned dependents.
det(good-4, a-3); det(question, a-5); advmod(good-7, very); amod(question, good-7); reparandum(good-7, good-4)
case(the, about-4); case(question, about-6); det(question, my); reparandum(my, the)
mark(you-4, that-3); mark(go, that-5); nsubj(go, we); reparandum(we, you)
https://github.com/UniversalDependencies/docs/issues/635#issuecomment-497458687 :
In my understanding, the orphan relation was introduced solely for gapping and stripping. That is, the missing node is a predicate and the orphaned dependents are its arguments or adjuncts. It typically occurs in coordination, so the promoted argument is attached to the first predicate as conj; but the validator would also accept parataxis (which some people use instead of coordination), root (because the source can be in the previous sentence) and even advcl (because subordination can generate similar situations). It seems quite plausible that a similar pattern could occur also with reparandum, so I think I should add it. But I am not convinced that your particular example should involve the orphan relation.
If advcl is okay, is acl any different? If I understand correctly, it can be a subordinate clause with its own potentially gapped predicate, the same as advcl?
E.g., in Latvian saying viņš ēd tos ābolus, ko pirms tam [ēda] tārpi ('he eats the same apples, which were [eaten] by worms before that') is rather plausible.
Sounds good to me. Added acl.
And what about other subordinate clauses, csubj and ccomp? Latvian sometimes just omits the verb in the subordinate clause, even if it is not explicitly repeated, but just easy to deduce from all other parts in that clause. We got the sentence atjēdzos, ka bez angļu valodas nekur [netikšu] '[I] realised, that [I will get] nowhere without English' in our data. For us it felt most natural to use an analysis with ellipsis here, but is this appropriate for UD?
Well, perhaps all deprels that can mark incoming edges to heads of clauses make the heads technically eligible for outgoing orphan edges? Although this example seems even further from prototypical gapping. What do others think about this (@jnivre @manning @sebschu)?
Note that I don't doubt that this actually is ellipsis; but most types of ellipsis are annotated in UD without using the orphan relation. So I think the question is not whether it is ellipsis but rather if it is (sufficiently similar to) gapping.
Yes, this is an interesting data point that we haven't considered so far. I always consider orphan to be appropriate when there is an elided predicate with multiple dependents and in our UDW-17 paper, we argued (like Gerdes and Kahane, 2015) that this also includes sentences with elided predicates where the predicate only appears in a preceding sentence (rather than in another clause in the same sentence).
It seems like this case is a little different since the predicate does not necessarily appear anywhere in the preceding discourse (if I understood correctly) but it still fulfills the criterion of a missing predicate with multiple dependents. So in short, yes, I think using ccomp to attach English and orphan to attach the complementizer and nowhere would be the right call for this sentence.
In Classical Chinese, orphan occurs very rarely, and where it does, the relation was originally cc before stripping. A typical example: "學而習" (study and practice) became "學而" in a chapter title. In this case, the conj in 學―conj→習 has gone away, and 而←cc―習 has become orphan. How should we handle this for validation?
This does not look like a case for orphan to me. Even in clauses where orphan is used, it does not replace a cc relation. In the gapping examples in the guidelines, the promoted heads in the gapped clauses still have cc children.
One possibility would be to simply attach the conjunction to the remaining verb, i.e., to the left: 學―cc→而. But that would mean we do not see any ellipsis there.
If you know there is a verb missing, the standard way is to pick one of the nodes that would depend on it, and promote it as the substitute head of the clause. The clause is still connected to its parent node with the relation that holds between the two clauses, i.e., conj. But as the clause is now represented by a substitute head node, the relation leads to this new head. In our case, only one node is left from the clause, and it is the conjunction. Therefore the conjunction will be promoted and we will have 學―conj→而.
Thank you for your comment, @dan-zeman. I've tried acl to link to the "parents" of the orphans. I understand that this is not a good choice, but the only two orphans causing our problem can be resolved this way for now.
The constraint on the parent of orphan leaves me with a bit of a problem for cases like this:
opgesplitst in een Vlaamse en een Franstalige partij
split in a Flemish and a French-speaking party
obl(opgesplitst,Vlaamse) orphan(Vlaamse,een-1) cc(en,partij) conj(Vlaamse,partij)
Should we allow configurations like this? The only alternative I see is accepting det(Vlaamse,een) (Vlaamse is an adjective)
Any suggestions?
I only noticed now that @jnivre wrote:
we promote determiners to head elliptic noun phrases even though they can never head an ordinary noun phrase.
So is that the solution here?
So is that the solution here?
Yes.
opgesplitst in een Vlaamse en een Franstalige partij
obl(opgesplitst, Vlaamse) case(Vlaamse, in) det(Vlaamse, een) conj(Vlaamse, partij) cc(partij, en) det(partij, een) amod(partij, Franstalige)
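For readers who want to compare the two Dutch analyses mechanically, here is a minimal sketch that parses the `deprel(head, dependent)` notation used in this thread into triples. The helper name `parse_relations` and the regular expression are illustrative assumptions, not part of any UD tool; note that the notation alone does not distinguish the two instances of "een".

```python
import re

# Matches items of the form deprel(head, dependent), e.g. det(Vlaamse, een).
REL = re.compile(r"([\w:]+)\(\s*([^,()]+?)\s*,\s*([^,()]+?)\s*\)")

def parse_relations(text):
    """Return a list of (deprel, head, dependent) triples found in text."""
    return [m.groups() for m in REL.finditer(text)]

# The analysis with the promoted determiner, copied from the comment above.
analysis = (
    "obl(opgesplitst, Vlaamse) case(Vlaamse, in) det(Vlaamse, een) "
    "conj(Vlaamse, partij) cc(partij, en) det(partij, een) "
    "amod(partij, Franstalige)"
)
triples = parse_relations(analysis)

# True if any relation in the analysis is orphan.
uses_orphan = any(rel == "orphan" for rel, _, _ in triples)
```

With this analysis `uses_orphan` is false, which is the point of the reply: promotion of the determiner makes the orphan relation unnecessary here.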
Does nobody use orphan in comparisons? I am thinking of sentences such as "Today I received the same message as you yesterday".
@sylvainkahane : I believe orphan is used also in comparison in constructions like the one you mentioned. I think it has been discussed somewhere but I do not see the example directly in the guidelines.
A follow-up to this one: Working on the PROIEL/TOROT conversion (with @daghaug) we have cleaned up our processing of elliptic structures considerably, but we are still getting orphan error messages for various types of clausal heads with orphans.
Right now we see that elliptic headless relative clauses that are themselves subjects (not modifying any nominal) trigger the orphan error message "The parent of 'orphan' should normally be 'conj' but it is 'nsubj'", such as in this Latin example:
qui multum non abundavit et qui modicum non minoravit "(The one) who (gathered) much did not have too much, and (the one) who (gathered) little did not have too little" where we now get nsubj(abundavit,qui) and orphan(qui, multum) and ditto in the second part of the sentence.
Currently, then, such argument relative clauses are nsubj and obj in our conversion, but the error message made us wonder if they should actually be csubj and ... ccomp? As far as I can see the guidelines just assume that relative clauses always modify a nominal. What do you say, @dan-zeman?
qui multum non abundavit et qui modicum non minoravit
I agree based on the Latin example that parent of orphan should also be allowed to be root (and therefore possibly also parataxis, although that is not necessary for this example)
the error message made us wonder if they should actually be csubj
I think it should be csubj if the following version would also have been csubj in the Latin guidelines:
qui collegit multum non abundavit
"the one who collected much did not have too much"
root(abundavit) csubj(abundavit, collegit) nsubj(collegit, qui)
In that case, "qui" is just being promoted to cover for the missing relative clause subject (at least I assume that's how it would be annotated, but if that is not the case in the Latin guidelines and "qui" is seen as a matrix argument, then the elliptical version is also not a clause).
A version with a non-elliptic relative clause would be analysed as follows in our current version of the conversion script:
root(abundavit) nsubj(abundavit, collegit) nsubj(collegit, qui) obj(collegit, multum)
And surely the elliptic clause must go the same way. I'm not sure what the other Latin treebanks do (do you know, @daghaug?), but wanted to check if there is a general UD policy for headless relative clauses.
In any case, I think any clausal head type must allow orphan dependents, since ellipsis is in principle always possible, so if nsubj is allowed for headless relative clauses, nsubj must allow orphan dependents, if it must be csubj then csubj must allow orphan dependents etc.
To make sure I understand, an attempt at an English analogy:
Is this right? It does seem like a valid use case, since we wouldn't normally promote a subject or object as head of a clause when the predicate is missing. Though it is a pity we can't see that there's a free relative construction in the orphan analysis.
nsubj(abundavit, collegit)
This seems a bit strange to me, since collegit is a verb, so I would have expected csubj
Is this right?
The English translation seems basically equivalent, except that in Latin we have a plain relative pronoun "qui", which is basically like "who" rather than "whoever". So this is more like Shakespeare's "who steals my purse steals trash", with "steals" elided in the first conjunct of a coordination.
This seems a bit strange to me, since collegit is a verb, so I would have expected csubj
Yes, I can see why. But if we do that, the next question is what to do with object relative clauses, which also occur aplenty. Should they be ccomp? Our converter now has them as obj. (They can of course have ellipsis too.)
reddite ergo quae Caesaris sunt Caesari
give thus [which.nom.pl Caesar's are] to-Caesar
So in the PROIEL annotation the assumption is that free relative clauses are syntactically nominal because they distribute exactly like NPs and not like clauses.
In subject position, the csubj/nsubj distinction is maybe not so important, but free relative clauses occur in other nominal positions as well.
When they occur in object position, we would presumably have to label them ccomp if we take them as clausal. Perhaps not a disaster, but it would definitely give the impression that some verbs can take complement clauses when in fact they only take NPs (and free relative clauses).
Probably the most disturbing case would be the one where the free relative clause is the complement of a preposition as in
videbunt in quem transfixerunt 'they will look at the one they pierced' (literally 'they will look at whom they pierced')
This is obl(videbunt, transfixerunt) obj(transfixerunt, quem) case(transfixerunt, in)
If we treat free relative clauses as clausal, I guess it would have to be advcl? And the preposition would have to be considered mark?
Oh, I see the problem. The free relatives are treated as clauses lacking a nominal head, which is different from how we treat them in English: https://universaldependencies.org/en/dep/acl-relcl.html#free-relatives
Is it an option to treat the WH-word serving as subject as the head of the clause, and indicate the subject relation in the Enhanced Dependencies? So
nsubj(abundavit, qui) acl:relcl(qui, collegit) E:nsubj(collegit, qui) - enhanced dependency
I think either analysis is possible, and I understand the pros and cons. If this is the normal and only way to do free relatives in Latin, then my gut feeling is that what Nathan is suggesting makes the most sense. We had some similar thoughts in Coptic, but that language is more like English in that most free relatives have an explicit nominalization (something like "the one who"), and the examples with a plain relativizer (something like "who", except it's an indeclinable relativizer) are more rare, so we made those take clausal deprels. But canonically, yes, I would expect free relatives to take nominal deprels, among other things for the reasons Dag outlined above.
That's right, they are treated differently. The reason is that the case of the relative pronoun is governed by its function inside the relative clause. So if it's a downstairs object it would be accusative, as in "quem vidi, venit" (literally 'whom I saw arrived'), and it would be strange to take this accusative pronoun as the subject of "venit" (arrive) rather than the object of "vidi" (saw).
That said, we will preserve the original annotation in our source data, and we could give it up in the UD conversion for the sake of consistency if there were clear rules for how to deal with free relatives, but the web page does not exactly suggest that. Basically we are following the annotation suggested for Czech in (in the case where the demonstrative is elided).
Ah, yeah there's less of a case argument to be made for English since the who/whom distinction is disappearing, though technically "whoever saw me" vs. "whomever I saw" has the same issue I guess—case is assigned by the relative clause.
I looked at the validation script now, and the permitted head relations for orphans are currently conj, parataxis, root, csubj, ccomp, advcl, acl and reparandum. If we are to continue treating relative clauses like nominals, which I would prefer for the reasons @daghaug lists, a much wider range of relations would have to be permitted (or at least be exceptions for this type of language). Apart from this we also get real examples of ellipsis at least with xcomp and dislocated.
xcomp: We have occasional examples of the type "He wanted (to go) to Jerusalem on foot" where it's clearly not the modal verb that takes the PP argument and adjunct. Old East Slavonic example: xočem na smerdy i pogubiti ě 'we want (to go) after the peasants and kill them' (where an elliptic xcomp is coordinated with a non-elliptic one)
dislocated: We currently use dislocated for preposed correlative clauses, of the type "What he said, that we understood" and "Where you go, there we will follow", and these can of course be elliptic too. (We could do acl/advcl instead for these.)
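To make the constraint concrete, here is a minimal sketch of such a check. It is not the code of the validation script: the `Token` class and `check_orphan_parents` are hypothetical names, the permitted set is simply the list quoted above, and the advmod on "non" in the sample sentence is an assumption made for illustration.

```python
from dataclasses import dataclass

# Relations quoted in this thread as permitted on the parent of an orphan.
ALLOWED_PARENT_DEPRELS = {
    "conj", "parataxis", "root", "csubj", "ccomp", "advcl", "acl", "reparandum",
}

@dataclass
class Token:
    id: int
    form: str
    head: int      # 0 stands for the artificial root
    deprel: str

def check_orphan_parents(tokens, allowed=ALLOWED_PARENT_DEPRELS):
    """Return one message per orphan whose parent has an unexpected deprel."""
    by_id = {t.id: t for t in tokens}
    messages = []
    for tok in tokens:
        if tok.deprel.split(":")[0] != "orphan":
            continue
        parent = by_id.get(tok.head)
        if parent is None:
            continue  # missing head; a separate check would have to report it
        parent_rel = parent.deprel.split(":")[0]  # compare without subtype
        if parent_rel not in allowed:
            messages.append(
                "The parent of 'orphan' should normally be 'conj' "
                f"but it is '{parent_rel}' (token {tok.id}, '{tok.form}')"
            )
    return messages

# First clause of the Latin example, with nsubj(abundavit, qui) as converted.
sentence = [
    Token(1, "qui", 4, "nsubj"),
    Token(2, "multum", 1, "orphan"),
    Token(3, "non", 4, "advmod"),   # assumed relation, for illustration only
    Token(4, "abundavit", 0, "root"),
]
messages = check_orphan_parents(sentence)
```

Under these assumptions the nominal analysis yields one message for "multum", while the same tree with a clausal relation on "qui" yields none. Widening the rule for xcomp or dislocated would amount to adding them to the set passed as `allowed`.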
Can I bring this to the attention of @dan-zeman because we need to know how to deal with this in the conversion?
The issue is that the validation rules for orphan enforce particular analyses on other constructions. So for the free relative clauses, we can make them nominal, but only if we take the wh-word as the head, or we can leave the wh-word where it belongs for case reasons, but only if we make them clausal. If there was a UD standard, we'd be happy to go either way, but as long as there isn't, we would really prefer to keep our analysis as is. (I could also give arguments for it, but that's really for somewhere else. I think these are nominalizing constructions, in much the same way as morphology can be nominalizing.)
So if the validation rules are not going to change, I think the best solution for us might be to take these sentences out of the converted data set until the status of free relative clauses (and the modal constructions and the correlatives, as mentioned by Hanne) is clarified. But it would be good to know soon what we should do...
There is currently no UD-wide consensus on free relatives as far as I know, and perhaps they should stay language-specific. As you have noticed, the perspective we take in Czech is different from what people do with the English data.
qui multum non abundavit et qui modicum non minoravit "(The one) who (gathered) much did not have too much, and (the one) who (gathered) little did not have too little" where we now get nsubj(abundavit,qui) and orphan(qui, multum) and ditto in the second part of the sentence.
If we assume that there are two nodes elided in each clause, 1. "the.one", and 2. "gathered", and if we also assume that this does not qualify as (similar enough to) gapping, then qui will be first promoted to the head of the relative clause (thus acquiring the acl:relcl relation) then further promoted to the place of "the.one" (thus acquiring nsubj). Multum will be attached to qui as obj. Because of the missing verb in the relative clause, you get qui in the main clause even without treating free relatives as in English (while if "gathered" was present, it would be the promoted node and you would have a verb attached as nsubj in the main clause). However, if you do treat free relatives as in English, then you already have qui in the main clause without promotion. The verb "gathered" would be attached to it as acl:relcl. The verb is not present though, and there is only one orphaned dependent, which will be promoted and inherit the relation, i.e., you get acl:relcl(qui, multum). No orphan relation will be used.
Now getting back to the first option where we did obj(qui, multum), assuming that this is not gapping. "Gathered" is a verb and qui and multum are its subject and object, respectively, which makes it similar to the situation that led us to define the orphan relation for gapping. Yet it is different from gapping because there is no indication in this or the neighboring sentences that the missing verb is "gathered" – that seems purely hypothetical, based on semantics or pragmatics (while gapping is closer to syntax: you simply do not repeat the verb that is overtly present in another conjunct). If you ever add enhanced representation to the corpus, you should be ready to insert an empty node representing "gathered" and make the two nominals its arguments.
I don't know which of the options outlined above is the best one. But the double ellipsis and double promotion in this example suggests that almost anything can be the head of an orphan relation, and the validation test may have to be abandoned. (I introduced it because people misunderstood orphan as a general remedy that they should use every time they sense ellipsis around — while in fact it should be used only in a very restricted subset of ellipsis.) Or perhaps the test should be reclassified from an error to just a warning?
Thank you, Dan! In the original PROIEL annotation this sentence does of course have empty nodes with argument dependents, that is the point of departure for our conversion.
I think it might be nice to reclassify the test as a warning, we certainly found a lot of issues with our ellipsis handling because of those error messages.
I am sorry to come late here, but I missed that this topic also touches upon some issues in Latin annotation that we addressed in the past months (regarding IT-TB, LLCT and UDante treebanks).
@hanneme @daghaug , I invite you to take a look at the documentation pages that I wrote for free relative clauses in Latin. I think that they were not already there in April, but they appeared soon after (we had some internal discussion in our group).
Basically, following general UD criteria, we are using clausal relation (csubj, ccomp/xcomp, advcl) for "free relatives". The "double pronoun" is always an internal argument of these clauses (as commented in the guidelines, we were acting differently, but doing otherwise created weird, hardly justifiable structures - and I am actually convinced this is valid in general, not only language-specifically). Now, all these relations also take the :relcl subrelation to distinguish/retrieve them (and thus please note that advcl:relcl means a rather different thing in Latin than in English).
So, taking your sample sentence
qui multum non abundavit et qui modicum non minoravit
the annotation will be as follows:
csubj:relcl(abundavit, qui)
orphan(qui, multum)
qui is internally promoted as the head in that it is the subject.
I think the validator would not complain here, would it? Or does it just issue a warning?
I understand the point that these clauses are acting nominally and very much agree that they should be able to take nominal relations, and think that this should be the future direction for UD's guidelines, but for the moment this is a sensible compromise. In your conversion, probably it is easy to convert an nsubj relation into csubj:relcl etc. if it points to a predicate.
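A conversion step of the kind suggested above could look roughly like the sketch below. The mapping table and the function name are hypothetical: the comment only names csubj, ccomp/xcomp and advcl as the clausal relations that receive :relcl, so which nominal relation maps to which clausal one is an assumption here, and deciding whether a node actually heads a clause (including a clause with an elided predicate) is left to the converter.

```python
# Hypothetical nominal-to-clausal mapping; adjust to the treebank's own rules.
NOMINAL_TO_CLAUSAL = {
    "nsubj": "csubj:relcl",
    "obj": "ccomp:relcl",
    "obl": "advcl:relcl",
}

def relabel_free_relative(deprel, heads_clause):
    """Return the clausal :relcl label if the node heads a free relative.

    heads_clause is supplied by the caller, e.g. from the source annotation;
    a part-of-speech test alone would miss promoted heads such as 'qui'.
    """
    if heads_clause and deprel in NOMINAL_TO_CLAUSAL:
        return NOMINAL_TO_CLAUSAL[deprel]
    return deprel
```

For the sample sentence, `relabel_free_relative("nsubj", True)` gives the csubj:relcl label shown above, and relations outside the table pass through unchanged.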
In Slovenian we seem to have found a case where an orphaned element also exhibits ellipsis of the clausal head. This leads to an orphaned element attaching itself to another orphaned element and triggers the validation warning "The parent of 'orphan' should normally be 'conj' but it is 'orphan'".
The example in Slovenian is given below, with an added English equivalent. (The verbs in [square brackets] are added in English to emphasize the words that are not present in the original Slovenian sentence.)
Prav je, da so za tak dogodek zaprli cesto, saj če jo za vsako kolesarsko dirko, jo lahko tudi za četvorko.
It is right that they closed the road for such an event, since, if they [close] it for every bicycle race, they can also [close] it for a dance event.
Both the main clause of the second conjunct as well as its clausal dependent (the if clause, which would normally be advcl) lack a verb. Thus, we analyze this as orphan(jo-17, jo-12) and orphan(jo-12, dirko) (in English this would correspond to orphan(it-26, it-17) and orphan(it-17, race) with the obj being promoted to the role of clausal head in the former case). Here is a representation of the analyzed structure in Slovenian:
There is no other option than to mark the direct object as the promoted clausal head and use the orphan relation, so we believe the validation script should not produce a warning in this case. Note that lahko is a modal adverb that functions in a similar way to the auxiliary verb can in English. However, it is formally not an auxiliary and always receives the advmod dependency relation, thus it cannot function as clausal head without creating misleading dependencies.
This is an interesting example, maybe we should show it somewhere in the guidelines. I agree with your analysis. That is why what the validator produces is a warning and not an error. (Warnings do not make your treebank invalid.)
It probably still makes sense to issue the warnings because cases like this are rare. And the validator can hardly know that this sentence is different from other cases where people attempt to chain two orphan relations. (Actually it might be possible to infer from enhanced dependency representation if it were present; but it is not available for Slovenian as far as I know.)
In this specific case, since this is an elliptic adverbial clause inside a main elliptic clause, why can advcl not be used (between the two jo)? The two "orphanhoods" are each in their own clause.
In this specific case, since this is an elliptic adverbial clause inside a main elliptic clause, why can advcl not be used (between the two jo)? The two "orphanhoods" are each in their own clause.
Because the guidelines say that orphan should be used. The complete (pre-ellipsis) upper clause contains one obj (jo), one obl (tudi za četvorko), and one advcl (če jo za vsako kolesarsko dirko). In the obliqueness hierarchy in the guidelines, the order of these three is obj > obl > advcl. Therefore, jo is promoted and the other two are attached to it as orphan.
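The promotion step described above can be sketched as follows. This is an illustration only: `promote` is a hypothetical helper, and the default hierarchy contains just the three relations whose order is stated in this comment (obj > obl > advcl); the complete hierarchy is in the guidelines and would be passed in as `hierarchy`.

```python
# Partial order taken from the comment above; not the complete hierarchy.
HIERARCHY = ["obj", "obl", "advcl"]

def promote(dependents, hierarchy=HIERARCHY):
    """Pick the promoted dependent of an elided predicate.

    dependents: list of (form, deprel) pairs the predicate would have had.
    Returns (promoted_form, attached), where attached lists the remaining
    dependents with their relation to the promoted node. Dependents whose
    relation is in the hierarchy become orphan; others keep their relation.
    """
    rank = {rel: i for i, rel in enumerate(hierarchy)}
    candidates = [i for i, (_, rel) in enumerate(dependents) if rel in rank]
    if not candidates:
        raise ValueError("no dependent with a relation in the hierarchy")
    best = min(candidates, key=lambda i: rank[dependents[i][1]])
    attached = [
        (form, "orphan" if rel in rank else rel)
        for i, (form, rel) in enumerate(dependents)
        if i != best
    ]
    return dependents[best][0], attached

# Upper clause of the Slovenian example; jo-12 stands for the advcl,
# which is itself represented by its promoted head.
head, attached = promote([("jo-17", "obj"), ("četvorko", "obl"), ("jo-12", "advcl")])
```

Here `head` is "jo-17" and both remaining dependents come out as orphan, which reproduces orphan(jo-17, jo-12) from the analysis under discussion.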
OK, I see this now.
Thanks for pointing me to this, I might have to revise some things... but at the same point, I find there is something problematic about it, but it goes beyond this topic.
Hi - a recent update to the validator creates the error message in the title, however in the Coptic corpus we have an exception that looks correct to me: a reparandum consisting of two dependents whose head is missing. I'll give the example in English for simplicity:
The alternative of saying that both 'they' and 'have' are reparandum is unappealing, because there is a whole interrupted phrase ("they have") which results in a single repair. The option of treating 'they' as the subject of 'have' is not available in Coptic, since the equivalent of have is a past auxiliary which never takes the subject directly (it is aux). Basically there is a missing verb that would have dominated "they have", so in its absence we've promoted the subject, and treated the auxiliary as an orphan.
Any suggestion is appreciated, but if there isn't a good reason to reject orphan I would suggest allowing reparandum as a parent of orphan.