biblicalhumanities / greek-new-testament

Greek New Testament
Treedown conventions for split constituents #24

Open jonathanrobie opened 7 years ago

jonathanrobie commented 7 years ago

One obstacle to using sentence order in all sentences is finding the right way to represent the result in Treedown. I would like to propose some Treedown conventions to make that possible. Here are some cases I have noticed so far; I am sure there are others that I have missed.

I think we need to handle at least two distinct kinds of discontinuity. Postpositive conjunctions, which are frequent, imply a discontinuity both of sequence and of hierarchy. Most other instances seem to involve discontinuity of sequence only. I suggest we use two different conventions to distinguish these cases.

Postpositives (discontinuity in both hierarchy and sequence)

In the current trees, postpositives are represented out of sequence, like this:

*δὲ 
  v ἦσαν 
  p δίκαιοι 
  s ἀμφότεροι 
  adv ἐναντίον τοῦ Θεοῦ, 
  adv
    v.part πορευόμενοι 
    adv ἐν πάσαις ταῖς ἐντολαῖς καὶ δικαιώμασιν τοῦ Κυρίου 
    adv ἄμεμπτοι.

I would like to change this using two equivalent conventions, one primarily intended for easier editing, the other for clearer display. When marking down postpositives by hand, a postpositive can be flagged by prepending ** to it, as in this example:

v ἦσαν 
** δὲ 
p δίκαιοι 
s ἀμφότεροι 
adv ἐναντίον τοῦ Θεοῦ, 
adv
  v.part πορευόμενοι 
  adv ἐν πάσαις ταῖς ἐντολαῖς καὶ δικαιώμασιν τοῦ Κυρίου 
  adv ἄμεμπτοι.

This is equivalent to the following display-oriented representation:

*
  v ἦσαν 
  * δὲ 
  p δίκαιοι 
  s ἀμφότεροι 
  adv ἐναντίον τοῦ Θεοῦ, 
  adv
    v.part πορευόμενοι 
    adv ἐν πάσαις ταῖς ἐντολαῖς καὶ δικαιώμασιν τοῦ Κυρίου 
    adv ἄμεμπτοι.

If the discontinuity splits a constituent, the label of the constituent is repeated:

s Αὐτὸς 
** δὲ 
s ὁ Ἰωάνης

Which is equivalent to:

*
   s Αὐτὸς 
   * δὲ 
   s ὁ Ἰωάνης
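The equivalence between the two conventions could be mechanized along these lines. This is only a rough sketch, not part of any existing Treedown tooling: the function name `to_display_form` is hypothetical, it assumes a two-space indent step, and it assumes the block passed in is a single clause whose `**` line sits among that clause's top-level constituents.

```python
def to_display_form(block: str, indent: str = "  ") -> str:
    """Rewrite the editing convention (`** word`) as the display convention.

    Assumption: `block` holds one clause, and any `**` line in it belongs
    to that clause as a whole.
    """
    lines = [ln.rstrip() for ln in block.splitlines() if ln.strip()]
    # Without a `**` flag there is nothing to convert.
    if not any(ln.lstrip().startswith("**") for ln in lines):
        return "\n".join(lines)

    out = ["*"]  # bare `*` node that wraps the whole clause
    for ln in lines:
        stripped = ln.lstrip()
        lead = ln[: len(ln) - len(stripped)]  # existing indentation
        if stripped.startswith("**"):
            # `** δὲ` becomes `* δὲ`, kept in sentence order
            stripped = "* " + stripped[2:].lstrip()
        out.append(indent + lead + stripped)
    return "\n".join(out)


print(to_display_form("s Αὐτὸς \n** δὲ \ns ὁ Ἰωάνης"))
```

Applied to the split-subject example, this prints the wrapped form with `* δὲ` left in place between the two `s` lines (indented by two spaces rather than three).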

If two distinct constituents share a label, they are distinguished using numbers: s.1, s.2, etc.

Simple discontinuity (discontinuity of sequence only)

When discontinuity involves sequence only, as in split focus or some uses of enclitics, the word or constituent is flagged with a * attached to its label (v* and *vc in the examples that follow):

o Ταύτην 
v* ἐποίησεν
o ἀρχὴν τῶν σημείων
s ὁ Ἰησοῦς
adv ἐν Κανὰ τῆς Γαλιλαίας

Εἰ 
  s υἱὸς 
  *vc εἶ 
  s τοῦ θεοῦ
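To make the marker concrete, here is a minimal sketch of how a tool might pick out the flagged constituents in a Treedown block. Everything in it is an assumption for illustration: the regular expression, the function name `starred_constituents`, and the rule that a label is a run of ASCII letters, digits and dots (as in `v.part` or `s.1`). It accepts the star on either side of the label, since both `v*` and `*vc` appear in the examples above, and it deliberately ignores the `**` postpositive flag.

```python
import re

# Hypothetical line pattern: indentation, an ASCII label with an optional
# single "*" before or after it, then the Greek text (if any).
LINE = re.compile(
    r"^(?P<indent>\s*)(?P<pre>\*)?(?P<label>[a-z][a-z0-9.]*)(?P<post>\*)?"
    r"(?:\s+(?P<text>.*))?$"
)


def starred_constituents(block: str) -> list[tuple[str, str]]:
    """Return (label, text) for every line flagged with a single `*`."""
    hits = []
    for line in block.splitlines():
        m = LINE.match(line.rstrip())
        # Lines without an ASCII label (e.g. a bare conjunction) do not match.
        if m and (m.group("pre") or m.group("post")):
            hits.append((m.group("label"), m.group("text") or ""))
    return hits


first = "o Ταύτην \nv* ἐποίησεν\no ἀρχὴν τῶν σημείων\ns ὁ Ἰησοῦς"
second = "Εἰ \n  s υἱὸς \n  *vc εἶ \n  s τοῦ θεοῦ"
print(starred_constituents(first))   # [('v', 'ἐποίησεν')]
print(starred_constituents(second))  # [('vc', 'εἶ')]
```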

jonathanrobie commented 7 years ago

Luke 1:10 illustrates a case not shown in the previous examples.

Here is the current representation, which does not maintain sentence order:

[image: current tree representation of Luke 1:10]

This involves two constituents, each interrupted by the other. Let's try this with the proposed notation:

καὶ 
   s  πᾶν τὸ πλῆθος 
   v* ἦν 
   s* τοῦ λαοῦ 
   v προσευχόμενον 
   adv ἔξω 
   adv τῇ ὥρᾳ τοῦ θυμιάματος.

I'm inclined to say that is clear enough, and simple enough given the complexity of the example.

jonathanrobie commented 7 years ago

I'm going forward with this approach. Leaving this open because (1) we do not have separate documentation for Treedown yet, and (2) the treebanks do not currently display discontinuity this way.