raffazizzi / TEI-TEST

This repo is for TESTING purposes only. Changes pushed here will be lost

element for text suppressed by the editor #46

Closed raffazizzi closed 9 years ago

raffazizzi commented 15 years ago

Another element is needed as the opposite of <supplied>, which an editor uses to supply text that is not found in the source. We propose an element <suppressed> to indicate superfluous words or letters, for example in the case of dittographies, where a scribe has repeated a word or phrase.

Original comment by: schassan

raffazizzi commented 15 years ago

At first blush, this seems like a real need, and a clever, parallel solution to it. But I always think that when a need is identified, it is worth asking if there is already a method within the current TEI Guidelines to represent it, before adding new constructs to the Guidelines.

So with that in mind, I thought about the example Torsten supplies (pardon the pun), dittographies, and I'm not sure I understand why a repeated word or phrase wouldn't be encoded with <sic> rather than a new <suppressed> construct. What is it about <sic> (or about dittographies) that makes <sic> unsuitable for encoding them? If <sic> is appropriate for repeated letters, words, or phrases, are there other phenomena for which a new <suppressed> might be appropriate where <sic> is not?

BTW — I think the example here is an argument in favor of type= on <sic>.

Original comment by: sydb

raffazizzi commented 15 years ago

We've been discussing exactly this option in the workshop (BTW, all items submitted by me "this round" are workshop items!), but as <sic> should be paired with another element within <choice>, usually to form sic/corr pairs, this would leave us with an empty <corr> in case the encoder wants to leave something out. Such an empty element does not have the same meaning as an explicit <suppressed>, and besides, it might not be good encoding style. Nor will a <sic> alone do the trick, as it only flags something to be taken notice of, and a @type would not make the intent as explicit as a dedicated element.

Original comment by: schassan

raffazizzi commented 15 years ago

(I am not sure what workshop you are referring to, but I don't think it matters at the moment :-)

I do not understand why an empty <corr> does not have the same meaning as the suggested <suppressed>. The WWP uses an empty <corr> to mean exactly this, I think. To wit: “the correction of the associated apparently erroneous passage of text is nothing, i.e. the empty string”.

And while it is slightly cumbersome, that does not make it a bad style of encoding, and the verbosity can be handled locally with special markup if deemed necessary.

Original comment by: sydb

raffazizzi commented 15 years ago

Would the requirements expressed at the SIG Workshop be satisfied by an additional attribute on <sic> -- say @fix with values "corr" (there is a corrected reading in a <corr> element), "suppress" (the reading is to be suppressed) or "none" (no correction or suppression)?

I must say I find the idea that <corr/> means <suppressed>xxxx</suppressed> somewhat unintuitive.

Original comment by: lb42

raffazizzi commented 15 years ago

> I must say I find the idea that <corr/> means <suppressed>xxxx</suppressed> somewhat unintuitive.

But it doesn't. <choice><sic>xxxxx</sic><corr/></choice> does.

Seems to me <suppressed>xxxx</suppressed> means “the editor (indicated by resp=, I presume) is indicating that for some processes the content ‘xxxx’ should be ignored”. Seems to me the same semantics are indicated with <choice><sic>xxxxx</sic><corr/></choice>, which means “the editor (indicated by resp=) is indicating that for some processes the content ‘xxxx’ should be replaced by the content ‘’”.

Seems to me if <suppressed> (or <suppress>) is added to the Guidelines, it would be as syntactic sugar for <choice> with <sic> and empty <corr>. (This may not be a bad thing, of course. And the idea that <sic> would have a fix="suppress" may be more coherent syntactic sugar.)
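
To make the "replaced by the empty string" reading concrete, here is a minimal sketch in Python using the standard-library xml.etree.ElementTree. The helper name corrected_reading and the <w> wrapper are illustrative assumptions, namespaces are omitted, and this is only one way a processor might choose to read <choice>, not a statement of what the Guidelines require.

```python
import xml.etree.ElementTree as ET

def corrected_reading(elem):
    """Flatten elem to text, taking the <corr> branch of every <choice>."""
    if elem.tag == "choice":
        # Only <corr> contributes; an empty <corr/> contributes "".
        return "".join(corrected_reading(c) for c in elem if c.tag == "corr")
    parts = [elem.text or ""]
    for child in elem:
        parts.append(corrected_reading(child))
        parts.append(child.tail or "")
    return "".join(parts)

# Illustrative fragment: the repeated "eph" is replaced by nothing.
word = ET.fromstring("<w>Eleph<choice><sic>eph</sic><corr/></choice>ant</w>")
print(corrected_reading(word))  # prints: Elephant
```

Note that a bare <sic> outside <choice> is left in place by this sketch, which is exactly where the two encodings diverge.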

Original comment by: sydb

raffazizzi commented 15 years ago

Why do we have an element <supplied> at all if this is syntactic sugar for <choice> <sic/> <corr>xxxxx</corr> </choice>?

Is the discussion all about avoiding the invention of a new element? In that case we should remove <supplied>. ;-) To be serious: You might disagree with my statement that an empty element might not be good style of encoding, but this was the "result" of a discussion on tei-ms-l: that <choice><sic>xxx</sic><corr/></choice> is not equal to <suppressed>xxx</suppressed>.

PS @Syd: I am talking about the Manuscript SIG meeting in London, where we used one of the sessions not only to discuss things but to submit "results" immediately as SourceForge items.

Original comment by: schassan

raffazizzi commented 15 years ago

> Why do we have an element <supplied> at all if this is syntactic sugar for <choice><sic/><corr>xxxxx</corr></choice>?

Ah, good question. I think the answer is because it isn't. The above construct is what I (the editor) use if I believe there is an apparent error that consists of a missing “xxxxx”. The construct <supplied>xxxxx</supplied> is what I use if I wish to insert an “xxxxx” for some other reason, typically because my source is illegible due to being damaged, a poor photocopy, or miserable handwriting, or some such.

So one could argue that <supplied reason="apparent error">xxxxx</supplied> is syntactic sugar for <choice><sic/><corr>xxxxx</corr></choice>, but that doesn't let us eliminate any of the involved elements, and I don't think it makes the encoding system simpler, myself.

> Syd: I am talking about the Manuscript SIG meeting in London …

Thanks! Sorry I missed that meeting, but can only be in one place at a time. Sigh.

Original comment by: sydb

raffazizzi commented 15 years ago

How to proceed with this? I see (at least) these possibilities:

  1. We agree that <supplied> should have a counterpart and invent <suppressed>.
  2. We disagree on 1. because we think sic/corr does the trick. In that case we extend the documentation accordingly to have <choice><sic>xxx</sic><corr/></choice> instead.
  3. We disagree on 1. but neither does sic/corr fulfill the need. In that case we add attributes to sic/corr to make more explicit what is meant, such as @reason and att.editLike for <sic>, maybe even the @fix proposed by Lou? I wonder if in this case it would be possible to have other attributes on <corr> in order to make <supplied> superfluous.

Any more obvious possibilities I overlooked?

PS: Sorry for the double entries, but reloading the tracker overview right after submitting a comment adds the same comment again. Is there a possibility to delete double entries?

Original comment by: schassan

raffazizzi commented 15 years ago

Original comment by: lb42

raffazizzi commented 15 years ago

In EpiDoc (P4) we have been using <sic> for erroneously engraved letters (that the editor believes should be entirely suppressed). This involves the assumption that sic alone has a slightly different meaning from sic inside of choice and alongside corr, which is maybe a bit clunky. An element <suppress> or <superfluous> would be nice to improve this.

(For one thing, it would free up <sic> alone for the other use that seems intuitive for it: text that is messed up but the editor chooses not to correct.)

I vote -1 on empty <corr/> by the way. For the same reason that empty <sic/> would not be a nice way to handle <supplied reason="omitted">...

Original comment by: gabrielbodard

raffazizzi commented 15 years ago

Is this a special kind of <gap> maybe?

Original comment by: nobody

raffazizzi commented 15 years ago

FWIW, I'd vote to use <sic> with @reason='superfluous'. The intended use after all fits exactly the definition of sic: 'contains text reproduced although apparently incorrect or inaccurate'. Stylesheets can then decide when to show the superfluous text, and an empty <corr> or @fix are unnecessary.
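
As a rough illustration of "stylesheets can then decide", here is a minimal Python sketch using the standard-library xml.etree.ElementTree. The function name render, its show_superfluous flag, and the <p> fragment are assumptions made for this example, and @reason on <sic> is the proposal made in this comment rather than something confirmed in the schema.

```python
import xml.etree.ElementTree as ET

def render(elem, show_superfluous=True):
    """Flatten elem to text, optionally hiding <sic reason="superfluous">."""
    if (elem.tag == "sic"
            and elem.get("reason") == "superfluous"
            and not show_superfluous):
        return ""  # the parent still appends this element's tail text
    parts = [elem.text or ""]
    for child in elem:
        parts.append(render(child, show_superfluous))
        parts.append(child.tail or "")
    return "".join(parts)

# Illustrative dittography: the scribe wrote "the" twice.
para = ET.fromstring('<p>the <sic reason="superfluous">the </sic>cat</p>')
print(render(para, show_superfluous=True))   # prints: the the cat
print(render(para, show_superfluous=False))  # prints: the cat
```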

<suppressed> seems a misnomer, because the text is still in the xml. <suppress> would be more correct.

Original comment by: pboot

raffazizzi commented 15 years ago

If gap had been called omitted, would this problem exist? It seems to me that the description of gap fits what we mean here, and it is the janus pair of supplied. The problem is that the natural language semantics of gap suggest a lack of agency.

Original comment by: sf_user_dpod

raffazizzi commented 15 years ago

>It seems to me that the description of gap fits what we mean here ...

<gap> is empty, except for glossLike elements. Is that the way the proposed <suppressed> element is going to be used? If so, <gap> might be appropriate. But it is not how I understood the preceding discussion.

Original comment by: pboot

raffazizzi commented 15 years ago

I think that <gap/> is the Right Answer (tm). The trouble is that people will want to put some content in it, despite the absurdity of transcribing something and then marking it as having not been transcribed.

Original comment by: lb42

raffazizzi commented 15 years ago

Parallel with <supplied> is accepted, more or less. <sic> is for text reproduced though considered incorrect or inaccurate; <gap> is for text which is suppressed by the editor; <omit> is for text reproduced but which the editor thinks should be suppressed. Some discussion about possible need for a "meet to be deleted" element, e.g. in anonymizing transcription. To be referred for further discussion.

Original comment by: nobody

raffazizzi commented 14 years ago

Suppressed text: summary of position

https://sourceforge.net/tracker/?func=detail&aid=2242434&group_id=106328&atid=644065

The request is for an element to indicate text that the editor wants to mark as superfluous in the source text. It is perhaps unhelpful to compare this to "supplied", since it is a different kind of editorial intervention. (Supplied text is restored by the editor to indicate damage, error, or some other cause of omission of original text--it is relatively agnostic as to why the text was lost/omitted.)

In my field, this criterion of editorial markup is clearly recognised: the Leiden Conventions use curly braces to represent this phenomenon ("litterae errore adiectae quas editor expunxit", e.g. dedika{ra}runt [citing Panciera 1980]). Currently EpiDoc, which aims to match Leiden concepts 1-to-1 with TEI patterns, recommends the use of "sic" for this distinction (see http://www.stoa.org/epidoc/gl/dev/erroneousinclusion.html, which is somewhat out of date but current for this purpose).

Elena supplies a different kind of example:

<l n="4">a darmi morte, poi m'avete preso <omit>a tradimento</omit></l>
<l n='5'>sì com' l'uccellator prende l'uccello</l>
<gap/>
<l n="43">e lettere dintorno che diriano <omit>in questa guisa</omit></l>
<l n="44">Più v'amo, dëa, che non faccio Deo</l>

Where the text marked here with "omit" is identified by the editor (Contini 1960) as interpolated and doesn't fit in the meter; it is marked in the printed text with a smaller font. This is a clear case where "sic" is not appropriate, since it is not athetized by the editor because of scribal error, but for another reason. Nevertheless, it seems preferable to mark these two things with the same element.

"Sic" alone is not unambiguous, despite the EpiDoc recommendation cited above:

Eleph<sic>eph</sic>ant or Eleph<choice><sic>eph</sic><corr/></choice>ant

could mean either that the second "eph" is (a) included in error, and marked as superfluous by the editor, or (b) is marked by the editor as an error, but with no statement as to what the correct form should be. This situation could be solved with an attribute (which wouldn't solve the case with interpolated verses), or a new element (which would).

A final comment/recommendation re the element name: this is not a "gap" (which marks text missing from the edition, and is therefore an empty element), but rather an editorial intervention marking text as superfluous. I don't much like the name "omit", despite its claimed parallelism with "supplied", since it seems to be a processing instruction rather than a description of what is in the text. (As noted above, "supplied" is a description of how this text gets into the edition when it is not in the source text. Our superfluous text may not be omitted: it appears in a smaller font in the verse example above, and in curly braces in the Leiden example.) I would prefer something with the semantics of "superfluous", rather than "omit", "suppress", "expunxit"/"expunged" or the like. Might <superfl> be a bit less of an eyeful?

Notes:
Panciera 1980 = Hans Krummrey & Silvio Panciera, 'Criteri di edizione e segni diacritici', Tituli 2 (1980), 205-215.
Contini 1960 = Poeti del Duecento, ed. Gianfranco Contini. Milano-Napoli: Ricciardi I, 155-64.

Original comment by: gabrielbodard

raffazizzi commented 14 years ago

Implemented at version 6933, using the name <surplus> for the element
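
For readers coming from the Leiden example earlier in this thread, here is a minimal Python sketch (standard-library xml.etree.ElementTree) of how <surplus> content might be displayed in curly braces. The function name leiden and the <w> wrapper are assumptions for illustration, namespaces are omitted, and this is not taken from any EpiDoc or TEI stylesheet.

```python
import xml.etree.ElementTree as ET

def leiden(elem):
    """Flatten elem to text, wrapping <surplus> content in curly braces."""
    inner = (elem.text or "") + "".join(
        leiden(child) + (child.tail or "") for child in elem
    )
    return "{" + inner + "}" if elem.tag == "surplus" else inner

# The Leiden example cited above, encoded with the new element.
word = ET.fromstring("<w>dedika<surplus>ra</surplus>runt</w>")
print(leiden(word))  # prints: dedika{ra}runt
```

A processor that prefers to drop the superfluous letters instead would return an empty string for <surplus> rather than bracing it.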

Original comment by: lb42

raffazizzi commented 14 years ago

Original comment by: lb42