erc-dharma / tfc-nusantara-epigraphy

DHARMA project task force C, Nusantara epigraphic corpus
https://dharma.hypotheses.org/
Creative Commons Attribution 4.0 International

revising Sanga.xml #21

Closed arlogriffiths closed 4 years ago

arlogriffiths commented 4 years ago
danbalogh commented 4 years ago

I'm citing the questions here and responding one by one. But foremost, please read EG §5.4/Massive lacunae and §5.4/Lost copper plates - a lot of these questions are already answered there. We can do things differently if you don't like what the EG says there, but I have spent literally ages figuring out exactly what to do in these complex cases (and you did at least check and OK that at one stage), and if we change even just one detail, I'm afraid that a number of other changes will also be required. So unless you really dislike what the EG says, let's stick with that.

the preceding plate is unavailable, and the text starts in the middle of the word lavan·. Questions:

  1. Do we need to apply break="no" in pb and lb in such a case of an inscription starting in the middle of a word?

Good point, not previously covered by the EG. My intuition is that we should, but I don't insist. I think our main purpose with @break is to manage display. But who knows if we or someone might want to do something else with it one day. If you agree, I'll add this to the EG under both massive lacunae and lost copper plates.

  2. Can we use <supplied> to restore at least the syllable la?

I would rather we did not, but it is OK if you feel you must. This is one of the reasons why I spent ages writing and rewriting those EG sections. The only way imo to properly restore such things as <supplied> is to reconstruct <pb> and <lb> elements for the adjacent lost page, which leads to all manner of problems with page and line numbering, and a lot of code that we don't really need. For this specific case, see EG §5.4/Lost copper plates, near the top of that subsection, the point beginning with "lost pages do not as a rule need to be reconstructed in your edition", its subpoints, and the next top-level point. For the general idea, see EG §5.4/Massive lacunae, near the middle of that subsection, the point beginning with "when according to the above instructions it would be necessary to create a structural element".

  3. May we, when the original serial number of a plate is unknown, choose a number other than "1" for the first plate? Either a non-number such as A, B, or an estimate of where the two available plates would have been positioned within the original set, i.e. possibly 3 and 4?

I very strongly suggest no. In principle, the EG allows you to use anything for a page number (EG §3.5/Genuine pages, point beginning with "if you have a good reason to do so, you may opt to use a different numbering scheme"). So if you think this is essential, go ahead. But to keep things consistent, I would prefer if you followed what the EG says for lost copper plates, i.e. assign the number "1" to the first encoded (extant or reconstructed) plate.

pra<choice><orig>s</orig><reg>s s</reg></choice>i<choice><orig>d</orig><reg>ddh</reg></choice>ayuga versus <choice><orig>prasidayuga</orig><reg>pras siddhayuga</reg></choice> Daniel, which of the two encoding options is to be preferred? — cf. EG p. 76

I have no clear answer to that; this is one of those real-life fuzzy issues that each encoder will have to decide on a case by case basis. Since the two items seem to be two unconnected orthographic issues, rather than a single cohesive dialect issue, I personally would probably prefer the first.
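To compare the two options concretely, here is a minimal Python sketch (not part of the DHARMA tooling) that flattens each encoding to its original and its regularised reading. The wrapping <w> root, the `reading` helper, and the absence of the TEI namespace are assumptions made only so that the fragments parse on their own.

```python
import xml.etree.ElementTree as ET

# The two encodings quoted above, each wrapped in a hypothetical <w> root
# so that it parses as a standalone, namespace-free XML fragment.
PER_LOCUS = (
    "<w>pra<choice><orig>s</orig><reg>s s</reg></choice>"
    "i<choice><orig>d</orig><reg>ddh</reg></choice>ayuga</w>"
)
WHOLE_PHRASE = (
    "<w><choice><orig>prasidayuga</orig>"
    "<reg>pras siddhayuga</reg></choice></w>"
)


def reading(elem, keep):
    """Flatten elem to text, keeping only the `keep` child of each <choice>."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == "choice":
            # Inside <choice>, take only the requested alternative.
            for alt in child:
                if alt.tag == keep:
                    parts.append(reading(alt, keep))
        else:
            parts.append(reading(child, keep))
        # Text following the child element belongs to the parent's content.
        parts.append(child.tail or "")
    return "".join(parts)


for label, xml in (("per locus", PER_LOCUS), ("whole phrase", WHOLE_PHRASE)):
    root = ET.fromstring(xml)
    print(label, (reading(root, "orig"), reading(root, "reg")))
```

Under these assumptions both fragments flatten to the same pair of readings, so the choice between them affects only how finely the interventions are marked.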

arlogriffiths commented 4 years ago

Dear Dan!

Thanks. I was not intending to propose any changes to EG.

  1. Yes, please add to EG. (You may want to ask Axelle if she sees any problem.)
  2. Fine, we shall only mention the restorations in apparatus.
  3. Alright, we'll stick to numbering 1, 2, etc. (It will just be a nuisance if the initial plates are ever rediscovered...)
  4. Is the fact that the first normalization straddles a word boundary (s >> s s) relevant?

@ekobastiawan : except for the last point, I think you now have what you need to produce a next revision of Sanga.xml

danbalogh commented 4 years ago

Arlo:

  1. Will do.
  4. I think it's a slight point in favour of the alternative encoding (single normalisation, the one you had in the comment). But I really don't think any clear-cut decisions can be made here, so just go for one and leave it at that. The machine won't care, since both methods yield the same pair of alternative texts. The only thing that may matter is what you want to see in the edition: a larger number of little marks that may be annoying to the human reader but will precisely identify at first glance where you've intervened; or a smaller number of marks, less annoying, but forcing the human reader to think a tiny bit while looking for the precise spot of intervention, and maybe missing one of the two in their jubilation after finding one.

ekobastiawan commented 4 years ago

Mas Arlo,

Sanga.xml has been revised.

arlogriffiths commented 4 years ago

Thanks @ekobastiawan . In the future, you may remove the <!-- argr ... --> comments after you have treated them. I have added one new one for you to take into account. Please also take into account @danbalogh 's response to the next question, and then close this discussion.

@danbalogh : could you please give your opinion on the question formulated in a comment to line 88 of the file? Do you see any difference between only flagging with <orig> in text, and then giving an <app> with a <note> such as I have done, and the alternative of encoding purely in-text?

danbalogh commented 4 years ago

Arlo, Eko: this is again a choice I cannot and do not want to make for you. If I was doing the encoding, I would probably prefer to encode all the works in the edition. (A single <choice> for the phrase, or two for the two affected loci in it: again not a matter that can be decided objectively, but I would favour the first.) I prefer that because it seems that your normalisation is straightforward (no question about what the normalised text is) yet not entirely self-evident. Encoding a choice in the edition will give the advantage that both the actual and the normalised text will eventually be searchable. Conversely, I would prefer merely flagging with <orig> in the following circumstances:

  1. without an apparatus entry, if it's a phenomenon that is very common and will be obvious to anyone familiar with the language/corpus (e.g. Sanskrit satva)
  2. with an apparatus entry if the normalisation is tentative or there are several alternatives.
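
The searchability point can be sketched in Python. This is a hypothetical illustration, not DHARMA code: `searchable_forms` is an invented helper, the <w> wrapper and the missing TEI namespace are simplifications, and the prasidayuga strings are reused from earlier in this thread purely as sample data.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample fragments: one merely flags the reading with <orig>,
# the other encodes the normalisation in-text with <choice>.
FLAGGED = "<w><orig>prasidayuga</orig></w>"
WITH_CHOICE = (
    "<w><choice><orig>prasidayuga</orig>"
    "<reg>pras siddhayuga</reg></choice></w>"
)


def searchable_forms(xml):
    """Return the set of text forms recoverable from the markup alone."""
    forms = set()
    for keep in ("orig", "reg"):
        tree = ET.fromstring(xml)  # fresh copy for each reading
        for choice in list(tree.iter("choice")):
            # Drop every alternative except the one we want to keep.
            for alt in list(choice):
                if alt.tag != keep:
                    choice.remove(alt)
        forms.add("".join(tree.itertext()))
    return forms


print(searchable_forms(FLAGGED))
print(searchable_forms(WITH_CHOICE))
```

With the bare <orig> flag only the reading as written can be recovered from the text; the normalised form would then live only in the apparatus note, which this sketch does not index.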