jcowey / P3

This is now a collection point for the Papyrological Publishing Platform (P3). To be found here are the files which are used to produce articles in the journal Pylon.

Encoding Suggestions #2

Closed by GusRiva 4 years ago

GusRiva commented 4 years ago

Here are three suggestions of possible encodings. Let me explain them.

Problems

  1. We need a mechanism to embed the edition (EpiDoc) in the article (heiEditions).
  2. We probably want an easy way to deliver a TEI EpiDoc file to Papyri.info and others, since EpiDoc is the community standard.
  3. It would be desirable to have a TEI file for the article that is relatively easy to convert to HTML and PDF. This means using markup that is as basic as possible and fully heiEditions-compatible.
  4. heiEditions and EpiDoc are not easy to combine in a way that achieves 1, 2 and 3. Especially in the metadata section, EpiDoc relies heavily on type attributes, which are more or less excluded in heiEditions. On the bright side, the transcription and edition of the texts do not need as many changes.

Solutions

As was already suggested, the edition could be included in the same file as the article, in a TEI element nested within the main TEI element. I was seduced by this idea at first, but now I think it is quite challenging: each TEI element would need its own schema (heiEditions or EpiDoc), which is technically impossible, as far as I'm aware. We could encode the text of the edition according to the heiEditions specifications, but then we would need to convert it to EpiDoc to share it with the community. That is a possibility, but it probably requires expanding heiEditions considerably and/or creating complex transformations.

This is why I suggest creating two TEI files, where the main article refers to and/or virtually includes the content of the edition. There are many ways to do this, which I examine in the three proposals. The article as an object would then be composed of the two files, which would ideally be kept in the same directory. All three methods require creating a div for the edition with a heiConcepts attribute, which I think Jakub has called EmbeddedEdition.

1. Attribute corresp

The div has the attribute corresp, which points to the TEI file of the edition. In a transformation, we would include the metadata, edition, translation and commentary in a heiEditions-compliant format. The transformation from EpiDoc to heiEditions is easier than the other way around because we only need a reduced version for the HTML and PDF, especially in the metadata section.
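To make method 1 concrete, here is a minimal Python sketch of what the article div could look like and how a pipeline step might find the linked edition. The file name edition.xml and the ana="hc:EmbeddedEdition" spelling of the heiConcepts attribute are assumptions made for illustration only; they are not taken from the heiEditions schema.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
NS = {"tei": TEI_NS}

# Hypothetical article file. The ana value and the file name are
# placeholders, not confirmed heiEditions conventions.
ARTICLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <div><p>Introduction of the article.</p></div>
    <div ana="hc:EmbeddedEdition" corresp="edition.xml"/>
  </body></text>
</TEI>"""


def find_edition_links(article_xml):
    """Return the corresp value of every div that carries one."""
    root = ET.fromstring(article_xml)
    return [div.get("corresp")
            for div in root.iterfind(".//tei:div[@corresp]", NS)]


links = find_edition_links(ARTICLE)
print(links)
```

A later transformation step would open each linked file and generate the heiEditions-compliant content for the HTML and PDF.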

2. Element ptr

This is basically the same as 1, but instead of the attribute corresp we use the element ptr. The disadvantage of this method is that ptr cannot be a direct child of div, so we need an extra wrapper element, such as ab or p, which would be deleted in the transformation for HTML and PDF.
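A rough Python sketch of method 2, under the same assumptions as before (hypothetical file name edition.xml and hypothetical ana="hc:EmbeddedEdition" marker): the ptr sits inside a wrapper ab, and the transformation records the target and then drops the wrapper.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
NS = {"tei": TEI_NS}

# Hypothetical article file: ptr is wrapped in ab because it cannot
# sit directly inside div.
ARTICLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <div ana="hc:EmbeddedEdition">
      <ab><ptr target="edition.xml"/></ab>
    </div>
  </body></text>
</TEI>"""


def strip_pointer_wrappers(article_xml):
    """Collect ptr targets and remove every ab that has a ptr child."""
    root = ET.fromstring(article_xml)
    targets = []
    # Materialise the div list first so removing children is safe.
    for div in list(root.iter(f"{{{TEI_NS}}}div")):
        for ab in list(div.findall("tei:ab", NS)):
            ptr = ab.find("tei:ptr", NS)
            if ptr is not None:
                targets.append(ptr.get("target"))
                div.remove(ab)
    return targets, root


targets, cleaned = strip_pointer_wrappers(ARTICLE)
print(targets)
```

The extra bookkeeping for the wrapper element is the cost of this method compared with method 1.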

3. Attribute source

Like 1, but the attribute source requires the inclusion of the actual content in the div. This content would be generated automatically through a transformation before publication.
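The pre-publication step for method 3 could be sketched in Python as follows. The file name, the ana marker, the sample EpiDoc content and the helper embed_edition are all illustrative assumptions. The sketch copies the EpiDoc divs unchanged, whereas a real pipeline would first adapt them to heiEditions (for example, by replacing the type attributes).

```python
import copy
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
NS = {"tei": TEI_NS}

# Hypothetical article and edition files held in memory for the example.
ARTICLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <div ana="hc:EmbeddedEdition" source="edition.xml"/>
  </body></text>
</TEI>"""

EDITION = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <div type="edition"><ab>line one <lb n="2"/>line two</ab></div>
    <div type="translation"><p>A translation.</p></div>
  </body></text>
</TEI>"""

FILES = {"edition.xml": EDITION}


def embed_edition(article_xml, load):
    """Copy the body divs of each source file into the div that names it.

    load is any callable mapping a source value to XML text.
    Adaptation from EpiDoc to heiEditions is deliberately omitted.
    """
    root = ET.fromstring(article_xml)
    for div in list(root.iterfind(".//tei:div[@source]", NS)):
        edition_root = ET.fromstring(load(div.get("source")))
        for part in edition_root.iterfind("./tei:text/tei:body/tei:div", NS):
            div.append(copy.deepcopy(part))
    return root


merged = embed_edition(ARTICLE, FILES.__getitem__)
embedded = merged.find(".//tei:div[@source]", NS)
print([child.get("type") for child in embedded])
```

The source attribute stays on the div, so the generated content remains traceable to the EpiDoc file it came from.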

Observations

  1. It could be possible to use the attribute corresp (method 1) in the same way as source (method 3), that is, including the edition in the adapted format within the body of the article.

  2. I think all methods could work; the main question is what the published TEI file should look like. We need to create a version of the file that includes the edition in the body of the article in an adapted format (as in method 3), but this could be just a step in our pipeline and does not need to be published. Methods 1 and 2 are best when we simply publish the article as two files with a link between them. Method 3 works best when we want to publish the version of the article that is the actual basis for the HTML and the PDF, as well as the EpiDoc file.

  3. The linking methods in these proposals would also work if both TEI elements are in the same file, but, as I said above, I do not think that is the best option.

  4. I personally like option 3, as the published object has all the information present in the article (and more) and uses the TEI elements and attributes in a very intuitive way.

  5. Why not the attribute copyOf? It is not impossible, but it is more difficult to apply in this case. According to the TEI: "Any content of the current element should be ignored. Its true content is that of the element being pointed at." (source). This means that the content of the target in the EpiDoc file would have to be included exactly as it appears there, without any modifications, which is not the case here.

I hope these suggestions are useful. I am aware that they might need to be improved and that there may be things I have not considered properly (like the previous work on transforming EpiDoc for the papyri in DWork). Thanks, and I would appreciate the feedback!