TEIC / Stylesheets

TEI XSL Stylesheets

docx2tei conversion needs to differentiate between list/@type and list/@rend #465

Closed rvdb closed 2 years ago

rvdb commented 3 years ago

I've noticed that the docx2tei conversion still outputs renditional features of lists as the list/@type attribute, e.g.:

<list type="ordered">
  <item>...</item>
</list>

Hence, this processing is out-of-sync with the Guidelines, which since release 2.7.0 differentiate between @type and @rend (see https://sourceforge.net/p/tei/bugs/460). The correct output for an ordered list should be:

<list rend="ordered">
  <item>...</item>
</list>

I'll propose a PR.
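For readers who want to see the shape of the change, here is a minimal Python sketch of the remapping described above. It is not the docx2tei XSLT and not the content of the PR; the function name `move_type_to_rend` and the `mapping` parameter are hypothetical, introduced only for illustration. The target value is supplied by the caller, since the rest of the thread discusses which value is appropriate.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix, so TEI-namespaced and bare elements both match."""
    return tag.rsplit("}", 1)[-1]


def move_type_to_rend(xml_text, mapping):
    """Move renditional list/@type values to list/@rend.

    `mapping` (hypothetical) maps an old @type value to the @rend value
    that should replace it. @type values not in the mapping are left alone.
    """
    root = ET.fromstring(xml_text)
    for el in root.iter():
        if local_name(el.tag) != "list":
            continue
        old = el.get("type")
        if old in mapping:
            del el.attrib["type"]
            el.set("rend", mapping[old])
    return ET.tostring(root, encoding="unicode")


sample = '<list type="ordered"><item>...</item></list>'
print(move_type_to_rend(sample, {"ordered": "ordered"}))
```

Values that are not in the mapping stay on @type, so any non-renditional list types are untouched.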

martindholmes commented 3 years ago

I think it should be rend="numbered" rather than ordered.

rvdb commented 3 years ago

@martindholmes, I see the TEI prose indeed uses numbered.

In that case (but unrelated to this issue), the jTEI customization should probably be aligned with this as well: it currently uses ordered for numbered lists, see https://github.com/TEIC/TEI/blob/dev/P5/Exemplars/tei_jtei.odd#L2486.

hcayless commented 3 years ago

👍 All lists are ordered. They can’t help it. 😁

sydb commented 3 years ago

(Although to be fair, the order may be unimportant to you, for various values of “you”.)

lb42 commented 3 years ago

While it is true that lists are inevitably ordered, the order of a list may or may not be meaningful.

martindholmes commented 3 years ago

@rvdb Yes indeed, we should fix that. What are the implications for older already-encoded JTEI articles? Does it matter at all? We can always leave processing in place for "ordered" alongside "numbered".
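Keeping processing in place for both values could look roughly like the Python sketch below. This is an illustration only, not the jTEI stylesheet code; `NUMBERED_VALUES` and `is_numbered_list` are hypothetical names. It assumes @rend may hold several whitespace-separated tokens.

```python
import xml.etree.ElementTree as ET

# Assumption: the legacy value "ordered" is accepted alongside "numbered".
NUMBERED_VALUES = {"numbered", "ordered"}


def is_numbered_list(list_el):
    """Return True if any token of @rend marks the list as numbered."""
    tokens = (list_el.get("rend") or "").split()
    return any(token in NUMBERED_VALUES for token in tokens)


old_article = ET.fromstring('<list rend="ordered"><item>...</item></list>')
new_article = ET.fromstring('<list rend="numbered"><item>...</item></list>')
print(is_numbered_list(old_article), is_numbered_list(new_article))
```

With a check like this, older articles using "ordered" and newer ones using "numbered" take the same processing path.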

rvdb commented 3 years ago

@martindholmes I've updated the pull request at https://github.com/TEIC/Stylesheets/pull/466 accordingly.

Concerning jTEI articles: replacing "ordered" with "numbered" will break validation of older articles. At first sight, I'd be inclined to keep processing for "ordered" as well, so the transformations can be re-run for older articles without changing the source. OTOH, in the meantime processing has probably changed so much in other respects that the output will be different from the published articles anyway. The safest way of producing identical output would be to re-run the transformations with a version of the scripts at publication date.

martindholmes commented 3 years ago

@rvdb Older articles can be pointed at older versions of the schema in the Vault, like this:

https://tei-c.org/Vault/P5/3.6.0/xml/tei/custom/schema/relaxng/tei_jtei.rng

so we can ensure that none are actually invalid.
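A small Python sketch of how such a pinned schema reference could be generated for an article. The helper names `vault_schema_url` and `xml_model_pi` are hypothetical, and the URL pattern is extrapolated from the single 3.6.0 link above; it is an assumption that other releases follow the same layout.

```python
RELAXNG_NS = "http://relaxng.org/ns/structure/1.0"


def vault_schema_url(release, schema="tei_jtei"):
    """Build a Vault URL for a given P5 release (pattern assumed from the 3.6.0 example)."""
    return (
        f"https://tei-c.org/Vault/P5/{release}"
        f"/xml/tei/custom/schema/relaxng/{schema}.rng"
    )


def xml_model_pi(release):
    """Return an xml-model processing instruction that pins an article to that release."""
    return (
        f'<?xml-model href="{vault_schema_url(release)}" '
        f'type="application/xml" schematypens="{RELAXNG_NS}"?>'
    )


print(xml_model_pi("3.6.0"))
```

Placing the resulting processing instruction at the top of an article makes a validating editor check it against the schema of that release rather than the current one.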

rvdb commented 3 years ago

(apologies if this is becoming a different issue) @martindholmes Thanks, of course it needn't be any more complex than that. Yet, how to deal with "upcoming" changes to the jTEI ODD? Suppose this ODD change is made today in the dev Git branch, and tomorrow an article is published using the updated "numbered" @rend value for numbered lists; what schema version should it point to? In other words, would this mean that each TEI release requires an update of articles that had so far been valid only against a development version of the jTEI ODD?

martindholmes commented 3 years ago

Hi @rvdb. I don't think any jTEI article would be encoded using a dev version of the Guidelines, unless it's expected to be released after the current dev channel becomes a release. Otherwise, we'd use the current release of P5 and the current release of the jTEI ODD/schema to encode any new articles; then following encoding and publication, we might pin those articles to the release we used. (We might also pin them to the Stylesheets release used to generate the PDF etc., since that might change too, if we expect we're going to need to regenerate them.) If we do want or need to use dev-branch schemas or Stylesheets for an article, we can temporarily point it to a Jenkins version of the schema, and/or build it using the bleeding-edge Oxygen plugin, but with the expectation that when dev becomes release, we'd just update those pointers (and perhaps rebuild the outputs one more time).

I wouldn't expect this to happen very often, since the jTEI schema is pretty stable.