TEIC / Stylesheets

TEI XSL Stylesheets

Order of name components no longer imposed #629

Open HelenaSabel opened 9 months ago

HelenaSabel commented 9 months ago

This PR would close #325.

Deleting the templates for forename and surname (the solution presented in the Stylesheets meeting) was problematic on its own because it created a lot of whitespace problems in the outputs. The other changes in common-core.xsl propose a better handling of whitespace (and the addition of periods after the last author in bibliographic references).

I also made a couple of corrections in the test files because I consider encodings like this one (from test.xml) to be wrong:

<name><forename>Charles</forename><surname>Dickens</surname></name>

I therefore added a space between the forename and the surname in this case, and also in test27.xml and in the bibliographic test of Test2.

The test in Test concerning whitespace passed, but the one in Test2 failed (see changes in https://github.com/TEIC/Stylesheets/commit/75fbb0824aacb6785edb047981c47e91e733e5d3). I would like to ask the reviewers to pay special attention to this.

sydb commented 8 months ago

I also made a couple of corrections in the test files because I consider encodings like this one (from test.xml) to be wrong:

<name><forename>Charles</forename><surname>Dickens</surname></name>

That seems perfectly correct XML to me. (And I daresay, it is probably in test.xml like that just to make sure the Stylesheets do not screw up when a name is encoded like this.)

This encoding says “This is a name; it has two components, a forename and a surname; the forename is ‘Charles’; the surname is ‘Dickens’.”. It makes no assertion about what a processor should do with that information. One quite reasonable approach is to output “Dickens, Charles”. Another is to output “Charles Dickens”. A third is to generate the URL “https://en.wikipedia.org/wiki/Charles_Dickens”. A fourth is to generate the personography key “./authors.xml#Dickens.Charles”. And of course, crunching it into “CharlesDickens” is a viable (if ugly) alternative, too.
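To make the point concrete, here is a minimal Python sketch (not the Stylesheets' XSLT) that derives each of the outputs listed above from the same encoding. The function names `name_parts` and `renderings` and the dictionary keys are hypothetical, and the sample is parsed without the TEI namespace, as it is written in this thread.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote


def name_parts(xml: str) -> tuple[str, str]:
    """Return (forename, surname) from a <name> element, regardless of their order."""
    name = ET.fromstring(xml)
    # Assumption: un-namespaced elements, as in the samples quoted in this thread.
    forename = name.findtext("forename", default="")
    surname = name.findtext("surname", default="")
    return forename, surname


def renderings(xml: str) -> dict[str, str]:
    """Produce several equally defensible outputs from one and the same encoding."""
    forename, surname = name_parts(xml)
    return {
        "inverted": f"{surname}, {forename}",
        "direct": f"{forename} {surname}",
        "wikipedia": "https://en.wikipedia.org/wiki/" + quote(f"{forename}_{surname}"),
        "key": f"./authors.xml#{surname}.{forename}",
        "crunched": forename + surname,
    }


sample = "<name><forename>Charles</forename><surname>Dickens</surname></name>"
for label, value in renderings(sample).items():
    print(f"{label}: {value}")
```

Every one of these outputs is a processing decision; none of them is stated by the encoding itself.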

HelenaSabel commented 8 months ago

What concerns me about that is falling back into Western-centric practices, with the Stylesheets separating or reordering components according to conventions that are not universal. That’s why I think a space in the source file is relevant: in my opinion, it indicates that in a name like “Charles Dickens” the components are not agglutinated (whereas in other languages they are).

sydb commented 8 months ago

Really good point, but I do not think it invalidates the correctness of

<name><surname>石井</surname><forename>四郎</forename></name>

(Nor the correctness of

<name> <surname>Ishii</surname> <forename>Shirō</forename> </name>

nor of

<name>
  <surname>石井</surname>
  <pc force="strong">&#x20;</pc>
  <forename>四郎</forename>
</name>

.) I do not actually know a culture in which name components are agglutinated (Turkish? Finnish?), but certainly

<name>
  <surname>FAMILY</surname>
  <pc join="both"/>
  <forename>given</forename>
</name>

is worth considering.

Boils down to what used to be thought of as data-centric vs text-centered encoding, I guess.

All that said, just because <name><forename>John</forename><surname>Lennon</surname></name> is a perfectly reasonable encoding does not mean our Stylesheets have to process it.

HelenaSabel commented 8 months ago

This PR goes back to draft, pending some improvements in the handling of bibliographic references.

ebeshero commented 4 months ago

Council at VF2F 16 March 2024: We recognize that the changes here are definitely an improvement toward more culturally sensitive processing of the parts of a name based on the source encoding. @joeytakeda suggests we should keep the option to access the original template. For users who work with teiGarage, we should be sure this kind of processing is available to them at the command line.

@HelenaSabel should update the branch and then request reviewers look at this again to be sure it's okay to merge, and to decide whether to allow the option to access the original template.