IUBLibTech / newton_chymistry

New version of 'The Chymistry of Isaac Newton', using XProc pipelines to generate a website based on TEI XML encodings of Newton's alchemical manuscripts, and Apache Solr as a search engine.

orig-reg containing soft-hyphen entity and lb element #38

Closed wehooper closed 5 years ago

wehooper commented 5 years ago

We used orig-reg to handle words that Newton hyphenated.

For a good example, see Keynes 12 (ALCH00001), the line near the top that begins "7. Lapis componitur . . .": http://carbon.dlib.indiana.edu:8215/newton-dev/mss/dipl/ALCH00001/

In the diplomatic version, we used a &shy; entity to draw the hyphen and an <lb/> element to wrap the line. In the normalized version, we wrapped the line and then wrote the reg value as the first word of the new line. (I don't remember the details of the XSL, I'm afraid.)

On the 8220 instance, the diplomatic version works as it does on 8215, but in the normalized version the line doesn't wrap to match Newton's line break, as we intend it to.
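
The rendering convention described in this comment can be sketched roughly as follows. This is only an illustration in Python, not the project's XSL: the function name `render_orig`, the plain-text output with newline characters, and the assumption that `&shy;` has already been resolved to the character U+00AD are all hypothetical choices made for the example.

```python
import xml.etree.ElementTree as ET

SOFT_HYPHEN = "\u00ad"  # assumed: &shy; is already resolved to this character


def render_orig(orig, view):
    """Render a P4 <orig reg="..."> element as plain text (hypothetical helper).

    view is "diplomatic" or "normalized"; each <lb/> becomes a newline.
    Only <lb/> elements that are direct children of <orig> are considered.
    """
    has_lb = orig.find("lb") is not None
    if view == "normalized":
        # Wrap the line first, then write the whole reg value on the new line.
        reg = orig.get("reg", "".join(orig.itertext()))
        return "\n" + reg if has_lb else reg
    # Diplomatic: keep Newton's split, drawing the soft hyphen visibly.
    parts = [orig.text or ""]
    for child in orig:
        parts.append("\n" if child.tag == "lb" else "".join(child.itertext()))
        parts.append(child.tail or "")
    return "".join(parts).replace(SOFT_HYPHEN, "-")


sample = ET.fromstring('<orig reg="ejusdem">ejus\u00ad<lb/>dem</orig>')
print(repr(render_orig(sample, "diplomatic")))
print(repr(render_orig(sample, "normalized")))
```

In the normalized view the line break is emitted before the regularized word, so the wrap still falls where Newton started his new line.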

Conal-Tuohy commented 5 years ago

Thanks! I had noticed this myself and hadn't got around to raising it as an issue. It certainly looks ugly where it produces these sporadic extra-long lines.

This is slightly different in P5, since the @reg attribute is replaced with a <reg> element, which can itself contain an <lb/>; so it's possible to record the position of the line break in both the original and normalized forms. It seems to me I should implement this in the P5 conversion step, so that (if I understand the convention correctly) it performs the following transformation:

<orig reg="ejusdem">ejus&shy;<lb/>dem</orig>

<choice><orig>ejus-<lb/>dem</orig><reg><lb/>ejusdem</reg></choice>

Is that right?
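
For illustration, the proposed transformation might look something like the sketch below. It is written in Python with the standard library rather than as the XSLT actually used in the conversion step, and the function name `orig_reg_to_choice` is hypothetical. It assumes `&shy;` has already been resolved to U+00AD, leaves that character untouched inside `<orig>`, and only looks for an `<lb/>` that is a direct child of `<orig>`.

```python
import copy
import xml.etree.ElementTree as ET


def orig_reg_to_choice(orig_p4):
    """Build a P5 <choice><orig/><reg/></choice> from a P4 <orig reg="...">.

    Hypothetical sketch: if the original reading contains an <lb/>, the
    <reg> gets a leading <lb/> followed by the whole regularized word.
    """
    reg_value = orig_p4.get("reg")
    choice = ET.Element("choice")

    # Copy the original reading, dropping the P4-only reg attribute.
    orig = copy.deepcopy(orig_p4)
    orig.attrib.pop("reg", None)
    orig.tail = None
    choice.append(orig)

    reg = ET.SubElement(choice, "reg")
    if orig_p4.find("lb") is not None:
        lb = ET.SubElement(reg, "lb")
        lb.tail = reg_value  # text after the line break
    else:
        reg.text = reg_value

    # Keep whatever text followed the original element in its parent.
    choice.tail = orig_p4.tail
    return choice


p4 = ET.fromstring('<orig reg="ejusdem">ejus\u00ad<lb/>dem</orig>')
print(ET.tostring(orig_reg_to_choice(p4), encoding="unicode"))
```

Because the `<lb/>` is carried into both `<orig>` and `<reg>`, either reading can be rendered with the line wrapping at the same place.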

Conal-Tuohy commented 5 years ago

Handled in the P4-to-P5 conversion, in 7b870cf4b155066a8af19fa0a1d9d660044fbbb4