Open vphill opened 2 years ago
Just curious if this is a typo or me reading it incorrectly. In the third row of that table, the note doesn't seem to match the example. Mismatch between strong and weak.
| Colloquial name | Appearance in source document | Encoding | Note |
|---|---|---|---|
| Soft hyphen | UTF-8 is a char- acter encoding for Unicode. | `UTF-8 is a char<pc force="strong">-</pc><lb break="yes"/>acter encoding for Unicode.` | As in the first example, the use of weak as the value of force indicates that the encoder considers "character" to be a single orthographic token where the hyphen is only indicating that the word is broken across a line. The use of no as the value of break also indicates that the line break occurs inside an orthographic token (single word) which is broken across a line. |
Wow, that's a major editing error. The discussion of the value of @break also doesn't match. Since I don't remember what it's supposed to be, I've opened an issue: https://github.com/kshawkin/Best-Practices-for-TEI-in-Libraries/issues/96
So is the goal something that looks like this?
When a government has ceased to protect the lives, liberty and
property of the people, from whom its legitimate powers are de<pc force="weak">-</pc><lb break="no" />
rived, and for the advancement of whose happiness it was insti<pc force="weak">-</pc><lb break="no" />
tuted: and so far from being a guarantee for their inestimable and
Or something more like this?
When a government has ceased to protect the lives, liberty and
property of the people, from whom its legitimate powers are de<pc force="weak">-</pc><lb break="no" />rived,
and for the advancement of whose happiness it was insti<pc force="weak">-</pc><lb break="no" />tuted:
and so far from being a guarantee for their inestimable and
Basically do you pull the remainder of the word from the following line or leave it as it is?
And I guess this would be another option based on more reading of the <lb />
When a government has ceased to protect the lives, liberty and
property of the people, from whom its legitimate powers are de<pc force="weak">-</pc>
<lb break="no" />rived, and for the advancement of whose happiness it was insti<pc force="weak">-</pc>
<lb break="no" />tuted: and so far from being a guarantee for their inestimable and
Those are all equivalent according to XML's rules about whitespace, which ignore line breaks (and multiple spaces). So you can create your XML in whatever way helps with readability during creation, but you should always keep in mind that if you process your XML with XSLT, or use an XML-aware editor like oXygen, it might end up screwing up your pretty formatting anyway.
So if I understand, that would also be the same as,
When a government has ceased to protect the lives, liberty and property of the people, from whom its legitimate powers are de<pc force="weak">-</pc><lb break="no" />rived, and for the advancement of whose happiness it was insti<pc force="weak">-</pc><lb break="no" />tuted: and so far from being a guarantee for their inestimable and
I guess the thing I hadn't been thinking about correctly with this so far is that if I am interested in preserving the lines that Gammel put on the page, I will need to add an explicit <lb /> to each line. Otherwise the line breaks shouldn't be assumed to be there just because they might show up in the text editor.
Exactly!
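To make the point concrete, here is a minimal Python sketch that reduces each of the layouts discussed above to its running text and checks that they agree. It is not taken from the Best Practices document. The function name `reading_text`, the wrapping `<p>` element, the shortened sample text, and the absence of a TEI namespace are all assumptions made for illustration. It also assumes the consumer honors `break="no"` by discarding any whitespace on either side of that `<lb/>`, since a plain XML parser still reports the newline as whitespace.

```python
import xml.etree.ElementTree as ET

JOIN = "\uE000"  # private-use marker meaning "no word break here" (assumption)

def reading_text(fragment: str) -> str:
    """Return the running text of a fragment, rejoining words split at <lb break="no"/>."""
    # Wrap in a hypothetical <p> so the fragment parses as a single element.
    root = ET.fromstring(f"<p>{fragment}</p>")
    parts = [root.text or ""]
    for el in root:
        if el.tag == "lb":
            # break="no" joins the neighbouring text; any other <lb/> separates words.
            parts.append(JOIN if el.get("break") == "no" else " ")
        elif not (el.tag == "pc" and el.get("force") == "weak"):
            # Keep the content of everything except weak (line-end) hyphens.
            parts.append("".join(el.itertext()))
        parts.append(el.tail or "")
    segments = "".join(parts).split(JOIN)
    text = segments[0]
    for seg in segments[1:]:
        # Drop whitespace on both sides of a break="no" line break.
        text = text.rstrip() + seg.lstrip()
    # Collapse remaining newlines and runs of spaces to single spaces.
    return " ".join(text.split())

layouts = [
    'powers are de<pc force="weak">-</pc><lb break="no" />\nrived, and for',
    'powers are de<pc force="weak">-</pc><lb break="no" />rived,\nand for',
    'powers are de<pc force="weak">-</pc>\n<lb break="no" />rived, and for',
    'powers are de<pc force="weak">-</pc><lb break="no" />rived, and for',
]
for layout in layouts:
    print(reading_text(layout))
```

Because the source line breaks never reach the output unless an `<lb/>` carries them, the sketch also illustrates why the explicit elements are needed to preserve the printed lines.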
Decide and document how to deal with end-of-line hyphenation.
From the Documenting the American South project: "Any hyphens occurring in line breaks have been removed, and the trailing part of a word has been joined to the preceding line." - https://docsouth.unc.edu/imls/texconst/texconst.html
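For comparison, the policy quoted above can be sketched on plain text lines in Python. This is only an illustration of the stated rule, not the project's actual tooling; the function name `join_hyphenated` is hypothetical. It treats every line-final hyphen as a soft hyphen, so a genuinely hyphenated word that happens to break at the hyphen would be joined incorrectly.

```python
def join_hyphenated(lines):
    """Remove line-end hyphens and move the rest of the word up to the preceding line."""
    result = []
    for line in lines:
        line = line.strip()
        if result and result[-1].endswith("-"):
            # The first token of this line is the trailing part of the broken word.
            head, _, rest = line.partition(" ")
            result[-1] = result[-1][:-1] + head
            line = rest
        if line:
            result.append(line)
    return result

page = [
    "property of the people, from whom its legitimate powers are de-",
    "rived, and for the advancement of whose happiness it was insti-",
    "tuted: and so far from being a guarantee for their inestimable and",
]
for line in join_hyphenated(page):
    print(line)
```

Unlike the `<pc>` and `<lb>` encoding, this approach does not record where the original line break fell inside the word, which is one reason the choice should be documented.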