sydb closed this issue 3 years ago
The perl fiddling is necessary, I assume, because the filenames of the gl chapters cannot be reliably derived from an attribute of their root elements, as used to be the case before the so-called user-friendly filenames were introduced. Why not retrofit those names as values of @n on the root element to save doing this sort of hackery in the future?
Nope. The perl-fiddling is to get rid of the namespaces that XSLT sticks in front of every element. (I know they are in the TEI namespace, since those are the only ones I matched; I did not look for ones in the examples namespace; maybe I should.) It was quite a pain, actually.
The filename is obtained with base-uri(/). If you really only want the filename, rather than the whole path, use something like base-uri(/)!tokenize(.,'/')[last()].
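For readers less used to XPath, here is a rough Python sketch of what the tokenize(.,'/')[last()] step does: split the URI on '/' and keep the last piece. The function name and the sample URI are made up for illustration; they are not from the build.

```python
def filename_from_uri(uri):
    """Split on '/' and keep the last token, like tokenize(., '/')[last()]."""
    return uri.split("/")[-1]


# Hypothetical base URI, for illustration only.
print(filename_from_uri("file:/path/to/temp/AB.xml"))  # prints AB.xml
```

Note that, like the XPath version, this returns an empty string if the URI ends in a slash.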
Fix applied; seemed to be fine in my local build. Closing this ticket. If Mr. Jenkins objects, will re-open.
There are just over a dozen instances in the GLs of <ref> for which the value of @target is the same as the content of the <ref>.[1] These should almost certainly be <ptr> instead.
So I ran an XSLT program to look for them, allowing for white space normalization and for the possibility that the final character (like a / or a #) might differ even though the values are otherwise the same.[2]
There are 14 of them in 7 files.[3] Since there are so few, my plan is to just edit each by hand, then re-build the HTML and see if the results are the same.
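The comparison described above can be sketched as follows. This is not the XSLT that was actually run, just a minimal Python illustration of the same idea under my own assumptions: collapse white space, drop one trailing / or # from each value, and report each <ref> whose content then equals its @target. The helper names and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def normalize(value):
    """Collapse runs of white space, trim, and drop one trailing '/' or '#'."""
    collapsed = " ".join(value.split())
    return collapsed[:-1] if collapsed.endswith(("/", "#")) else collapsed


def redundant_refs(xml_text):
    """Return the @target of each TEI <ref> whose text content matches it."""
    root = ET.fromstring(xml_text)
    hits = []
    for ref in root.iter(f"{{{TEI_NS}}}ref"):
        target = ref.get("target")
        if target is None:
            continue
        content = "".join(ref.itertext())
        if normalize(target) == normalize(content):
            hits.append(target)
    return hits


# Hypothetical sample: the first and third <ref> should be reported.
sample = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <p><ref target="http://www.example.edu/">http://www.example.edu</ref></p>
  <p><ref target="http://www.example.edu/a">the A page</ref></p>
  <p><ref target="#SG">
      #SG </ref></p>
</TEI>"""

print(redundant_refs(sample))  # ['http://www.example.edu/', '#SG']
```

Dropping the trailing character from both sides before comparing is one way to read "the last character might be different"; a stricter check might only ignore it on one side.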
The only question is what to do when the URL only has a host, not a file. (Or, to be technically correct, has an authority component and the path component is empty.) My understanding is that http://www.example.edu is technically more correct than http://www.example.edu/, but I think the latter looks so much better, I am inclined to use it anyway. Thoughts?
Notes
[1] Run the XPath 1.0 expression
count( //t:ref[@target=.] )
over p5.xml to test. Note that this is only looking for exact matches without space-normalization. There might be a few more if you normalize white space first or strip off the last character before comparison.
[2] Just to make my life a little easier I copied P5/Source/Guidelines/en/*.xml and P5/Source/Specs/*.xml into a temporary directory, and then ran the following XSLT on that directory using Saxon.
The result is 879 tiny little files, which when joined with
perl -pe 's,Q.http://www.tei-c.org/ns/1.0.,,g;' /path/to/temp/output/* > /path/to/output/file.txt
gave me the end result in [3].
[3] Where they are: