DanielHeidt closed this 6 years ago
I will have a look on Tuesday.
@DanielHeidt I recall an earlier conversation about the display of <fw>; do we want these numbers visible at all? As they interrupt the text, maybe there is a better way of handling this information?
I have a vague recollection of talk of putting them in the margin.
The red in-line page numbers are placed correctly as per my discussions with @FrankFlitton. The problem is that they aren't encoded in the right place in the XML. We deliberately placed the page numbers in-line, so that readers will know exactly where in the text the page changes occurred.
This XML markup problem only occurs in some H of C records. Take, for example, 5142 from the linked example above. The element should instead be placed right after the <cb/> (my apologies for the previous reference to <hr/>, which we had used in the early transcriber workflow). This has been done correctly in other documents like http://hcmc.uvic.ca/confederation/en/lgPCLC_1865-02-09.html.
A possible script check to resolve this error could be something like: search for <cb/> and, when found, check the preceding lines to see whether two <fw type> elements appear within a few lines of each other (I've also seen occasions in the code where the two <fw type> elements appear on the same line). If this test is true, move the second <fw type> element to immediately after the <cb/>. There might be other ways of doing this, but since there is a clear pattern, there is likely a way to script a fix without too much tinkering.
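To make the idea concrete, here is a rough Python sketch of that kind of check. It is only an illustration under several assumptions: the paired <fw> elements and the <cb/> are siblings under the same parent, the two <fw> elements sit next to each other with nothing but whitespace between them, and the function names (`move_second_fw`, `fix_tree`) and the `type="num"` value in the sample are made up for this example. It is not the transformation used in the project.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Return the tag name without any '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1] if isinstance(tag, str) else ""


def move_second_fw(parent):
    """Move the second of two adjacent <fw> siblings to just after the next <cb/>.

    Returns the number of <fw> elements moved within this parent.
    """
    moved = 0
    children = list(parent)
    i = 0
    while i < len(children) - 1:
        first, second = children[i], children[i + 1]
        # A "pair" is two <fw> siblings separated only by whitespace.
        is_pair = (
            local(first.tag) == "fw"
            and local(second.tag) == "fw"
            and not (first.tail or "").strip()
        )
        if is_pair:
            # Look for the next <cb/> sibling after the pair.
            cb_index = next(
                (j for j in range(i + 2, len(children))
                 if local(children[j].tag) == "cb"),
                None,
            )
            if cb_index is not None:
                cb = children[cb_index]
                # Text that followed the second <fw> stays where it was.
                first.tail = (first.tail or "") + (second.tail or "")
                # Text that followed the <cb/> now follows the moved <fw>.
                second.tail = cb.tail
                cb.tail = None
                parent.remove(second)
                # After the removal the <cb/> sits at cb_index - 1,
                # so inserting at cb_index places <fw> right after it.
                parent.insert(cb_index, second)
                children = list(parent)
                moved += 1
        i += 1
    return moved


def fix_tree(root):
    """Apply the check to every element in the tree; return the total moved."""
    return sum(move_second_fw(el) for el in list(root.iter()))


# Hypothetical sample: two column numbers encoded as a pair before the first column.
sample = (
    '<div><fw type="num">5141</fw><fw type="num">5142</fw>'
    "<p>left column</p><cb/><p>right column</p></div>"
)
root = ET.fromstring(sample)
fix_tree(root)
print(ET.tostring(root, encoding="unicode"))
```

Working on the parsed tree rather than on raw lines means it does not matter whether the two <fw> elements are on the same line or a few lines apart. Pairs with no following <cb/> under the same parent are left untouched, so those would still need a manual look.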
@lyallg please have the encoders begin to mark column numbers like page numbers in the texts from this point forward to facilitate better citations of this project's records.
XSLT to fix this, along with fixed versions of all these files, done in commit b85f8b60b. If all builds well and looks OK, I'll close this.
OK, working now. Future AB_SK docs will have to be run through the transformation in code/xslt/move_column_numbers.xsl but that's trivial. Final changes committed in #48953f1e5.
On pages like http://hcmc.uvic.ca/confederation/en/lgHC_AB_SK_1905-05-01.html, the column numbers (which act as page numbers in House of Commons records) are being encoded as pairs, rather than with their respective columns. Is it possible to position these column numbers at the top of each column in the XML so that they better approximate the original records? Could this be done with a script that looks for two numbers within an element and, when true, adds the second page number immediately after the <hr>?