projectLEMDO / lemdoIssues

Repository for LEMDO issue tracking and related documents.
MIT License

Question about insertion of blank pages and page breaks #143

Closed JanelleJenstad closed 1 year ago

JanelleJenstad commented 1 year ago

@martindholmes: I can't figure out why there are so many lines in the html output for emdDouai_JC.xml. See photo below for the output.

Here's the XML:

<front>
      <pb n="131r"/>
      <fw type="pageNum" place="plc-right-top">131</fw>
      <titlePage rendition="rnd:centre">
        <space dim="vertical" unit="line" quantity="8" cert="low"/>
        <docTitle>
          <titlePart type="main"><hi rendition="rnd:x-large">J</hi>ulius <hi rendition="rnd:x-large"
              >C</hi>æsar</titlePart>
        </docTitle>
      </titlePage>

      <pb n="131v"/>
      <pb n="132r"/>
      <fw type="pageNum" place="plc-right-top">132</fw>
      <pb n="132v"/>

      <castList>
        <lb/>
        <label place="plc-centre">Drammatis Personæ</label>

And here is the HTML:

<div data-el="front">
  <div class="page" data-empty="true">
  </div>
  <div class="page">
    <div data-el="pb" data-n="131r" data-sig="131r" data-pbNum="131r"></div>
    <span data-el="fw" data-type="pageNum" data-place="plc-right-top">131</span>
    <div data-el="titlePage" data-rendition="#rnd_centre" class="rnd_centre">
      <span style="display: block; line-height: 8; " data-el="space" data-dim="vertical" data-unit="line" data-quantity="8" data-cert="low"></span>
      <div data-el="docTitle">
        <h2><span data-el="hi" data-rendition="#rnd_x-large" class="rnd_x-large">J</span>ulius <span data-el="hi" data-rendition="#rnd_x-large" class="rnd_x-large">C</span>æsar</h2>
      </div>
    </div>
  </div>
  <div class="page" data-empty="true">
    <div data-el="pb" class="blank" data-n="131v" data-sig="131v" data-pbNum="131v"></div>
  </div>
  <div class="page">
    <div data-el="pb" data-n="132r" data-sig="132r" data-pbNum="132r"></div>
    <span data-el="fw" data-type="pageNum" data-place="plc-right-top">132</span>
  </div>
  <div class="page">
    <div data-el="pb" data-n="132v" data-sig="132v" data-pbNum="132v"></div>


I think that both and are being turned into horizontal lines. It should just be

[image]

martindholmes commented 1 year ago

@JanelleJenstad Do you mean too many horizontal lines? If so, which one should not be there? What is "both" -- page beginnings and something else? The output looks more or less right to me, given the number of block items, page beginnings and line beginnings.

JanelleJenstad commented 1 year ago

It's just the first horizontal line which is superfluous. There's no encoding in the file that should be generating this line, AFAICS. I've left some notes in the XML file so you can see whence the problem arises. The notes are prefixed with "MH".

This is not urgent.

martindholmes commented 1 year ago

Is it possible that that horizontal line is just the divider between the non-transcriptional page title and the beginning of the actual transcription?

JanelleJenstad commented 1 year ago

@LEMDO-PM: I've looked at the XML files and tried to match <pb/> elements with lines. Would you please check emdDouai_JC and emdDouai_Mac to see if you think the number of lines rendered matches up with the number of <pb/>s?
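
For anyone repeating this check, counting the <pb/> elements by script may be quicker than counting by eye. The following is only a minimal sketch, not part of the LEMDO build: the function names and the trimmed sample are assumptions for illustration, and it matches on local element names so that it works whether or not the file declares the TEI namespace.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip any '{namespace}' prefix that ElementTree adds to a tag."""
    return tag.rsplit("}", 1)[-1]


def count_page_beginnings(tei_xml: str) -> int:
    """Count <pb> elements anywhere in the given XML string."""
    root = ET.fromstring(tei_xml.strip())
    return sum(
        1
        for el in root.iter()
        if isinstance(el.tag, str) and local_name(el.tag) == "pb"
    )


# Hypothetical sample: the <front> excerpt quoted above, trimmed and closed.
SAMPLE_FRONT = """
<front>
  <pb n="131r"/>
  <fw type="pageNum" place="plc-right-top">131</fw>
  <titlePage rendition="rnd:centre">
    <docTitle><titlePart type="main">Julius Cæsar</titlePart></docTitle>
  </titlePage>
  <pb n="131v"/>
  <pb n="132r"/>
  <fw type="pageNum" place="plc-right-top">132</fw>
  <pb n="132v"/>
</front>
"""

if __name__ == "__main__":
    print(count_page_beginnings(SAMPLE_FRONT))
```

The resulting count can then be compared with the number of "Page break" lines visible in the rendered page.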

LEMDO-PM commented 1 year ago

There seem to be a few extra horizontal lines showing up. One at the top (as you noted) and one at the beginning of the <body> element. After that, they appear to be correct. I also took a look at H5_Q1 to see if it was doing the same thing in a print play and it is (there should be one blank page after the title page, not two). We also have an extra horizontal line at the start of <back> (I looked at emd2H4_F1).

Looking in inspector view, we do have <div class="page" data-empty="true"> generating right after the <div>s for front, body, and back, so I expect that's why we're seeing extra lines. I'll take a look at the CSS, but I think that Pat added the pseudoelement to show up where there's a <div> with @class="page" in the HTML, which doesn't appear to exactly match our <pb> elements in TEI. @martindholmes, I'm curious about where the extra "page" divs are coming from. Where would I look to see that? Somewhere in code/conversion?
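
The mismatch described above can also be checked mechanically: any <div class="page"> that contains no <div data-el="pb"> has no corresponding <pb> in the TEI. The sketch below is an illustration only, using Python's standard html.parser; the class name, function name, and sample markup are assumptions (the sample is a trimmed version of the output quoted earlier), and it says nothing about how the conversion code itself works.

```python
from html.parser import HTMLParser


class PageDivAudit(HTMLParser):
    """Record each <div class="page"> and whether it encloses a pb div."""

    def __init__(self):
        super().__init__()
        self._open_divs = []  # one entry per open <div>: a page record or None
        self.pages = []       # page records in document order

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        a = dict(attrs)
        record = None
        if "page" in (a.get("class") or "").split():
            record = {"has_pb": False, "n": None}
            self.pages.append(record)
        elif a.get("data-el") == "pb":
            # Credit the pb to the nearest enclosing page div, if any.
            for open_div in reversed(self._open_divs):
                if open_div is not None:
                    open_div["has_pb"] = True
                    open_div["n"] = a.get("data-n")
                    break
        self._open_divs.append(record)

    def handle_endtag(self, tag):
        if tag == "div" and self._open_divs:
            self._open_divs.pop()


def pages_without_pb(html: str) -> list:
    """Return the indexes of page divs that contain no pb div."""
    audit = PageDivAudit()
    audit.feed(html)
    return [i for i, page in enumerate(audit.pages) if not page["has_pb"]]


# Hypothetical sample, trimmed from the HTML quoted earlier in the thread.
SAMPLE_HTML = """
<div data-el="front">
  <div class="page" data-empty="true"></div>
  <div class="page"><div data-el="pb" data-n="131r"></div></div>
  <div class="page" data-empty="true">
    <div data-el="pb" class="blank" data-n="131v"></div>
  </div>
  <div class="page"><div data-el="pb" data-n="132r"></div></div>
  <div class="page"><div data-el="pb" data-n="132v"></div></div>
</div>
"""

if __name__ == "__main__":
    print(pages_without_pb(SAMPLE_HTML))
```

On the trimmed sample, only the first page div (the one emitted immediately after the front div) lacks a pb, which matches the superfluous first line reported above.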

I also think that we should add some padding above the pseudoelements so that page numbers for otherwise blank pages have some space before the following page's "Page break" line. Janelle made a note about that in emdDouai_JC on <pb n="132r"/>.

martindholmes commented 1 year ago

I think I've figured this one out, and committed a fix in rev 13044. If all is well in the eventual build from that rev, this can be closed.

martindholmes commented 1 year ago

All good I believe. Closing.