Open rhdunn opened 4 years ago
When rendering the CHED elements (e.g. the maintemp1 template on lines 740-749), the stylesheet is using <xsl:value-of select="."/>. However, the XML (e.g. https://www.federalregister.gov/documents/full_text/xml/2019/11/01/2019-23800.xml, lines 2345-2347) can contain:
<CHED H="1">Average<LI>unweighted</LI>
<LI>amount</LI>
</CHED>
This gets rendered incorrectly as:
<th id="GPOHEADERS" class="CHED">Averageunweighted
amount
</th>
The <xsl:value-of select="."/> should be replaced with <xsl:apply-templates/> on lines 746 and 755 (the maintemp1 and h2 named templates). This then generates the HTML:
<th id="GPOHEADERS" class="CHED">Average<span class="LI CHED-LI">unweighted</span>
<span class="LI CHED-LI">amount</span>
</th>
This then needs the following CSS to display correctly:
.CHED-LI {display:block;}
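The difference between the two instructions can be seen in isolation with a minimal standalone stylesheet. This is only a sketch, not an excerpt from the real stylesheet: the LI template is a hypothetical stand-in for whatever generic span template the stylesheet actually uses, and the class names simply follow the output shown above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="CHED">
    <th class="CHED">
      <!-- apply-templates hands each child node (text and LI elements) to its
           own template; value-of select="." would instead emit the string
           value, i.e. all descendant text concatenated with the markup lost. -->
      <xsl:apply-templates/>
    </th>
  </xsl:template>

  <!-- Hypothetical stand-in for the stylesheet's generic span template. -->
  <xsl:template match="LI">
    <span class="{name()} {name(parent::*)}-{name()}">
      <xsl:apply-templates/>
    </span>
  </xsl:template>
</xsl:stylesheet>
```

Run against the CHED fragment above, this should give one span with the class LI CHED-LI per LI child instead of the flattened text.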
The table of contents does not get rendered by the stylesheet but is present in the HTML and PDF documents. That is, the FP elements in the EXTRACT element of the table of contents are not rendered. The title (an HD element) is rendered.
This also includes other FP elements that are not email addresses. For example, the "50.40(a) (19 respondents)" text after "Estimated average hours per response:" in https://www.federalregister.gov/documents/2019/11/01/2019-23800/changes-to-applicability-thresholds-for-regulatory-capital-and-liquidity-requirements.
The simplest fix for this is to include the following in the FP element template:
<xsl:if test="not($fpcontent1 = 'Email:')">
  <xsl:call-template name="apply-span"/>
</xsl:if>
This works, and matches the rendering of the HTML page, but does not match the rendering of the PDF document. Specifically, the sub-sections labelled A-Z are not indented relative to the sections with roman numeral numbering. That is, the FP elements with a SOURCE attribute set to FP1-2 are not indented. NOTE: This information is not added to the class in apply-span, so those elements cannot currently have a CSS indent applied to them.
The following FP element template adds the SOURCE attribute to the classes so they can be styled correctly:
<xsl:template match="FP">
  <xsl:variable name="fpcontent1" select="substring(.,1,6)"/>
  <xsl:choose>
    <xsl:when test="$fpcontent1 = 'Email:'">
      <xsl:text>Email: </xsl:text>
      <xsl:variable name="fpcontent2" select="substring(.,8)"/>
      <a>
        <xsl:attribute name="href">
          <xsl:text>mailto:</xsl:text>
          <xsl:value-of select="$fpcontent2"/>
        </xsl:attribute>
        <xsl:value-of select="$fpcontent2"/>
      </a>
    </xsl:when>
    <xsl:when test="./@SOURCE">
      <xsl:variable name="collapseSource" select="./@SOURCE"/>
      <span>
        <xsl:attribute name="class">
          <xsl:value-of select="name()"/>
          <xsl:text> </xsl:text>
          <xsl:value-of select="name(parent::*)"/>
          <xsl:text>-</xsl:text>
          <xsl:value-of select="name()"/>
          <xsl:text> </xsl:text>
          <xsl:value-of select="name(parent::*)"/>
          <xsl:text>-</xsl:text>
          <xsl:value-of select="$collapseSource"/>
        </xsl:attribute>
        <xsl:apply-templates/>
      </span>
    </xsl:when>
    <xsl:otherwise>
      <span>
        <xsl:attribute name="class">
          <xsl:value-of select="name()"/>
          <xsl:text> </xsl:text>
          <xsl:value-of select="name(parent::*)"/>
          <xsl:text>-</xsl:text>
          <xsl:value-of select="name()"/>
        </xsl:attribute>
        <xsl:apply-templates/>
      </span>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
This results in the FP EXTRACT-FP EXTRACT-FP-2 class for level 1 ToC elements, and FP EXTRACT-FP EXTRACT-FP1-2 for the level 2 elements.
The A-Z elements can then be indented using the following CSS:
.EXTRACT-FP1-2 {margin-left:20pt;}
The CSS rule for the .SUPLINF-HD3 class is missing a display:block; style. This likely applies to other -HD3 based classes.
Thanks for letting us know about this, @rhdunn. We'll need to look into this to see that those proposed changes won't have a negative impact elsewhere.
The GPOHEADERS and GPOH2HEADERS ids are generated multiple times, which is invalid -- ids should be unique. Therefore, they should be classes. Specifically, the CSS should be:
.GPOHEADERS {font-weight:bold;font-size:9pt;text-align:center;border-left-style:solid;border-right-style:solid;border-width:1px;border-bottom-style:solid;border-top-style:solid;border-width:1px;border-color:black;}
.GPOH2HEADERS {font-weight:bold;font-size:9pt;text-align:center;border-left-style:solid;border-right-style:solid;border-width:1px;border-bottom-style:solid;border-top-style:solid;border-width:1px;border-color:black;}
while the maintemp1 template should be:
<xsl:template name="maintemp1">
  <xsl:for-each select="CHED">
    <th>
      <xsl:attribute name="class">
        <xsl:text>GPOHEADERS </xsl:text>
        <xsl:value-of select="name()"/>
      </xsl:attribute>
      <xsl:apply-templates/>
    </th>
  </xsl:for-each>
</xsl:template>
and the h2 template should be:
<xsl:template name="h2">
  <xsl:for-each select="CHED[@H=2]">
    <th class="GPOH2HEADERS"><xsl:apply-templates/></th>
  </xsl:for-each>
</xsl:template>
UPDATE 1: The other id="GPOHEADERS" attributes should then be class="GPOHEADERS".
The headers in the GPO tables do not have leftmost/rightmost borders in either the PDF or HTML versions of the rules. This can be achieved (with the change from id to class) using the following CSS:
.GPOHEADERS:first-child {border-left:none;}
.GPOHEADERS:last-child, .GPOH2HEADERS:last-child {border-right:none;}
The table body borders also don't match the PDF or HTML versions. I haven't investigated this yet.
The GPOTABLE class should have the display:table; style instead of the display:block; style. The display:block; style is preventing the width:100% style from having an effect.
For tables like "Table IV—Timeline for Initial Categorizations and Reporting Under the Final Rule" in https://www.federalregister.gov/d/2019-23662, some of the columns should span 2 columns, but are only spanning 1 column.
The fix for this is to add the following to the MyENT template after the class attribute:
<xsl:if test="./@A=01"><xsl:attribute name="colspan">2</xsl:attribute></xsl:if>
Additionally, the ROW template needs to be adjusted so that the NumOfENT variable is changed to:
<xsl:variable name="NumOfENT" select="count(child::ENT) + count(child::ENT[@A=01])"/>
If the A attribute can have a value other than 01, and indicates how many additional columns to span, then the MyENT template should have:
<xsl:if test="./@A"><xsl:attribute name="colspan"><xsl:value-of select="./@A + 1"/></xsl:attribute></xsl:if>
and the NumOfENT variable should be:
<xsl:variable name="NumOfENT" select="count(child::ENT) + sum(child::ENT/@A)"/>
I've checked another document and it has the A attribute set to L01, so the following will be needed instead:
<xsl:variable name="NumOfENT" select="count(child::ENT) + count(child::ENT[@A=('01', 'L01')])"/>
and
<xsl:if test="./@A=('01', 'L01')"><xsl:attribute name="colspan">2</xsl:attribute></xsl:if>
The table cells in the HTML and PDF documents do not use hashed borders. Instead, they have solid black borders down the middle and at the bottom. This can be achieved using the following CSS:
.ENT {border-left-style:solid;border-right-style:solid;border-top-style:none;border-bottom-style:none;}
.ENT:first-child {border-left:none;}
.ENT:last-child {border-right:none;}
.ROW:last-child > .ENT {border-bottom-style:solid;}
Additional borders appear to be governed by the RUL attribute on the ROW element. I don't currently have styles for these, so they would need to be provided before using this change.
UPDATE 1: The tables that have TNOTE elements will not display the bottom row border with the styles above. They need the following additional CSS:
tr:not(.ROW) > .TNOTE {border-top-style:solid;border-width:1px;padding-top:1em;}
tr:not(.ROW) + tr:not(.ROW) > .TNOTE {border-top-style:none;padding-top:3pt;}
The second style is for tables that have multiple TNOTE elements.
UPDATE 2: The inserted MyENT cells that pad the remaining columns need to make use of the ENT class so that their borders are correctly styled. This requires making MyENT a class (so the id CSS does not take precedence over the ENT class):
<xsl:attribute-set name="td-list">
  <xsl:attribute name="class">MyENT ENT</xsl:attribute>
</xsl:attribute-set>
with the #MyENT CSS rule renamed to .MyENT.
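As an illustration of how an attribute set like this reaches the output, here is a small standalone sketch. The pad-cell named template and the way it is called are hypothetical (the real stylesheet decides how many padding cells to insert from NumOfENT); only the td-list attribute set is taken from the snippet above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:attribute-set name="td-list">
    <xsl:attribute name="class">MyENT ENT</xsl:attribute>
  </xsl:attribute-set>

  <!-- Hypothetical helper: emits one empty padding cell carrying both classes. -->
  <xsl:template name="pad-cell">
    <td xsl:use-attribute-sets="td-list"/>
  </xsl:template>

  <xsl:template match="ROW">
    <tr class="ROW">
      <xsl:apply-templates select="ENT"/>
      <!-- For illustration only: always append a single padding cell. -->
      <xsl:call-template name="pad-cell"/>
    </tr>
  </xsl:template>

  <xsl:template match="ENT">
    <td class="ENT"><xsl:apply-templates/></td>
  </xsl:template>
</xsl:stylesheet>
```

Because the padding cell now carries the ENT class rather than an id, the .ENT border rules apply to it in the same way as to the real cells.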
The rules I have currently worked out are:
<xsl:attribute name="class">
  <xsl:choose>
    <xsl:when test="./@RUL='rn,s'">ROW-RUL-NSBAR </xsl:when>
    <xsl:when test="./@RUL='n,s'">ROW-RUL-NSBAR </xsl:when>
    <xsl:when test="./@RUL='s'">ROW-RUL-SBAR </xsl:when>
  </xsl:choose>
  <xsl:value-of select="name()"/>
</xsl:attribute>
with the corresponding CSS:
.ROW.ROW-RUL-NSBAR > .ENT, .ROW.ROW-RUL-SBAR > .ENT {border-bottom-style:solid;}
.ROW.ROW-RUL-NSBAR > .ENT:first-child {border-bottom-style:none;}
The EXPSTB attribute on a ROW element looks like it applies to the colspan logic for an ENT element. Therefore, the NumOfENT variable in the ROW template should be calculated as:
<xsl:variable name="expstb" select="(./@EXPSTB, '0')[1]"/>
<xsl:variable name="NumOfENT" select="count(child::ENT) + count(child::ENT[@A=('01', 'L01')]) + $expstb"/>
and the ENT element td/@colspan attribute as:
<xsl:choose>
  <xsl:when test="../@EXPSTB and position()=1"><xsl:attribute name="colspan"><xsl:value-of select="../@EXPSTB + 1"/></xsl:attribute></xsl:when>
  <xsl:when test="./@A=('01', 'L01')"><xsl:attribute name="colspan">2</xsl:attribute></xsl:when>
</xsl:choose>
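The expressions above use XPath 2.0 sequences such as ('01', 'L01') and (./@EXPSTB, '0')[1]. If the stylesheet has to run on an XSLT 1.0 processor (an assumption; it is not confirmed here which processor is used), the same logic can be sketched as below. The data-columns attribute is hypothetical and only there to make the computed count visible, and the sketch assumes EXPSTB always holds a number.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="ROW">
    <!-- sum() over an empty node-set is 0, so a missing EXPSTB adds nothing. -->
    <xsl:variable name="expstb" select="sum(@EXPSTB)"/>
    <!-- Each ENT counts once, plus once more when it spans an extra column. -->
    <xsl:variable name="NumOfENT"
                  select="count(ENT) + count(ENT[@A='01' or @A='L01']) + $expstb"/>
    <tr class="ROW" data-columns="{$NumOfENT}">
      <xsl:apply-templates select="ENT"/>
    </tr>
  </xsl:template>

  <xsl:template match="ENT">
    <td class="ENT">
      <xsl:choose>
        <!-- First ENT of a row with EXPSTB: widen the stub cell. -->
        <xsl:when test="../@EXPSTB and not(preceding-sibling::ENT)">
          <xsl:attribute name="colspan">
            <xsl:value-of select="../@EXPSTB + 1"/>
          </xsl:attribute>
        </xsl:when>
        <xsl:when test="@A='01' or @A='L01'">
          <xsl:attribute name="colspan">2</xsl:attribute>
        </xsl:when>
      </xsl:choose>
      <xsl:apply-templates/>
    </td>
  </xsl:template>
</xsl:stylesheet>
```

For a row such as <ROW EXPSTB="01"><ENT>a</ENT><ENT>b</ENT></ROW>, this computes a column count of 3 and gives the first cell colspan="2". Testing not(preceding-sibling::ENT) instead of position()=1 avoids depending on how the ENT elements were selected.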
With the above changes, Table 11 in https://www.federalregister.gov/d/2019-21250 is rendering almost correctly. The only issue is that on the HTML page the "Annual hours" and "Wage rate" columns don't have left/right borders between them (i.e. either side of the column with the '×' characters).
@rhdunn - thanks for the detailed feedback.
We uploaded a new version of the fedregister.xsl stylesheet that addresses the issues that you raised on Friday:
CHED elements
FP elements 1
FP elements 2
.SUPLINF-HD3 missing display:block
Looking at and reviewing the updates from yesterday and today:
Thanks for the update. I'll take a look tomorrow.
The update looks good. Thanks.
Images and formulas (GID and MATH elements) are not displayed correctly, despite the images for the FR documents being available on the HTML pages. They can be rendered by using the following:
<xsl:template match="GPH/GID">
  <img class="GPH-GID" src="https://s3.amazonaws.com/images.federalregister.gov/{.}/original.png" height="{concat(../@DEEP, 'px')}"/>
</xsl:template>
<xsl:template match="MATH/MID">
  <img class="MATH-MID" src="https://s3.amazonaws.com/images.federalregister.gov/{.}/original.png" height="{concat(../@DEEP, 'px')}"/>
</xsl:template>
instead of the existing "Please see PDF for image/formula" messages.
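Since the two templates differ only in their class names, they could also be folded into one. The following is a sketch under the same assumptions as above (the image URL pattern, and DEEP being usable as a pixel height); it additionally skips the height attribute when DEEP is absent, which is a choice made here rather than something the source XML is known to require.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="GPH/GID | MATH/MID">
    <!-- The class is built from the parent and element names, giving
         GPH-GID or MATH-MID as in the two separate templates. -->
    <img class="{name(..)}-{name()}"
         src="https://s3.amazonaws.com/images.federalregister.gov/{.}/original.png">
      <!-- Only emit height when the parent carries a DEEP value. -->
      <xsl:if test="../@DEEP">
        <xsl:attribute name="height">
          <xsl:value-of select="concat(../@DEEP, 'px')"/>
        </xsl:attribute>
      </xsl:if>
    </img>
  </xsl:template>
</xsl:stylesheet>
```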
See https://www.federalregister.gov/documents/2017/01/05/2016-30004/energy-conservation-program-test-procedures-for-central-air-conditioners-and-heat-pumps for an example with both images and formulae.
NOTE: This also applies to the CFR documents, but I don't know where those images are located, or if they are available online.
The subscript/superscript elements (e.g. in the Q and E variables in https://www.federalregister.gov/documents/2017/01/05/2016-30004/energy-conservation-program-test-procedures-for-central-air-conditioners-and-heat-pumps#page-1560) are not rendered as such by the stylesheet, but are in the HTML.
Update 1: It looks like the following CSS:
.E-52 {font-size:6pt;vertical-align:sub;}
.APP {margin-top:12pt;margin-bottom:0pt;font-weight:bolder;font-size:12pt;display:block;width:100%;text-align:center;}
.SU, .E-51, .FTREF {font-size:6pt;vertical-align:top;}
.URL {font-style:italic;}
needs to be modified to become:
.E-52, .E-54 {font-size:6pt;vertical-align:sub;}
.APP {margin-top:12pt;margin-bottom:0pt;font-weight:bolder;font-size:12pt;display:block;width:100%;text-align:center;}
.SU, .E-51, .E-53, .FTREF {font-size:6pt;vertical-align:top;}
.URL, .E-53, .E-54 {font-style:italic;}
That is, E-54 looks like an italic version of E-52 (subscript text), and E-53 looks like an italic version of E-51 (superscript text).
Those are all the issues I am aware of, although I haven't done a complete review of the XSLT rendering compared to the PDF and HTML output. As such, I don't expect to add any more rendering issues here.
In the latest update, the class="GPOHEADERS" change in the maintemp1 template (line 764) introduced a bug -- the class attribute is duplicated. The template should be:
<xsl:template name="maintemp1">
  <xsl:for-each select="CHED">
    <th>
      <xsl:attribute name="class">
        <xsl:text>GPOHEADERS </xsl:text>
        <xsl:value-of select="name()"/>
      </xsl:attribute>
      <xsl:apply-templates/>
    </th>
  </xsl:for-each>
</xsl:template>
The other changes don't have that issue.
Thank you for the feedback. We’ll take a look and adjust.
Hi,
I am using the federalregister.xsl stylesheet at https://www.govinfo.gov/bulkdata/FR/resources to render federal register documents (e.g. https://www.federalregister.gov/documents/2019/11/01/2019-23800/changes-to-applicability-thresholds-for-regulatory-capital-and-liquidity-requirements). This has various rendering issues compared to the HTML and PDF documents. I will detail the issues below as I investigate them.
Kind regards, Reece