Closed HelenaSabel closed 3 years ago
Woo-hoo! Two minor concerns, one big problem:
@omit-xml-declaration
to "yes"
? Seems to me we would, in general, want them. But I am not much of an HTML and browser person, so there may be gotchas I don’t know about; and perhaps it is another ticket, eh?For modern HTML, I would say omitting the XML Declaration is correct. The character encoding thing is sloppy, but probably doesn't break anything. But we should fix it, because it's sloppy. As Syd says the PR should be updated to fix the expected results.
1. (Minor) Should we be setting `@omit-xml-declaration` to `"yes"`? Seems to me we would, in general, want them. But I am not much of an HTML and browser person, so there may be gotchas I don’t know about; and perhaps it is another ticket, eh?
To my understanding, the XML declaration is not needed unless the encoding is not UTF-8. I am partial to omitting it, but I need to find more solid arguments than the first one that came to mind (because it was related to having the XML declaration in files with a .php
extension, which is not the point of this). It seems more practical to address the presence or absence of the declaration in this ticket, but if it ends up needing a long discussion it might be better to divide it in two tickets?
2. (Minor) Not that it matters, but is there anything wrong with declaring the character set encoding twice?
So the issue was that that
<xsl:when>
was creating a second<meta>
element with just the encoding attribute: any browser would render it without problems but I think the repetition is invalid.3. (**Major**) As the output HTML files are changed, this breaks not only the tests in Test/ (kind of expected, they are very fragile) but also those in Test2/ (not surprising, because although they are not nearly as fragile, these tests are still comparing serializations, not trees).
For my personal testing, I manually encoded the expected changes (like adding the HTML doctype declaration) but I didn’t know how to delve into a more sustainable testing process...
I'm pretty sure the HTML validator will complain if the character encoding is declared twice. I think Saxon 10 provides one character encoding declaration, so if we're using a recent Saxon, we shouldn't have to add anything extra.
I finally finished with the testing so this PR is almost ready to be reviewed. I say almost because I had some doubts about the merging of the HMTL and HTML5 routines, therefore some work might still be needed.
At this point, there are no HTML5-specific stylesheets. Any reference to html5.xsl
was replaced with html.xsl
and of course html.xsl
and the html
folder were updated accordingly. What I haven’t done, though, was to delete the html5
folders. For instance, the html5
folders exist in different profiles, but the to.xsl
file they contain points to html.xsl
instead of html5.xsl
(see example). I am guessing that the most elegant solution would have been to delete the HTML5 references and have the html5
commands to just be aliases of the html
ones, but I am not sure I can do that on my own.
Regarding the merge in terms of the XSLT transformation itself, I had a doubt about the parameter divOffset
. This parameter being equal to 1
was present in html5.xsl
and thus I added it to html.xsl
(ec6205b). One of the changes you will see now when comparing the new expected results is that all the headings will be consistently one hierarchical step higher. I just wanted to make sure that that was the wanted behavior.
Since there indentation is quite different, the expected results are a bit hard to compare with the previous version. These are the main changes you will find:
<!DOCTYPE html>
h2
> h1
, h3
> h2
and so on as mentioned above due to the addition of the divOffset
parameter<header>
, <section>
, <nav>
...<div>
elements for <tei:front>
, <tei:body>
and <tei:back>
(as it was part of the HTML5 transformation)@lang
attribute. By reading this W3C recommendation, I understand that the attribute @xml:lang
shouldn’t be used.@mode
attribute in <math>
with @display="inline"
And I think that covers it!
@HelenaSabel This is great work. It'll be an easy thing to tweak the symlinks in the /bin folder to point the html5 one to the html one, which will then take over, I think.
The only thing I'm not sure about is dropping the xml:lang attribute. Since we don't know whether end-users will choose to serve their documents as XHTML or not, I see no problem in leaving the xml:lang attribute alongside the lang attribute, as the W3C document says; that's what I tend to do in my projects.
I don’t have an opinion on xml:lang but wanted to second Martin’s accolades. Thanks for all this work!
Janelle Jenstad, Associate Professor, Department of English, University of Victoriahttps://journals.uvic.ca/index.php/scene/ Director, Map of Early Modern Londonhttps://mapoflondon.uvic.ca/; Director, Linked Early Modern Drama Onlinehttps://lemdo.uvic.ca/ Co-Coordinating Editor, Digital Renaissance Editionshttps://digitalrenaissance.uvic.ca/ and the new Internet Shakespeare Editions Co-Applicant, The Endings Projecthttps://endings.uvic.ca/index.html Email: @.**@.>, @.**@.>, or @.**@.>
From: Martin Holmes @.> Sent: August 3, 2021 1:44 PM To: TEIC/Stylesheets @.> Cc: Subscribed @.***> Subject: Re: [TEIC/Stylesheets] update html output declarations to modern XHTML5 #487 (#499)
Notice: This message was sent from outside the University of Victoria email system. Please be cautious with links and sensitive information.
@HelenaSabelhttps://github.com/HelenaSabel This is great work. It'll be an easy thing to tweak the symlinks in the /bin folder to point the html5 one to the html one, which will then take over, I think.
The only thing I'm not sure about is dropping the xml:lang attribute. Since we don't know whether end-users will choose to serve their documents as XHTML or not, I see no problem in leaving the xml:lang attribute alongside the lang attribute, as the W3C document says; that's what I tend to do in my projects.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHubhttps://github.com/TEIC/Stylesheets/pull/499#issuecomment-892152820, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ABNLCOCB2K5JWND3FP5RQVLT3BIJRANCNFSM42OKJT2Q. Triage notifications on the go with GitHub Mobile for iOShttps://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Androidhttps://play.google.com/store/apps/details?id=com.github.android&utm_campaign=notification-email.
@HelenaSabel This is great work. It'll be an easy thing to tweak the symlinks in the /bin folder to point the html5 one to the html one, which will then take over, I think.
Thank you @martindholmes! I thought it could be more complicated than that. I could tackle the XSLT dependencies, but I was afraid of some other type of dependencies related to the other programing languages that work here.
The only thing I'm not sure about is dropping the xml:lang attribute. Since we don't know whether end-users will choose to serve their documents as XHTML or not, I see no problem in leaving the xml:lang attribute alongside the lang attribute, as the W3C document says; that's what I tend to do in my projects.
Thank you for this comment, because I was doubtful myself (in fact, I was adding both at some point a48b53e). I changed my mind at re-reading the recommendation because the presence of both was only mentioned when serving XHTML 1. But you are absolutely right, if the pages are served as XML, @lang
does not hurt but @xml:lang
would be crucial.
I will fix these details tomorrow (first thing in the morning) and put the PR ready for review.
I did the symlink in the bin
folder. And the ll
command shows that the symlink has the proper target, but the stylesheets in the html5
folders inside profiles
are still being fired. I see three possible approaches:
profiles/default/html5
or profiles/tei/html5
. These stylesheets are calling for the html.xsl
one (because html5.xsl
no longer exists) so there is no inconsistency.profiles/default/html5/to.xsl
would be a symlink to profiles/default/html/to.xsl
transformtei
to add some kind of condition for the html5
command.Option 1 is already implemented and I could do option 2. Sadly, I cannot do option 3.
@HelenaSabel I'm surprised that the symlink from teitohtml5 to teitohtml fails, but I think a simpler approach might be just to make the profiles/tei/html5 folder a symlink to profiles/tei/html, so the old html5 files no longer exist. Then we could deprecate the teitohtml5 symlink and the html5 folder, and in a year or two, just remove them. What do others think?
Converting to draft to work on the symlinks
Because we are in refrigeration most, if not all, of us would say this PR is technically ineligible to be merged in for the upcoming release. (It is a code change, not a mere wording change.) For that reason, I am not even going to recommend it be merged now, let alone do so. But if the release techs (other than HSB) agree to it (i.e., if they do not mind taking on the risk), I think they could reasonably merge it before forking the release.
I think it might be better to wait. I will add your suggestions to the PR, @sydb, and I also need to run Test2 (even if I don't find any errors, it is likely that I will need to update the expected results of that test)
Fine with me to wait. And yes, there are Test2/ expected result errors.
I ran the same tests I ran previously, and also successfully built the Guidelines against this set of stylesheets.
My first attempt to address issue #487 (and my first contribution to the Stylesheets!)
In
html/html_param.xsl
I added the new parameters to be consistent with how the stylesheets were calling the previous parameters related to the the old doctype declaration.Besides updating the instances of
<xsl:output>
and<xsl:result-document>
that created HTML documents, I modified this line because otherwise the UTF-8 encoding was being declared twice.