DistributedProofreaders / guiguts

Perl/Tk text editor designed for editing and formatting public domain material for inclusion at Project Gutenberg
GNU General Public License v2.0

Two long-standing AutoGenerate errors: anchor at end of footnote, consecutive closing tags #1073

Closed: charliehoward4dp closed this issue 1 year ago

charliehoward4dp commented 1 year ago

When the attached file is processed by AutoGenerate in GG 1.4.0 or later, two situations are not handled properly, resulting in at least three errors in the generated HTML:

On page 119, Footnote 29 ends with an anchor referencing another footnote. The generated code at and following the end of that anchor is invalid. The error is obvious, easily found, and easily corrected.

On page 268, line 12461 (in the attached file), Footnote 127 ends with a series of several consecutive closing tags. The generated code omits a closing </div>, which was difficult to track down. The error does not occur if blank lines are inserted between the tags prior to running AutoGenerate.

Probably as a consequence of one or both of these, the last few paragraphs in the Transcriber's Notes at the end of the document were not marked as paragraphs.

Note: This file also can be used to demonstrate the reason for a recent Enhancement request #1072 regarding how Footnote Fixup uses color-codes. The file contains 72 Footnotes and 42 Endnotes. The anchors referencing them are intermixed throughout the document, and some of the Endnotes reference footnotes of their own. In this document, footnotes are numbered 1-72 and are placed at the ends of the chapters that reference them. The Endnotes were in a chapter of their own in the original book, and remain so here. The Endnotes and their anchors have been renumbered as 101-142 to distinguish them from the regular Footnotes. The Endnote anchors in the original book were not in numerical sequence, and that's the case in this document. Some of the Footnotes and many of the Endnotes are over a page in length.

genbug.zip

charliehoward4dp commented 1 year ago

As stated in the edited original report above, the "consecutive tags" error at the end of footnote 127 on page 268 does not occur if a blank line is added after each of those tags, beginning with the closing R/. NOTE: This is the only occurrence of three consecutive closing tags in the entire document. There are many cases of two consecutive closing tags, and the Generator processed them successfully.

The misplaced </a> and malformed </div> at the end of footnote 29 on page 119 do not occur if a space character is added between the two adjacent right brackets.

With both of these changes made before running AutoGenerate, the resulting HTML passes the Nu HTML Checker and Tidy.
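For anyone who wants to apply both workarounds before running AutoGenerate, here is a minimal Python sketch. It is not part of Guiguts. It assumes that closing markup such as R/ sits on a line of its own as one character followed by a slash, and that the footnote-ending anchor shows up as two adjacent right brackets; the function name `apply_workarounds` and the sample text are hypothetical. Unlike the manual fix described above, it only adds blank lines between consecutive closing lines, and it touches every such pair in the file, so check the result before relying on it.

```python
import re

# Assumption: a closing markup line is a single non-space character
# followed by "/", alone on its line (for example "R/").
CLOSING_MARKUP = re.compile(r"^\S/$")


def apply_workarounds(text):
    """Apply the two manual workarounds from this report to a text file's contents."""
    # Workaround 1: put a space between adjacent right brackets.
    text = re.sub(r"\](?=\])", "] ", text)

    # Workaround 2: put a blank line between consecutive closing markup lines.
    lines = text.split("\n")
    out = []
    for i, line in enumerate(lines):
        out.append(line)
        nxt = lines[i + 1] if i + 1 < len(lines) else None
        if nxt is not None and CLOSING_MARKUP.match(line) and CLOSING_MARKUP.match(nxt):
            out.append("")
    return "\n".join(out)


# Hypothetical sample, not taken from the attached file.
sample = "[Footnote 29: see note.[30]]\nR/\n#/\n#/\nNext"
result = apply_workarounds(sample)
```

The bracket substitution uses a lookahead so that a run of three or more right brackets is also fully separated.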

HOWEVER, the last several paragraphs in the Transcriber's Notes at the end of the document are not tagged as separate paragraphs: the last <p> occurs just before the paragraph beginning with The available copies... and the matching </p> occurs on a line of its own, just above </body>. This may be unrelated to the other two errors.

windymilla commented 1 year ago

The footnote 127 bug is a code limitation and/or a documentation limitation: you are not permitted to nest block quotes inside block quotes, because the rewrap can fail, and so can HTML generation. The code only knows "I am inside a block quote", so when it sees the inner closing block quote marker, it says "I'm not in a block quote any more" even though it is. It's so rare that it's not worth changing the code to cope with nested levels of block quote, although TBH, it really ought to warn the user "Attempt to start blockquote within blockquote" if the user is rewrapping or running AutoGenerate. Hmmmph
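To illustrate the difference between a single "inside block quote" flag and a nesting depth, here is a small Python sketch. It is not the Guiguts code (which is Perl). It assumes block quotes open with a line starting /# and close with a line #/, and the function name `states` is hypothetical. It tracks both the flag and a depth counter side by side, and collects the kind of warning suggested above.

```python
def states(lines):
    """Track block quote state two ways: a single flag and a depth counter.

    Returns (per_line_states, warnings), where per_line_states[i] is
    (in_quote_flag, depth) after processing line i + 1.
    """
    in_quote = False  # the single-flag view: only "inside" or "not inside"
    depth = 0         # the nesting-aware view
    warnings = []
    per_line = []
    for lineno, line in enumerate(lines, start=1):
        marker = line.strip()
        if marker.startswith("/#"):  # assumed opening marker
            if depth > 0:
                warnings.append(
                    f"line {lineno}: Attempt to start blockquote within blockquote"
                )
            in_quote = True
            depth += 1
        elif marker == "#/":  # assumed closing marker
            in_quote = False  # the flag forgets the outer quote here
            depth = max(depth - 1, 0)
        per_line.append((in_quote, depth))
    return per_line, warnings


sample = ["/#", "outer text", "/#", "inner text", "#/", "still outer", "#/"]
per_line, warnings = states(sample)
flag_at_6, depth_at_6 = per_line[5]
```

On the line "still outer", the flag is already false while the depth is still 1, which is the mismatch described above. The warning fires on line 3, where the inner block quote starts.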