gutenbergtools / ebookmaker

The Project Gutenberg tool to generate EPUBs and other ebook formats.
GNU General Public License v3.0

Online EBM doesn't treat title,author metadata according to a user's expectations #149

Closed G4OEU closed 1 year ago

G4OEU commented 1 year ago

Have just run my current project through the online eBookMaker (0.12.24) and noticed it cannot find in the CSS the title or author to insert into the Project Gutenberg boilerplate text that it wraps the epub in.

What you see is

The Project Gutenberg eBook of UnknownTitle, by UnknownAuthor

This ebook is for the use of anyone anywhere in the United States and most other parts of the world at no cost and with almost no restrictions whatsoever. You may copy it, give it away or re-use it under the terms of the Project Gutenberg License included with this ebook or online at www.gutenberg.org. If you are not located in the United States, you’ll have to check the laws of the country where you are located before using this eBook.

Title: UnknownTitle

Creator: UnknownAuthor

Release Date: January 1, 1 [EBook #10001]

Language: English

START OF THE PROJECT GUTENBERG EBOOK UNKNOWNTITLE

However, the version of eBookMaker bundled with Guiguts (GG) 1.5.1, which is 0.12.23, does find that info and correctly inserts it into the PG boilerplate text, which suggests that my HTML/CSS is not at fault.

I have attached the HTML as a .txt file in order to upload it here, but it obviously needs renaming to have the .html filetype.

the-popish-plot-for-Eric.txt

eshellman commented 1 year ago

Are you expecting that Ebookmaker should parse the title and author from the <head><title>?

Perhaps you are not supplying Online Ebookmaker with title and author (which are then passed to ebookmaker), or perhaps the online ebookmaker script is losing the metadata you supply?

G4OEU commented 1 year ago

Yes. I assume that is what 0.12.23 is doing when invoked from within Guiguts because it is correctly displaying that data in the files generated by EBM.

I have never supplied the data in the online dialogue when invoking EBM.

I should add that this behaviour is not something I've noticed before so am unsure whether it has been happening with previous versions of EBM accessed via the online script.

eshellman commented 1 year ago

Looking at the code, I'm guessing that parsing title and author from the <head> only works for XHTML documents. Shouldn't be hard to add; I doubt it ever worked in v0.12 on non-XML HTML documents.
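
To illustrate what reading the title from <head><title> could look like for HTML that is not well-formed XML, here is a minimal sketch using Python's standard-library html.parser. It is not ebookmaker's actual code; the names TitleParser and extract_title are hypothetical and exist only for this example.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names and tolerates unclosed tags.
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self._chunks.append(data)

    @property
    def title(self):
        # Collapse runs of whitespace, including line breaks inside the title.
        text = " ".join("".join(self._chunks).split())
        return text or None


def extract_title(html_text):
    """Return the document title, or None if there is no <title>."""
    parser = TitleParser()
    parser.feed(html_text)
    parser.close()
    return parser.title


if __name__ == "__main__":
    sample = (
        '<html><head><meta charset="utf-8">'
        "<title>The Popish Plot, by John Pollock</title></head>"
        "<body><p>Text<br>more</body></html>"
    )
    print(extract_title(sample))
```

The sample document has an unclosed <meta>, <br> and <p>, so an XML parser would reject it, while the tolerant HTML parser still reaches the title. Splitting the result into separate title and author fields would need further rules that this sketch does not attempt.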

eshellman commented 1 year ago

When I run ebookmaker on the file on the command line, I get:

Title: The Popish Plot, by John Pollock—A Project Gutenberg eBook

Creator: NA

Release Date: January 1, 0001 [EBook #3]

Language: English

I'll check to see if anything has been changed, but it seems like I can't reproduce what you see.

eshellman commented 1 year ago

Perhaps @gbnewby knows what online ebookmaker is doing

eshellman commented 1 year ago

I see that UnknownTitle and UnknownAuthor are the default values for the HTML form shown on the Online Ebookmaker, so if you enter nothing on that page, the title and author passed to ebookmaker will have those values. (This repo's www/index.php is a 3-year-old copy; the code is not maintained here.)

What it said 3 years ago was:

Ebookmaker will try to identify author, title, encoding and eBook number from your file, IF it includes the standard Project Gutenberg metadata as found in the published collection. Otherwise, you can provide values.

The file you have provided does not include "the standard Project Gutenberg Metadata". So as far as I can tell there is nothing to fix here.
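
The behaviour described above can be sketched roughly as follows. This is an illustration in Python, not the real www/index.php; the function name metadata_passed_to_ebookmaker and the field keys are assumptions made for the example, and only the two default strings come from this thread.

```python
# Default values shown in the Online Ebookmaker form, as reported in this thread.
FORM_DEFAULTS = {"title": "UnknownTitle", "author": "UnknownAuthor"}


def metadata_passed_to_ebookmaker(submitted):
    """Return the title/author that would be handed on (hypothetical helper).

    Any field the user left blank keeps the form's default value.
    """
    values = dict(FORM_DEFAULTS)
    for key in FORM_DEFAULTS:
        entered = (submitted.get(key) or "").strip()
        if entered:
            values[key] = entered
    return values


if __name__ == "__main__":
    # Nothing entered: the defaults go through unchanged.
    print(metadata_passed_to_ebookmaker({}))
    # Title entered, author left blank.
    print(metadata_passed_to_ebookmaker({"title": "The Popish Plot"}))
```

Under this reading, the UnknownTitle and UnknownAuthor strings in the generated boilerplate are the form defaults passed straight through, not the result of a failed parse.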

G4OEU commented 1 year ago

OK, noted. However I am unclear what "the standard Project Gutenberg metadata as found in the published collection" means.

If that means the title & author metadata that I check when I upload a project to PG then fine.

eshellman commented 1 year ago

It's just the section before the start of the text that looks like

Title: Love's Labor Lost

Author: William Shakespeare
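
As a rough illustration of how such a header section could be detected, the sketch below pulls the Title and Author lines out of a text with a regular expression. It is an assumption-laden example, not ebookmaker's parser; parse_pg_header is a hypothetical name, and real Project Gutenberg headers carry more fields than the two shown here.

```python
import re

# Match "Title: ..." or "Author: ..." at the start of a line.
# [ \t] is used instead of \s so the match cannot run onto the next line.
FIELD_RE = re.compile(r"^(Title|Author):[ \t]*(.+?)[ \t\r]*$", re.MULTILINE)


def parse_pg_header(text):
    """Return a dict with whichever of 'title' and 'author' are present."""
    found = {}
    for name, value in FIELD_RE.findall(text):
        # Keep the first occurrence of each field.
        found.setdefault(name.lower(), value)
    return found


if __name__ == "__main__":
    sample = "Title: Love's Labor Lost\n\nAuthor: William Shakespeare\n\nText begins here.\n"
    print(parse_pg_header(sample))
```

A file with no such lines yields an empty dict, which corresponds to the case where the values have to be supplied by hand.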

gbnewby commented 1 year ago

This was more of a discussion topic. Drop me a note ("Contact us" on www.gutenberg.org or the upload.pglaf.org page) if further clarification or discussion is needed.