standardebooks / tools

The Standard Ebooks toolset for producing our ebook files.

se build crashes on regex invalid group reference #455

Closed bwhittakerb closed 3 years ago

bwhittakerb commented 3 years ago

When running se build with the --kindle option, the tool exits with a regex error. The advanced, regular, and Kobo (with the --kobo option) files are still generated.

input is: se build --output-dir='/Users/brendan/Documents/epubs/TRC/dist' --kindle .

output is as follows:

  File "/Users/brendan/.local/bin/se", line 8, in <module>
    sys.exit(main())
  File "/Users/brendan/.local/pipx/venvs/standardebooks/lib/python3.9/site-packages/se/main.py", line 81, in main
    sys.exit(getattr(module, command_function)(args.plain_output))
  File "/Users/brendan/.local/pipx/venvs/standardebooks/lib/python3.9/site-packages/se/commands/build.py", line 56, in build
    se_epub.build(args.check, args.check_only, args.build_kobo, args.build_kindle, Path(args.output_dir), args.proof)
  File "/Users/brendan/.local/pipx/venvs/standardebooks/lib/python3.9/site-packages/se/se_epub.py", line 1136, in build
    build(self, run_epubcheck, check_only, build_kobo, build_kindle, output_directory, proof)
  File "/Users/brendan/.local/pipx/venvs/standardebooks/lib/python3.9/site-packages/se/se_epub_build.py", line 1155, in build
    xhtml = se.typography.hyphenate(xhtml, None, True)
  File "/Users/brendan/.local/pipx/venvs/standardebooks/lib/python3.9/site-packages/se/typography.py", line 385, in hyphenate
    output_xhtml = regex.sub(r"(<body[^>]*?>).+?<\/body>", fr"\1{result}</body>", output_xhtml, flags=regex.DOTALL)
  File "/Users/brendan/.local/pipx/venvs/standardebooks/lib/python3.9/site-packages/regex/regex.py", line 278, in sub
    return pat.sub(repl, string, count, pos, endpos, concurrent, timeout)
regex._regex_core.error: invalid group reference

acabal commented 3 years ago

Does each of your XHTML files have a <body> element?

bwhittakerb commented 3 years ago

> Does each of your XHTML files have a <body> element?

[sorry I thought I sent this a couple days ago]

Yes, I checked, and all of the XHTML files in the text subfolder have a <body> element (and a closing tag).

acabal commented 3 years ago

Can you upload a zip file of your ebook?

bwhittakerb commented 3 years ago

Of course. I've attached it. (It also has a GitHub page.)

It's not a book intended for Standard Ebooks, but the tooling and guidance from SE are top notch, so they are the basis for how I'm building this project (and someday soon I intend to contribute to SE). Hopefully, if this isn't user error, it's a bug that can be fixed to improve the tool for SE volunteers going forward.

ebookzip.zip

acabal commented 3 years ago

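OK, this is because the string November \1948 in your endnotes is confusing the regex: the hyphenated text is interpolated into the replacement string, so the backslash followed by digits is parsed as a group reference. This has been fixed in c4d1303.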
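
For readers hitting the same error, here is a minimal sketch of the failure mode and one common way around it. It is not necessarily what c4d1303 does. It uses the standard library re module instead of the third-party regex module from the traceback, on the assumption that both treat a backslash plus digits in a replacement template as a group reference. The function names and sample strings are hypothetical.

```python
import re

# Same shape as the pattern in the traceback: one capture group for <body ...>.
PATTERN = re.compile(r"(<body[^>]*?>).+?</body>", flags=re.DOTALL)


def replace_body_unsafe(xhtml: str, new_body: str) -> str:
    # new_body is interpolated into the replacement template, so any
    # backslash sequence inside it is interpreted by the regex engine.
    return PATTERN.sub(rf"\1{new_body}</body>", xhtml)


def replace_body_safe(xhtml: str, new_body: str) -> str:
    # A callable replacement is inserted verbatim; no template parsing occurs.
    return PATTERN.sub(lambda m: m.group(1) + new_body + "</body>", xhtml)


page = '<html><body class="endnotes"><p>old</p></body></html>'
note = r"<p>November \1948</p>"  # stray backslash, e.g. from a PDF paste

try:
    replace_body_unsafe(page, note)
    unsafe_failed = False
except re.error as exc:
    # The template contains "\19...", but the pattern has only one group.
    unsafe_failed = True
    print(f"unsafe version failed: {exc}")

fixed = replace_body_safe(page, note)
print(fixed)
```

Escaping backslashes in the inserted text before building the template would also work; the callable form simply avoids template parsing altogether.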

bwhittakerb commented 3 years ago

That's fantastic! (and the string itself is a formatting oddity from a PDF cut and paste that I must have missed). Thank you so much for investigating and fixing this bug :)

bwhittakerb commented 3 years ago

(I left a donation as thanks)

acabal commented 3 years ago

Awesome, thanks! :)