getnikola / nikola

A static website and blog generator
https://getnikola.com/
MIT License

"lxml.etree.ParserError: Document is empty" with demo content #3679

Open IliaOzhmegov opened 1 year ago

IliaOzhmegov commented 1 year ago

Environment

Python Version: 3.10 and 3.11

Nikola Version: 8.2.3

Operating System: macOS Ventura (13.2.1)

Description:

> nikola build
Scanning posts........done!
.  render_taxonomies:output/archive.html
.  render_taxonomies:output/authors/index.html
.  render_taxonomies:output/categories/index.html
.  render_taxonomies:output/ru/archive.html
.  render_taxonomies:output/ru/authors/index.html
.  render_taxonomies:output/ru/categories/index.html
.  copy_assets:output/assets/css/theme.css
.  copy_assets:output/assets/css/nikola_rst.css
.  copy_assets:output/assets/css/nikola_ipython.css
.  copy_assets:output/assets/css/html4css1.css
.  copy_assets:output/assets/css/rst.css
.  copy_assets:output/assets/css/ipython.min.css
.  copy_assets:output/assets/css/rst_base.css
.  copy_assets:output/assets/css/baguetteBox.min.css
.  copy_assets:output/assets/js/html5.js
.  copy_assets:output/assets/js/fancydates.js
.  copy_assets:output/assets/js/gallery.min.js
.  copy_assets:output/assets/js/fancydates.min.js
.  copy_assets:output/assets/js/gallery.js
.  copy_assets:output/assets/js/baguetteBox.min.js
.  copy_assets:output/assets/js/html5shiv-printshiv.min.js
.  copy_assets:output/assets/js/justified-layout.min.js
.  copy_assets:output/assets/js/luxon.min.js
.  copy_assets:output/assets/xml/atom.xsl
.  copy_assets:output/assets/xml/rss.xsl
.  copy_assets:output/assets/css/code.css
.  render_listings:output/listings/index.html
TaskError - taskid:render_listings:output/listings/index.html
PythonAction Error
Traceback (most recent call last):
  File "/Users/iliaozhmegov/Projects/2.Blog/.venv/lib/python3.10/site-packages/doit/action.py", line 461, in execute
    returned_value = self.py_callable(*self.args, **kwargs)
  File "/Users/iliaozhmegov/Projects/2.Blog/.venv/lib/python3.10/site-packages/nikola/plugins/task/listings.py", line 178, in render_listing
    self.site.render_template('listing.tmpl', out_name, context)
  File "/Users/iliaozhmegov/Projects/2.Blog/.venv/lib/python3.10/site-packages/nikola/nikola.py", line 1509, in render_template
    doc = lxml.html.document_fromstring(data.strip(), parser)
  File "/Users/iliaozhmegov/Projects/2.Blog/.venv/lib/python3.10/site-packages/lxml/html/__init__.py", line 761, in document_fromstring
    raise etree.ParserError(
lxml.etree.ParserError: Document is empty

########################################
render_listings:output/listings/index.html <stdout>:

The same issue as the following:

  1. https://github.com/getnikola/nikola/issues/3663
  2. https://github.com/getnikola/nikola/issues/3507
  3. https://github.com/getnikola/nikola/issues/2851

There seems to be something wrong with emoji in that theme, so I "fixed" it by replacing bootblog4 with base in conf.py. On top of that, in base/templates/gallery.tmpl I replaced 📂 with &#x1f4c2; (as suggested in #3663).

Kwpolska commented 1 year ago

If you use the current version of listing.tmpl with bootstrap4/bootblog4, does this issue still occur?

IliaOzhmegov commented 1 year ago

Yes, it does. To make it work, I had to remove &#x1f4c2; as well as all the other emoji in the rest of the themes 🤔

Kwpolska commented 1 year ago

If you had to remove the escape as well, this is an lxml bug that should probably be reported there rather than to Nikola. It seems to be platform-specific. Could you try reproducing it with some simpler lxml code and reporting the bug there?
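
A minimal sketch of what such a standalone check might look like, independent of Nikola. The sample documents, the names `SAMPLES`, `lxml_parse` and `check`, and the parser option `remove_comments=True` are assumptions made for illustration; only the `lxml.html.document_fromstring(data.strip(), parser)` call is taken from the traceback above. It has not been run on the affected macOS setup, so which samples fail there is unknown.

```python
"""Try to reproduce "Document is empty" with lxml alone (illustrative sketch)."""

# Hypothetical samples: plain ASCII, a literal emoji, and the escaped form.
SAMPLES = {
    "ascii": "<html><body><p>folder</p></body></html>",
    "literal emoji": "<html><body><p>\U0001F4C2</p></body></html>",
    "escaped emoji": "<html><body><p>&#x1f4c2;</p></body></html>",
}


def lxml_parse(markup):
    """Parse the markup with the same call that appears in the traceback."""
    import lxml.html  # imported lazily so the samples can be inspected without lxml

    # Assumption: the parser options Nikola passes may differ from this.
    parser = lxml.html.HTMLParser(remove_comments=True)
    return lxml.html.document_fromstring(markup.strip(), parser)


def check(parse=lxml_parse, samples=SAMPLES):
    """Run `parse` on every sample and record "ok" or the exception raised."""
    results = {}
    for name, markup in samples.items():
        try:
            parse(markup)
            results[name] = "ok"
        except Exception as exc:  # record anything, not only ParserError
            results[name] = f"{type(exc).__name__}: {exc}"
    return results
```

Calling `check()` on the affected machine and pasting the resulting dictionary, together with `lxml.etree.LXML_VERSION` and `lxml.etree.LIBXML_VERSION`, would probably give the lxml maintainers enough to start from.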

davidak commented 1 year ago

I have a similar issue with Nikola 8.2.4. It works with Nikola v8.0.2.

https://codeberg.org/davidak/webseite/issues/95

rixx commented 1 year ago

Same issue here. Patched in a fallback value in my local code copy, because opening the blog for the first time in ages and seeing it broken didn't quite motivate me to build a real fix:

                document = lxml.html.fromstring(teaser or "<div></div>")

(in nikola/post.py, L936 or thereabouts, in the teaser_only part of Post.text().)
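
The same fallback idea can be sketched as a small helper. The names `html_or_placeholder` and `parse_teaser` are hypothetical and not part of Nikola. Unlike the one-liner above, this version also substitutes the placeholder for whitespace-only input, on the assumption that lxml treats such input as an empty document as well. Like the one-liner, it only works around an empty teaser and does not address the underlying parsing problem.

```python
def html_or_placeholder(fragment, placeholder="<div></div>"):
    """Return `fragment`, or `placeholder` if it is None, empty, or only whitespace."""
    if fragment is None or not fragment.strip():
        return placeholder
    return fragment


def parse_teaser(teaser):
    """Parse a teaser without handing lxml an empty string (hypothetical helper)."""
    import lxml.html  # imported lazily; only needed when actually parsing

    return lxml.html.fromstring(html_or_placeholder(teaser))
```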