Kozea / WeasyPrint

The awesome document factory
https://weasyprint.org
BSD 3-Clause "New" or "Revised" License

HTML Generation generates an error with assert next_skip_stack is None #942

Closed sebastienlevert closed 5 years ago

sebastienlevert commented 5 years ago

I'm using WeasyPrint through multiple MkDocs plugins (https://github.com/zhaoterryy/mkdocs-pdf-export-plugin and https://github.com/comwes/mkpdfs-mkdocs-plugin) and both are failing on some HTML. The document I generate is huge, so it's hard to debug what's wrong with my HTML.

I'm getting the following error:

Traceback (most recent call last):
  File "C:\Python37\Scripts\mkdocs-script.py", line 11, in <module>
    load_entry_point('mkdocs', 'console_scripts', 'mkdocs')()
  File "C:\Python37\lib\site-packages\click\core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "C:\Python37\lib\site-packages\click\core.py", line 717, in main
    rv = self.invoke(ctx)
  File "C:\Python37\lib\site-packages\click\core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "C:\Python37\lib\site-packages\click\core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Python37\lib\site-packages\click\core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "C:\Python37\lib\site-packages\mkdocs\__main__.py", line 163, in build_command
    ), dirty=not clean)
  File "C:\Python37\lib\site-packages\mkdocs\commands\build.py", line 298, in build
    config['plugins'].run_event('post_build', config)
  File "C:\Python37\lib\site-packages\mkdocs\plugins.py", line 94, in run_event
    result = method(item, **kwargs)
  File "c:\users\sleve\source\valo\mkpdfs-mkdocs-plugin\mkpdfs_mkdocs\mkpdfs.py", line 94, in on_post_build
    self.generator.write()
  File "c:\users\sleve\source\valo\mkpdfs-mkdocs-plugin\mkpdfs_mkdocs\generator.py", line 63, in write
    html.write_pdf(pdf_path, font_config=font_config)
  File "C:\Python37\lib\site-packages\weasyprint\__init__.py", line 211, in write_pdf
    font_config=font_config).write_pdf(
  File "C:\Python37\lib\site-packages\weasyprint\__init__.py", line 168, in render
    font_config)
  File "C:\Python37\lib\site-packages\weasyprint\document.py", line 377, in _render
    [Page(page_box, enable_hinting) for page_box in page_boxes],
  File "C:\Python37\lib\site-packages\weasyprint\document.py", line 377, in <listcomp>
    [Page(page_box, enable_hinting) for page_box in page_boxes],
  File "C:\Python37\lib\site-packages\weasyprint\layout\__init__.py", line 130, in layout_document
    pages = list(make_all_pages(context, root_box, html, pages, style_for))
  File "C:\Python37\lib\site-packages\weasyprint\layout\pages.py", line 798, in make_all_pages
    i, context, root_box, html, style_for)
  File "C:\Python37\lib\site-packages\weasyprint\layout\pages.py", line 736, in remake_page
    page_number, page_state)
  File "C:\Python37\lib\site-packages\weasyprint\layout\pages.py", line 554, in make_page
    positioned_boxes, positioned_boxes, adjoining_margins)
  File "C:\Python37\lib\site-packages\weasyprint\layout\blocks.py", line 65, in block_level_layout
    adjoining_margins)
  File "C:\Python37\lib\site-packages\weasyprint\layout\blocks.py", line 81, in block_level_layout_switch
    adjoining_margins)
  File "C:\Python37\lib\site-packages\weasyprint\layout\blocks.py", line 135, in block_box_layout
    page_is_empty, absolute_boxes, fixed_boxes, adjoining_margins)
  File "C:\Python37\lib\site-packages\weasyprint\layout\blocks.py", line 519, in block_container_layout
    adjoining_margins)
  File "C:\Python37\lib\site-packages\weasyprint\layout\blocks.py", line 65, in block_level_layout
    adjoining_margins)
  File "C:\Python37\lib\site-packages\weasyprint\layout\blocks.py", line 81, in block_level_layout_switch
    adjoining_margins)
  File "C:\Python37\lib\site-packages\weasyprint\layout\blocks.py", line 135, in block_box_layout
    page_is_empty, absolute_boxes, fixed_boxes, adjoining_margins)
  File "C:\Python37\lib\site-packages\weasyprint\layout\blocks.py", line 519, in block_container_layout
    adjoining_margins)
  File "C:\Python37\lib\site-packages\weasyprint\layout\blocks.py", line 65, in block_level_layout
    adjoining_margins)
  File "C:\Python37\lib\site-packages\weasyprint\layout\blocks.py", line 81, in block_level_layout_switch
    adjoining_margins)
  File "C:\Python37\lib\site-packages\weasyprint\layout\blocks.py", line 135, in block_box_layout
    page_is_empty, absolute_boxes, fixed_boxes, adjoining_margins)
  File "C:\Python37\lib\site-packages\weasyprint\layout\blocks.py", line 519, in block_container_layout
    adjoining_margins)
  File "C:\Python37\lib\site-packages\weasyprint\layout\blocks.py", line 65, in block_level_layout
    adjoining_margins)
  File "C:\Python37\lib\site-packages\weasyprint\layout\blocks.py", line 96, in block_level_layout_switch
    device_size, page_is_empty, absolute_boxes, fixed_boxes)
  File "C:\Python37\lib\site-packages\weasyprint\layout\flex.py", line 465, in flex_layout
    fixed_boxes, adjoining_margins=[]))
  File "C:\Python37\lib\site-packages\weasyprint\layout\blocks.py", line 81, in block_level_layout_switch
    adjoining_margins)
  File "C:\Python37\lib\site-packages\weasyprint\layout\blocks.py", line 135, in block_box_layout
    page_is_empty, absolute_boxes, fixed_boxes, adjoining_margins)
  File "C:\Python37\lib\site-packages\weasyprint\layout\blocks.py", line 382, in block_container_layout
    for line, resume_at in lines_iterator:
  File "C:\Python37\lib\site-packages\weasyprint\layout\inlines.py", line 56, in iter_line_boxes
    device_size, absolute_boxes, fixed_boxes, first_letter_style)
  File "C:\Python37\lib\site-packages\weasyprint\layout\inlines.py", line 73, in get_next_linebox
    skip_stack = skip_first_whitespace(linebox, skip_stack)
  File "C:\Python37\lib\site-packages\weasyprint\layout\inlines.py", line 206, in skip_first_whitespace
    result = skip_first_whitespace(box.children[index], next_skip_stack)
  File "C:\Python37\lib\site-packages\weasyprint\layout\inlines.py", line 192, in skip_first_whitespace
    assert next_skip_stack is None
AssertionError

Questions

  1. Is there any way to debug what is going wrong?
  2. Is there any other configuration that would prevent WeasyPrint from failing on this HTML?

Thanks!

liZe commented 5 years ago

Hello!

Thank you for reporting this issue. Could you please provide an HTML sample that raises this error?

Tontyna commented 5 years ago

@sebastienlevert usually such an error is triggered by a very special combination of text, font-size, element sizes and nested elements.

When I attempt to figure out what happens, the first step is creating a minimal HTML/CSS-snippet that reproduces the error.

Then I run a script like the following in Pyzo, my Python IDE:

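from weasyprint import HTML, LOGGER

# Within Pyzo there is no need to add a StreamHandler to the LOGGER;
# elsewhere, attach one (e.g. logging.StreamHandler()) to see the messages.
LOGGER.setLevel('INFO')

# Replace the placeholder with the path to the HTML file that triggers the error.
myhtml = HTML(filename='path/to/document.html')
document = myhtml.render()  # layout runs here, so this is where the crash occurs
document.write_pdf(target='test.pdf')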

If setting breakpoints at relevant places in the weasyprint modules and inspecting the Workspace doesn't help, I insert print or LOGGER.info() calls in (a copy of) the WeasyPrint sources.
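For anyone without an IDE debugger at hand, here is a minimal sketch of the same "inspect the failing frame" idea using only the standard library. The helper names failing_frame_locals and render_or_report are hypothetical and not part of WeasyPrint; the only WeasyPrint calls assumed are HTML(filename=...) and write_pdf(), as in the script above.

```python
def failing_frame_locals(exc):
    """Return (function_name, {local_name: repr}) for the frame that raised exc."""
    tb = exc.__traceback__
    while tb.tb_next is not None:  # walk down to the innermost frame
        tb = tb.tb_next
    frame = tb.tb_frame
    # Truncate each repr so a large object does not flood the console.
    local_vars = {name: repr(value)[:200] for name, value in frame.f_locals.items()}
    return frame.f_code.co_name, local_vars


def render_or_report(html_path, pdf_path):
    """Render html_path to pdf_path; on AssertionError, dump the failing frame's locals."""
    # Imported lazily so the helper above can be used without WeasyPrint installed.
    from weasyprint import HTML
    try:
        HTML(filename=html_path).write_pdf(pdf_path)
    except AssertionError as exc:
        name, local_vars = failing_frame_locals(exc)
        print(f"AssertionError in {name}()")
        for key, value in local_vars.items():
            print(f"  {key} = {value}")
        raise
```

Calling render_or_report('path/to/document.html', 'test.pdf') with your own paths would print the local variables of the function where the assertion failed, which may be enough to recognise the element being laid out.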

sebastienlevert commented 5 years ago

That is one of the issues. It is quite hard to find where the problem is: this is a 600+ page file, so it's hard to pinpoint the cause or build a repro. I'll see what I can do and let you know!

Tontyna commented 5 years ago

Render the document with the --verbose option to see how many pages are rendered before the crash. That might give you a hint about where to cut the document.
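When WeasyPrint is driven from Python (for example through a MkDocs plugin) rather than from the command line, a similar page count can be captured with a logging handler. This is only a sketch under two assumptions that should be checked against the installed version: that progress messages go to a logger named 'weasyprint.progress', and that the layout messages contain the text "Page N". LastPageHandler and watch_progress are hypothetical names.

```python
import logging
import re


class LastPageHandler(logging.Handler):
    """Remember the most recent page number seen in progress messages."""

    # Assumed message format: some text containing "Page <number>".
    PAGE_RE = re.compile(r"\bPage (\d+)")

    def __init__(self):
        super().__init__(level=logging.INFO)
        self.last_page = None

    def emit(self, record):
        match = self.PAGE_RE.search(record.getMessage())
        if match:
            self.last_page = int(match.group(1))


def watch_progress(logger_name="weasyprint.progress"):
    """Attach a LastPageHandler to the (assumed) progress logger and return it."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.INFO)
    handler = LastPageHandler()
    logger.addHandler(handler)
    return handler
```

Call watch_progress() before rendering, wrap the render call in try/except, and print handler.last_page in the except branch to see roughly where layout stopped.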

sebastienlevert commented 5 years ago

With the logging enabled, I'm getting the following log:

@import rule " "https://fonts.googleapis.com/icon?family=Material+Icons"" not at the beginning of the the whole rule was ignored at 10:1.
Ignored `box-shadow: 0 2px 2px 0 rgba(0, 0, 0, 0.14), 0 1px 5px 0 rgba(0, 0, 0, 0.12),
    0 3px 1px -2px rgba(0, 0, 0, 0.2)` at 28:3, unknown property.
Expected a media type, got only/**/screen/**/and/**/(max-width: 44.9375em)
Invalid media type " only screen and (max-width: 44.9375em) " the whole @media rule was ignored at 615:1.
Invalid or unsupported selector 'article .codehilite pre::-webkit-scrollbar-thumb:hover,
article .highlight pre::-webkit-scrollbar-thumb:hover,
article .codehilite code::-webkit-scrollbar-thumb:hover,
article .highlight code::-webkit-scrollbar-thumb:hover ', (<LiteralToken :>, 'unpexpected literal token.')
Ignored `user-select: none` at 709:3, unknown property.
Expected a media type, got only/**/screen/**/and/**/(max-width: 44.9375em)
Invalid media type " only screen and (max-width: 44.9375em) " the whole @media rule was ignored at 724:1.
Ignored `box-shadow: none` at 739:3, unknown property.
Expected a media type, got only/**/screen/**/and/**/(max-width: 44.9375em)
Invalid media type " only screen and (max-width: 44.9375em) " the whole @media rule was ignored at 741:1.
Ignored `pointer-events: none` at 766:3, unknown property.
Ignored `pointer-events: none` at 777:3, unknown property.
Ignored `pointer-events: initial` at 806:3, unknown property.
Expected a media type, got only/**/screen/**/and/**/(max-width: 44.9375em)
Invalid media type " only screen and (max-width: 44.9375em) " the whole @media rule was ignored at 842:1.
Ignored `box-shadow: 0.25em 0 0 #fdd, -0.25em 0 0 #fdd` at 861:3, unknown property.
Ignored `box-shadow: 0.25em 0 0 #dfd, -0.25em 0 0 #dfd` at 865:3, unknown property.
Ignored `box-shadow: 0.25em 0 0 rgba(236, 236, 236, 0.5),
    -0.25em 0 0 rgba(236, 236, 236, 0.5)` at 870:3, unknown property.
Ignored `box-shadow: none` at 884:3, unknown property.
Expected a media type, got only/**/screen/**/and/**/(max-width: 44.9375em)
Invalid media type " only screen and (max-width: 44.9375em) " the whole @media rule was ignored at 983:1.
Expected a media type, got only/**/screen/**/and/**/(max-width: 76.1875em)
Invalid media type " only screen and (max-width: 76.1875em) " the whole @media rule was ignored at 1342:3.
Ignored `overflow-x: hidden` at 1394:5, unknown property.
Expected a media type, got print/**/and/**/(width: 21cm)and/**/(height: 29.7cm)
Invalid media type " print and (width: 21cm) and (height: 29.7cm) " the whole @media rule was ignored at 1405:3.
Expected a media type, got print/**/and/**/(width: 8.5in)and/**/(height: 11in)
Invalid media type " print and (width: 8.5in) and (height: 11in) " the whole @media rule was ignored at 1412:4.
Ignored `box-shadow: none` at 1482:7, unknown property.
Ignored `width: calc(25% - (34px / 3.04))` at 1570:5, invalid value.
Ignored `box-shadow: 0 8px 4px -4px #bbb` at 1604:5, unknown property.
Ignored `width: calc(33% - (34px / 2.9))` at 1609:5, invalid value.
Ignored `max-width: calc(100%-30px)` at 1611:5, invalid value.
Ignored `box-shadow: 0 8px 16px 0 rgba(0,0,0,0.2)` at 1615:5, unknown property.
Ignored `text-shadow: 2px 2px 4px #000` at 1637:5, unknown property.
Ignored `border: 1px solid var(--custom-yellow)` at 1668:5, invalid value.
Invalid or unsupported selector '.md-search__input::placeholder ', unknown pseudo-element: placeholder
Expected a media type, got screen/**/and/**/(min-width: 730px)and/**/(max-width: 76.1875em)
Invalid media type " screen and (min-width: 730px) and (max-width: 76.1875em) " the whole @media rule was ignored at 1716:3.
Expected a media type, got screen/**/and/**/(min-width: 0)and/**/(max-width: 730px)
Invalid media type " screen and (min-width: 0) and (max-width: 730px) " the whole @media rule was ignored at 1743:3.
Expected a media type, got (max-width: 76.1875em)
Invalid media type " (max-width: 76.1875em) " the whole @media rule was ignored at 1767:3.
Expected a media type, got (min-width: 76.1875em)
Invalid media type " (min-width: 76.1875em) " the whole @media rule was ignored at 1844:3.
Ignored `border-bottom: 3px solid var(--custom-yellow)` at 1995:5, invalid value.
Ignored `background: var(--custom-yellow)` at 2038:5, invalid value.
Expected a media type, got screen/**/and/**/(max-width: 76.1875em)
Invalid media type " screen and (max-width: 76.1875em) " the whole @media rule was ignored at 2073:3.
Ignored `text-align: -webkit-match-parent` at 2139:5, invalid value.
Expected a media type, got screen/**/and/**/(max-width: 76.1875em)
Invalid media type " screen and (max-width: 76.1875em) " the whole @media rule was ignored at 2159:3.
Expected a media type, got screen/**/and/**/(min-width: 76.25em)
Invalid media type " screen and (min-width: 76.25em) " the whole @media rule was ignored at 2169:3.
Expected a media type, got screen/**/and/**/(max-width: 76.1875em)
Invalid media type " screen and (max-width: 76.1875em) " the whole @media rule was ignored at 2176:3.
Expected a media type, got screen/**/and/**/(min-width: 76.1875em)and/**/(max-width: 80em)
Invalid media type " screen and (min-width: 76.1875em) and (max-width: 80em) " the whole @media rule was ignored at 2182:3.
Ignored `appearance: none` at 2198:5, unknown property.
Ignored `outline: 1px solid var(--custom-yellow)` at 2210:5, invalid value.
Expected a media type, got screen/**/and/**/(max-width: 76.1875em)
Invalid media type " screen and (max-width: 76.1875em) " the whole @media rule was ignored at 2213:3.
Ignored `border-left: 8px solid var(--custom-yellow)` at 2236:5, invalid value.
Expected a media type, got screen/**/and/**/(max-width: 76.1875em)
Invalid media type " screen and (max-width: 76.1875em) " the whole @media rule was ignored at 2253:3.
Ignored `word-break: normal` at 2260:5, unknown property.
Ignored `outline: 1px solid var(--custom-white)` at 2283:5, invalid value.
Ignored `overflow-y: hidden` at 2353:5, unknown property.
Anchor defined twice: doc-title
Anchor defined twice: mkpdf-intranet/home/versions/1
Anchor defined twice: mkpdf-intranet/home/versions/1
Anchor defined twice: mkpdf-intranet/home/versions/1
Anchor defined twice: mkpdf-intranet/home/versions/1
Anchor defined twice: mkpdf-intranet/update/updates/1
Anchor defined twice: mkpdf-intranet/update/updates/1
Anchor defined twice: mkpdf-intranet/update/updates/1
Anchor defined twice: mkpdf-teamwork/home/version-history/1
Anchor defined twice: mkpdf-teamwork/home/version-history/1
Anchor defined twice: mkpdf-teamwork/home/version-history/1
Anchor defined twice: mkpdf-teamwork/home/version-history/1
Anchor defined twice: mkpdf-teamwork/update/updates/1
Anchor defined twice: mkpdf-teamwork/update/updates/1
Anchor defined twice: mkpdf-teamwork/update/updates/1
Anchor defined twice: mkpdf-teamwork/update/updates/1
Relative URI reference without a base URI: <img src="/assets/prerequisites/no-languages-selected.png">
Relative URI reference without a base URI: <img src="/assets/user-profile-manager.png">
Relative URI reference without a base URI: <img src="/assets/add-user-profile-property-valo-rootintranet.png">
Relative URI reference without a base URI: <img src="/assets/user-profile-allow-edit.png">
Relative URI reference without a base URI: <img src="/assets/Azure-groups.png">
Relative URI reference without a base URI: <img src="/assets/running-installation/credential-manager-open.png">
Relative URI reference without a base URI: <img src="/assets/running-installation/credential-manager-menu.png">
Relative URI reference without a base URI: <img src="/assets/running-installation/credential-manager-add-generic.png">
Relative URI reference without a base URI: <img src="/assets/running-installation/credential-manager-add-office-365.png">
Relative URI reference without a base URI: <img src="/assets/running-installation/credential-manager-add-generic.png">
Relative URI reference without a base URI: <img src="/assets/running-installation/credential-manager-add-azure.png">
Relative URI reference without a base URI: <img src="/assets/running-installation/Valo.Parameters.json.png">
Relative URI reference without a base URI: <img src="/assets/termstore-management.png">
Relative URI reference without a base URI: <img src="/assets/termstore-languages.png">
Relative URI reference without a base URI: <img src="/assets/teamwork01.jpg">

I'm wondering if any of these are considered errors?

sebastienlevert commented 5 years ago

I can actually see that it stops generating at page 95... Any idea how to get the content of the page being generated, to get a better idea?

Tontyna commented 5 years ago

Maybe preview it in a browser and look for distinctive text around page 95, then find it in the HTML. Of course, cutting HTML without destroying the wrapping divs and the related styles, which might be required to trigger the crash, is fiddly. The traceback suggests your document tree is rather complicated, with flex boxes involved.
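The cutting can be partly automated. Below is a rough sketch, not a WeasyPrint feature: shrink() greedily drops slices from a list of HTML chunks as long as a predicate still reports the crash, and crashes_weasyprint() is one possible predicate. Both names are hypothetical, and the bare html/body wrapper is an assumption; a real reproduction would most likely need the original stylesheets and wrapping divs kept around the chunks.

```python
def shrink(chunks, still_crashes):
    """Greedily remove slices of chunks while still_crashes(remaining) stays true."""
    if not still_crashes(chunks):
        raise ValueError("the full input does not reproduce the crash")
    size = len(chunks) // 2
    while size >= 1:
        i = 0
        while i < len(chunks):
            candidate = chunks[:i] + chunks[i + size:]
            if candidate and still_crashes(candidate):
                chunks = candidate  # the removed slice was not needed
            else:
                i += size  # this slice is needed, keep it and move on
        size //= 2
    return chunks


def crashes_weasyprint(chunks, wrap="<html><body>{}</body></html>"):
    """Return True if laying out the joined chunks raises AssertionError."""
    # Imported lazily so shrink() can be used and tested without WeasyPrint.
    from weasyprint import HTML
    try:
        HTML(string=wrap.format("".join(chunks))).render()
    except AssertionError:
        return True
    return False
```

With chunks being, say, the top-level sections of the generated page, shrink(chunks, crashes_weasyprint) would return a smaller list that still fails. Each candidate triggers a full layout, so on a 600+ page document it makes sense to first cut by hand around the page where rendering stops.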

sebastienlevert commented 5 years ago

It's definitely complicated. It's coming from MkDocs-generated documentation through one of their plugins (https://github.com/comwes/mkpdfs-mkdocs-plugin). I'm working on getting the HTML and figuring out what is happening.

sebastienlevert commented 5 years ago

It was finally a tabbed UI in HTML that created this issue. Thanks for the support, it was helpful and I now feel like I can hack my way around! Thanks!

liZe commented 5 years ago

> It was finally a tabbed UI in HTML that created this issue.

Good to know you found a way to fix this!

Would you mind attaching the bugged HTML file? WeasyPrint shouldn't crash like that, even if there's a problem in the source file.