Roave / DocbookTool

:books: Docbook Tool for static documentation generation from Markdown files

Too large images result in blank page #307

Closed cspray closed 3 months ago

cspray commented 1 year ago

When a page contains an image whose byte size exceeds a certain amount, the page renders as a blank string. You can see a demonstration of this in the fix-large-inline-image branch.

A couple things to note:

  1. I was able to see this fail with images as small as 256 KB. I cannot say for certain whether this is the actual limit, only that anything >= 256 KB demonstrated this problem. A smaller image might have the same problem; that has not been confirmed.
  2. The problem happens in the MarkdownToHtml formatter. Before the Markdown parsing, $page->content() contains the proper image; after parsing, $page->content() is a blank string.

herndlm commented 3 months ago

I experienced the same issue and started debugging it. It comes from https://github.com/michelf/php-markdown/blob/51613168d71787b0fe8472166ccbfa8d285c02cd/Michelf/MarkdownExtra.php#L1065-L1071 which returns null, so there must be some error occurring in that preg_replace.

remaining call stack:

but that's all I know so far, ran out of time
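
A minimal sketch of how a null result like this can be surfaced, assuming PHP 8.0+ (for preg_last_error_msg()). The pattern is a simplified, hypothetical stand-in for the Setext header regex, not the one php-markdown actually uses, and the limit is lowered and JIT disabled only so that a small input reproduces the failure:

```php
<?php
declare(strict_types=1);

// Assumptions for the demo only: disable JIT and lower the limit so that
// a modest input is enough to exhaust it.
ini_set('pcre.jit', '0');
ini_set('pcre.backtrack_limit', '1000');

// Hypothetical, simplified header pattern: a line followed by ==== or ----.
$pattern = '{^(.+?)[ ]*\n(=+|-+)[ ]*\n+}m';

// One very long line, similar to an inline base64 image.
$markdown = '![img](data:image/png;base64,' . str_repeat('A', 100000) . ")\n\nSome text.\n";

$result = preg_replace($pattern, '<h1>$1</h1>', $markdown);

if ($result === null) {
    // preg_replace() signals failure with null; the reason is only
    // available through preg_last_error() / preg_last_error_msg().
    echo 'preg_replace failed: ', preg_last_error_msg(), PHP_EOL;
}
```

Checking for null right after the call is what distinguishes "the regex engine gave up" from "the document is empty".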

herndlm commented 3 months ago

oh man

image

😅

Ocramius commented 3 months ago

Worth reporting an upstream issue in https://github.com/michelf/php-markdown ?

This kind of regex work usually requires some optimization :)

herndlm commented 3 months ago

Increasing pcre.backtrack_limit (https://www.php.net/manual/en/pcre.configuration.php#ini.pcre.backtrack-limit) would apparently be a workaround.
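
A rough sketch of that workaround applied at runtime rather than in php.ini, assuming PHP 8.0+. The helper name withBacktrackLimit, the value 10000000, and the simplified pattern are all illustrative assumptions, not part of DocbookTool or php-markdown:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: run $fn with a raised pcre.backtrack_limit,
 * then restore whatever value was configured before.
 */
function withBacktrackLimit(int $limit, callable $fn): mixed
{
    $previous = ini_set('pcre.backtrack_limit', (string) $limit);
    try {
        return $fn();
    } finally {
        if ($previous !== false) {
            ini_set('pcre.backtrack_limit', $previous);
        }
    }
}

// Simplified stand-in for the header regex, plus a long inline-image line.
$pattern  = '{^(.+?)[ ]*\n(=+|-+)[ ]*\n+}m';
$markdown = "Title\n=====\n\n![img](data:image/png;base64," . str_repeat('A', 100000) . ")\n";

$html = withBacktrackLimit(
    10000000,
    fn () => preg_replace($pattern, "<h1>\$1</h1>\n", $markdown)
);
```

Scoping the change to one call keeps the higher limit from affecting unrelated regexes in the same process; setting it globally in php.ini is the simpler alternative.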

herndlm commented 3 months ago

> Worth reporting an upstream issue in https://github.com/michelf/php-markdown ?
>
> This kind of regex work usually requires some optimization :)

Yeah, I can do that.

herndlm commented 3 months ago

The regex causing issues in this case is {(^.+?)(?:[ ]+ \{((?>[ ]*[#.a-z][-_:a-zA-Z0-9=]+){1,})[ ]*\} )?[ ]*\n(=+|-+)[ ]*\n+}mx. But anyway, https://github.com/michelf/php-markdown?tab=readme-ov-file#bugs says:

> If you have a problem where Markdown gives you an empty result, first check that the backtrack limit is not too low by running php --info | grep pcre. See Installation and Requirement above for details.

and further up, at https://github.com/michelf/php-markdown?tab=readme-ov-file#requirement:

> You might need to set pcre.backtrack_limit higher than 1 000 000 (the default), though the default is usually fine.

I can still try to report it, but the answer could well be to increase that limit.

Anyway, the report: https://github.com/michelf/php-markdown/issues/399

I can try to find out how much higher the limit needs to be to support e.g. ~512K images at least, in case you're willing to increase it as a workaround.
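
One way such a measurement could be approached is a binary search over the limit. This is only a sketch: minimalBacktrackLimit is a hypothetical helper, the pattern is a simplified stand-in for the real header regex, JIT is disabled because it counts the limit differently, and the numbers it yields depend on the PCRE2 version in use:

```php
<?php
declare(strict_types=1);

// Assumption: disable JIT so the limit is counted by the interpreter.
ini_set('pcre.jit', '0');

/**
 * Hypothetical helper: smallest pcre.backtrack_limit (up to $upper) at which
 * preg_match() on $subject completes without an error, or null if none does.
 */
function minimalBacktrackLimit(string $pattern, string $subject, int $upper = 100000000): ?int
{
    $original = ini_get('pcre.backtrack_limit');
    $works = static function (int $limit) use ($pattern, $subject): bool {
        ini_set('pcre.backtrack_limit', (string) $limit);
        // preg_match() returns false on error, 0 or 1 otherwise.
        return preg_match($pattern, $subject) !== false;
    };

    try {
        if (!$works($upper)) {
            return null;
        }
        $low = 1;
        $high = $upper;
        while ($low < $high) {
            $mid = intdiv($low + $high, 2);
            if ($works($mid)) {
                $high = $mid;
            } else {
                $low = $mid + 1;
            }
        }
        return $low;
    } finally {
        ini_set('pcre.backtrack_limit', (string) $original);
    }
}

// Simplified stand-in pattern and a 50,000-character inline-image line.
$pattern = '{^(.+?)[ ]*\n(=+|-+)[ ]*\n+}m';
$subject = '![img](data:image/png;base64,' . str_repeat('A', 50000) . ")\n\nSome text.\n";

$needed = minimalBacktrackLimit($pattern, $subject);
echo 'Minimal limit for this input: ', var_export($needed, true), PHP_EOL;
```

Running this against the real regex and a real ~512K image would give a concrete figure to put into the configuration, ideally with some headroom on top.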

Ocramius commented 3 months ago

Meanwhile, we can raise the limit in the Docker image.