Roave / DocbookTool

:books: Docbook Tool for static documentation generation from Markdown files
52 stars 4 forks source link

Strip cache-busting parts of local image references #365

Closed herndlm closed 4 months ago

herndlm commented 4 months ago

Just a suggestion, I didn't check yet if this is a common thing in Markdown. But for some reason we have references like

![basic example 1](images/datatables/basic-example-1.png?1)

and that breaks the tool with the following exception

[2024-02-26T17:25:15.046117+00:00] cli.DEBUG: [Roave\DocbookTool\Formatter\InlineExternalImages] Inlining image "/docs-src/book/images/datatables/basic-example-1.png?1" in page "datatables" [] []
PHP Warning:  file_get_contents(/docs-src/book/images/datatables/basic-example-1.png?1): Failed to open stream: No such file or directory in /app/vendor/thecodingmachine/safe/generated/filesystem.php on line 273
PHP Fatal error:  Uncaught Safe\Exceptions\FilesystemException: file_get_contents(/docs-src/book/images/datatables/basic-example-1.png?1): Failed to open stream: No such file or directory in /app/vendor/thecodingmachine/safe/generated/Exceptions/FilesystemException.php:9
Stack trace:
#0 /app/vendor/thecodingmachine/safe/generated/filesystem.php(276): Safe\Exceptions\FilesystemException::createFromPhpError()
#1 /app/src/Formatter/InlineExternalImages.php(46): Safe\file_get_contents()
#2 [internal function]: Roave\DocbookTool\Formatter\InlineExternalImages->Roave\DocbookTool\Formatter\{closure}()
#3 /app/src/Formatter/InlineExternalImages.php(35): preg_replace_callback()
#4 /app/src/Formatter/AggregatePageFormatter.php(22): Roave\DocbookTool\Formatter\InlineExternalImages->__invoke()
#5 [internal function]: Roave\DocbookTool\Formatter\AggregatePageFormatter->__invoke()
#6 /app/bin/docbook-tool.php(57): array_map()
#7 /app/bin/docbook-tool.php(63): Roave\DocbookTool\{closure}()
#8 /app/bin/docbook-tool(6): require_once('...')
#9 {main}
  thrown in /app/vendor/thecodingmachine/safe/generated/Exceptions/FilesystemException.php on line 9

if you don't mind I could open a PR that adds "support" for this.

Ocramius commented 4 months ago
![basic example 1](images/datatables/basic-example-1.png?1)

This assumes the image is locally served, rather than a file.

I'm unsure if this should be considered: it kinda mixes the concepts of filesystem paths and URIs.

Perhaps feasible if we move from file_get_contents() to always converting paths to file://, and then using an abstraction to read them? :thinking:

herndlm commented 4 months ago

Ah yeah, I remember while we did that. The existing markdown, I used to play around with the tool, is already being converted to HTML for our app and then linked inside it. And there is a CDN like thing sitting in front of the app that would cache the generated images, when being accessed with the same URL. So we just added ?1, ?2 and so on to the local image links in markdown to force showing the latest versions in the HTML docs. I guess Markdown doesn't care about that and just passes things through, at least I could not find any more information.

Not sure if I understood correctly what you're suggesting to change here. I just checked InlineExternalImages and it looks like it was designed to only ever work with local existing images and would fail with e.g. HTTP URLs because the image path is made absolute. Prefixing those paths with file:// might be a good idea anyway, but the ?1 like suffixes still need to be stripped away somehow, no? What kind of abstraction are you thinking of and how would it help here? Still trying to figure out if I can be of help here. I do understand that this is a weird edge case and am also OK with closing this.

Ocramius commented 4 months ago

but the ?1 like suffixes still need to be stripped away somehow, no?

Not sure how file://foo/bar?baz is handled on filesystem lookups :D

herndlm commented 4 months ago

but the ?1 like suffixes still need to be stripped away somehow, no?

Not sure how file://foo/bar?baz is handled on filesystem lookups :D

badly xD

PHP Fatal error:  Uncaught Safe\Exceptions\FilesystemException: file_get_contents(file://Users/herndlm/REDACTED/docs/images/analytics-ief/select-sub-options.png): Failed to open stream: no suitable wrapper could be found in /Users/herndlm/REDACTED/DocbookTool/vendor/thecodingmachine/safe/generated/Exceptions/FilesystemException.php:9

the correct one is file:///, but still not useful in this case

PHP Fatal error:  Uncaught Safe\Exceptions\FilesystemException: file_get_contents(file:///Users/herndlm/REDACTED/docs/images/datatables/basic-example-1.png?1): Failed to open stream: No such file or directory in /Users/herndlm/REDACTED/DocbookTool/vendor/thecodingmachine/safe/generated/Exceptions/FilesystemException.php:9

I was thinking about using parse_url() to keep only scheme and path and drop the rest maybe?

Ocramius commented 4 months ago

I wouldn't expect file_get_contents() to work, but is there no way to load that file with anything else? Library? HTTP client? curl? :thinking:

herndlm commented 4 months ago

the file exists locally, it's not served by a web server, sorry if I wasn't clear enough about this. the ? args don't make any sense for a file path I know, but they end up in the generated HTML and are useful there in this special case

Ocramius commented 4 months ago

Yes, but browsers seem to handle these just fine.

Just tried locally with file:///home/ocramius/Downloads/image.png?foo=bar#stuff

herndlm commented 4 months ago

Ah I think I'm getting you now 😅

yeah, looks like using an HTTP client or cURL instead of file_get_contents() works just fine. 👍 combined with the file:// protocol that might be a good solution I guess?

Ocramius commented 4 months ago

Yep, if feasible, of course :)

herndlm commented 4 months ago

Got it to work with cURL directly, but not with an abstraction like Guzzle yet, even when it uses cURL, not entirely sure yet why :/ preferably I would like to re-use a single client and inject it via constructor maybe hmm

UPDATE: I think I have an idea, might create a PR later and maybe we can keep discussing it there

Ocramius commented 4 months ago

I would like to re-use a single client and inject it via constructor maybe hmm

A small class that messes with the horrors of ext/curl is OK, as long as it's tested :D