marp-team / marp-cli

A CLI interface for Marp and Marpit based converters
MIT License

Cache issue with ?pdf #570

Closed WeLoTech closed 5 months ago

WeLoTech commented 5 months ago

PDF Generation Does Not Reflect Updated Images - Due to Caching?

We are encountering a usability issue related to image updates within previously generated slides.

Specifically, the problem arises when updating an image that is referenced in both Markdown (.md) files and generated PDF documents. While the update process works seamlessly for Markdown files, the PDF documents persistently display the cached version of the original image, failing to reflect the most recent updates.

Context: A slide includes a header section that contains an image (img).

Update process: When the original image is updated, the external server serves the new image from the same file path.

Issue: This works as expected for Markdown (.md) files, where the new image appears immediately. However, when generating PDF documents (?pdf), the system continues to use the cached version of the original image instead of the updated one.

Is there a way to forcefully clear the cache or ensure that the PDF generation process always uses the most recent version of an image?

Are there specific settings or steps that we can implement to automatically refresh the cached images upon updating the original files?

WeLoTech commented 5 months ago

Update: Changing the .md file name does not resolve the issue either. The .md displays the updated image, but the .pdf still shows the old one.

yhatt commented 5 months ago

Whether images referenced in Markdown are refreshed depends on the caching rules of both the internal browser process and the remote server that serves the image.

Currently, Marp CLI cannot modify the caching rules of the internal browser it spawns to make PDFs. There is no plan to work on that, because disabling the cache could degrade performance and overload remote servers.
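To see which caching rules the remote server actually sends for an image, its response headers can be inspected. The following is a minimal sketch, not part of Marp CLI: `summarizeCacheControl` and `inspectImageCaching` are hypothetical helper names, and the sketch assumes Node.js 18 or later for the global `fetch()`.

```javascript
// Hypothetical helper: summarize a Cache-Control header value in plain words.
function summarizeCacheControl(headerValue) {
  const directives = new Map()
  for (const part of (headerValue || '').split(',')) {
    const [name, value] = part.trim().toLowerCase().split('=')
    if (name) directives.set(name, value ?? true)
  }

  if (directives.has('no-store')) return 'never cached'
  if (directives.has('no-cache')) return 'revalidated on every use'
  if (directives.has('max-age')) {
    return `reusable for ${Number(directives.get('max-age'))}s without revalidation`
  }
  return 'no explicit policy (browser heuristics apply)'
}

// Hypothetical helper: request only the headers of an image and report
// the fields that influence caching. Assumes Node.js 18+ (global fetch).
async function inspectImageCaching(url) {
  const res = await fetch(url, { method: 'HEAD' })
  return {
    cacheControl: summarizeCacheControl(res.headers.get('cache-control')),
    etag: res.headers.get('etag'),
    lastModified: res.headers.get('last-modified'),
  }
}
```

If the server reports a long `max-age`, a browser is allowed to keep showing the old image until that time has passed, even though the file on the server has changed.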

A common workaround, suggested for many Markdown processors, is to add a query parameter to the image URL that does not affect the image itself. The browser then treats the image as coming from a new URL and fetches it again instead of using the cache.

![](https://example.com/image.jpg)

# Add an otherwise meaningless query parameter to force an update, such as `?__update=v1`
![](https://example.com/image.jpg?__update=v1)

# Use a different parameter value to force another update
![](https://example.com/image.jpg?__update=v2)

You could write a plugin for the engine to automate this assignment.
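Short of a full plugin, the manual approach can be scripted with a small helper. This is only a sketch: `withUpdateParam` is a hypothetical name, and it handles absolute URLs only. Using the standard `URL` API instead of string concatenation keeps an existing query string or fragment intact.

```javascript
// Hypothetical helper: return `url` with the `__update` query parameter
// set to `version`, replacing any previous value of that parameter.
function withUpdateParam(url, version) {
  const parsed = new URL(url) // throws a TypeError for relative paths
  parsed.searchParams.set('__update', String(version))
  return parsed.toString()
}

console.log(withUpdateParam('https://example.com/image.jpg', 'v1'))
// → https://example.com/image.jpg?__update=v1

console.log(withUpdateParam('https://example.com/image.jpg?size=large#top', 'v2'))
// → https://example.com/image.jpg?size=large&__update=v2#top
```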


yhatt commented 5 months ago

An example plugin that assigns a unique query parameter to each image URL when rendering Markdown (WARNING: this may overload the remote server):

// engine.config.js
const crypto = require('crypto')

const imageUniqParameterPlugin = (markdownIt) => {
  const imageRuleIndex = markdownIt.inline.ruler.__find__('image')
  if (imageRuleIndex === -1) throw new Error('Parser rule not found: image')

  const originalImageRule = {
    ...markdownIt.inline.ruler.__rules__[imageRuleIndex],
  }

  // Replace existing markdown-it image rule to add unique parameter
  markdownIt.inline.ruler.at(
    'image',
    (state) => {
      const originalNormalizeLink = state.md.normalizeLink

      try {
        // Override normalizeLink to add unique parameter while parsing image
        state.md.normalizeLink = (url) => {
          try {
            // Resolve against a placeholder base so relative paths can be parsed too
            const dummyBase = 'dummy://dummy.dummy/'
            const urlObj = new URL(url, dummyBase)

            // Add unique parameter to avoid cache
            urlObj.searchParams.set('__marp_v__', crypto.randomUUID())
            let normalizedUrl = urlObj.toString()

            if (normalizedUrl.startsWith(dummyBase)) {
              // Remove dummy protocol and host
              normalizedUrl = normalizedUrl.slice(dummyBase.length)
            }

            return originalNormalizeLink(normalizedUrl)
          } catch (e) {
            console.warn(e)

            // Fallback to original normalizeLink if URL parsing failed
            return originalNormalizeLink(url)
          }
        }

        return originalImageRule.fn(state)
      } finally {
        // Restore original normalizeLink
        state.md.normalizeLink = originalNormalizeLink
      }
    },
    { alt: originalImageRule.alt }
  )
}

module.exports = {
  engine: ({ marp }) => marp.use(imageUniqParameterPlugin),
}
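
Because the plugin above generates a new UUID for every image reference, an image used on several slides gets a different URL each time and would be fetched once per reference. One possible way to soften the server load mentioned in the warning is to share a single token per conversion run. The sketch below is an assumption-laden variation, not the author's code: `RUN_TOKEN` and `addRunToken` are hypothetical names, and the function would stand in for the URL rewriting inside the overridden `normalizeLink`.

```javascript
const crypto = require('node:crypto')

// One token per process: every image in the same conversion run shares it,
// so repeated references to one image resolve to one identical URL.
const RUN_TOKEN = crypto.randomUUID()

// Hypothetical helper: add the per-run token to an absolute or relative URL.
function addRunToken(url, token = RUN_TOKEN) {
  // Inline data: URLs are never fetched over the network; leave them untouched.
  if (/^data:/i.test(url)) return url

  // The placeholder base only lets URL() parse relative paths; it is never emitted.
  const parsed = new URL(url, 'dummy://dummy.dummy/')
  parsed.searchParams.set('__marp_v__', token)

  // Keep the original path text and swap in the updated query and fragment.
  const pathPart = url.split(/[?#]/)[0]
  return pathPart + parsed.search + parsed.hash
}

console.log(addRunToken('/img/logo.png?x=1#top', 'abc'))
// → /img/logo.png?x=1&__marp_v__=abc#top
```

Within one run the browser can still reuse its cache for repeated references, while the next run produces a new token and therefore fresh requests. Rebuilding the result from the original path text also keeps leading `/` and `../` segments as written.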