jgm / pandoc

Universal markup converter
https://pandoc.org

removing images from epub for typst/pdf conversion #10306

Closed jarnowic closed 1 month ago

jarnowic commented 1 month ago

Discussed in https://github.com/jgm/pandoc/discussions/10299

Originally posted by **jarnowic** October 17, 2024

Hello! How can I remove images from epubs in converting to pdf via typst? groff-ms removes images automatically, but typst tries to render them.

```
C:\Downloads>pandoc -o carroll72img.pdf -t typst carroll72img.epub
error: file not found (searched at \\?\C:\Downloads\6012930945704010388_cover.jpg)
    ┌─ \\?\C:\Downloads\toPC185.html:138:11
    │
138 │ #box(image("6012930945704010388_cover.jpg"))
    │            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error producing PDF.
```

Using `--extract-media .` in the CL includes the images, but I only need the placeholders, considering that I may be downloading content from URIs, and [a known vulnerability](https://github.com/jgm/pandoc/security/advisories/GHSA-xj5q-fv23-575g). There might be a Lua filter to remove image references, but I can't find it.

carroll72img.epub was downloaded from `https://gutenberg.org/ebooks/12` and renamed to make it intelligible.

jgm commented 1 month ago

This should do it.

```
function Image(el)
  return el.caption
end
```

This will substitute the image description for the image.
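
If an image has no description, the filter above leaves nothing in its place. A minimal variant, assuming you want a visible placeholder in that case, might look like the sketch below. The file name `strip-images.lua` and the `[image: ...]` wording are illustrative assumptions, not something taken from this thread.

```lua
-- strip-images.lua (hypothetical file name)
-- Replace each image with its description. If the description is empty,
-- fall back to a bracketed placeholder naming the image source.
-- The placeholder format is an assumption chosen for illustration.
function Image(el)
  if #el.caption > 0 then
    -- el.caption is the list of inlines that make up the image description
    return el.caption
  end
  -- el.src is the image path or URL as it appears in the source document
  return pandoc.Str("[image: " .. el.src .. "]")
end
```

It would be applied with the `--lua-filter` option, for example `pandoc --lua-filter strip-images.lua -t typst -o carroll72img.pdf carroll72img.epub`.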

jarnowic commented 1 month ago

As simple as a Lua function... thanks!

jgm commented 1 month ago

PS. Please do not put questions in issues. These go in Discussions. You already posted there, I see, but don't abuse issues to ask a question just because you haven't gotten a response yet.

jarnowic commented 1 month ago

My apologies: this interface is somewhat unfamiliar to me. It's not unlike people mistaking comments for answers on Stack Exchange on their first visit. With the demise of Google access to pandoc-discuss, the clear difference between bug report, troubleshooting, and general inquiry is blurring.

jgm commented 1 month ago

No worries. But you can see now what the problem is -- someone just replied to your Discussion entry with essentially the same answer I gave above. Having the same question in two places can cause duplicated work.