Closed samatcolumn closed 1 year ago
What you're looking for is for --embed-resources to work with ICML.
Sorry, I see now that we have an embedded resource in either case; the question is just what syntax is used.
Can you say more about what difference it makes? Why is a Contents element preferable to a Link with a data URI? Is it preferable in all cases? If so, we could just make that the default.
@jgm ICML / IDML are such arcane formats I can't say it's preferable in all cases but for our use case this is required in order to get Indesign Server to actually render the image when we load the ICML and then export PDF / JPG / etc.
Sorry I wish that was more helpful! I can do some more digging through the ICML / IDML docs and see if I can understand more.
Following up here! From my perspective, using Contents is preferable to using Links, as Contents renders immediately in InDesign and the Link elements do not.
Here is documentation from Adobe that walks through some of the reasons to use linked vs embedded content. The primary benefit of using Links is to decrease overall file size by using URIs as opposed to base64 content. But if the Link is already including the base64 image, the benefit is lost.
OK, I'll just change the default behavior, then.
So we're dropping information that the file is image/png; how does it know? Does it determine this from the contents of the file? (I know that's possible -- just wondering.) What file types can go in Contents in this way?
@jgm I just did a test with InDesign: I created two blank canvases and pasted a png into one and a jpg into another. Neither had a mime type in the <Image> tag or any other file type information.
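On the question of whether a file type can be recovered without a MIME type, here is a minimal sketch. It assumes the embedded payload is base64, as in the data URIs discussed above, and only illustrates that PNG and JPEG can be told apart from their leading bytes. Whether InDesign does this internally is not confirmed anywhere in this thread, and the name sniff_image_type is hypothetical.

```python
import base64

# Well-known file signatures ("magic bytes") for the two formats tested above.
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
JPEG_MAGIC = b"\xff\xd8\xff"


def sniff_image_type(b64_payload: str) -> str:
    """Guess the image type from the first bytes of a base64 payload.

    Hypothetical helper: illustrates content sniffing only, not what
    InDesign or pandoc actually do.
    """
    # Drop whitespace, then decode just the first 16 characters (12 bytes).
    compact = "".join(b64_payload.split())
    head = base64.b64decode(compact[:16])
    if head.startswith(PNG_MAGIC):
        return "png"
    if head.startswith(JPEG_MAGIC):
        return "jpeg"
    return "unknown"
```

Any other format would need its own signature added to the checks.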
@jgm wow thanks for closing this so fast! You are a legend.
@jgm do you have any plans for a release soon (2.19.3 or 2.20.0) which would include this feature, or should I just keep building from source?
It's going to be 3.0 when it happens. Still waiting on some changes to the Lua subsystem, but we are getting pretty close. For now you can build from source or use a nightly.
@jgm ah a nightly sounds like what I need! I'm likely just being dense but I wasn't able to find nightly releases in this repo or on the pandoc site ... Google points me to https://github.com/pandoc-extras/pandoc-nightly but that seems to have stopped in mid-2020. Would you mind pointing me to the best place to grab a nightly .exe?
I'll leave a note to say that, in general, linking images is the preferred solution in InDesign. Embedded resources make the InDesign file too heavy, filling the RAM, the virtual memory, and causing crashes. This is particularly evident with longer documents, but InDesign is mostly meant for long documents.
@ptram agreed! The hard part for me was understanding how to make a portable ICML file with links to images ... seems like it'd need to be a zip or something?
Do you know of a way?
Yes, you have the page layout file (ICML, IDML or INDD), and a folder containing the images (the one InDesign calls Links when "packaging for print"). You can then zip them and deliver the zip containing the page layout and the linked resources.
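The packaging step described above could be sketched as follows. This is only an illustration under stated assumptions: the function name package_layout and the file names in the usage note are hypothetical, and the archive layout (layout file at the top level, images under a folder named like the links folder) is one plausible reading of "zip them", not a format InDesign mandates.

```python
import zipfile
from pathlib import Path


def package_layout(layout_file, links_dir, zip_path):
    """Bundle a layout file (ICML/IDML/INDD) with its linked images.

    Hypothetical helper: the layout file is stored at the top level of
    the archive and the images keep their folder name, so relative links
    can still be resolved after unzipping.
    """
    layout_file = Path(layout_file)
    links_dir = Path(links_dir)
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.write(layout_file, arcname=layout_file.name)
        for image in sorted(links_dir.rglob("*")):
            if image.is_file():
                inner = Path(links_dir.name) / image.relative_to(links_dir)
                archive.write(image, arcname=inner.as_posix())
    return zip_path
```

For example, package_layout("doc.icml", "Links", "delivery.zip") would produce an archive containing doc.icml and Links/....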
If there are different needs, I would suggest having two separate ICML writers: one for linked images, the other for embedded images (what does @jgm think about this?). In any case, the standard use in InDesign is with linked images (as happens with markdown). Embedding is considered a beginner's sin.
Apart from the size considerations mentioned above, linked images mean that collaborators working on the images can change them from draft to final, and the page layout document will be able to update them automatically. This is essential in production.
If you break a link when going from Markdown to ICML, the link is gone forever. And rebuilding links on long documents is a nightmare.
This is clearly explained by Adobe themselves (as linked above).
In general, I think that the idea behind using markdown (make the main document simple, and link everything) is a winning one. ICML is nothing more than a glorified ancestor of markdown, made trying to make a bridge between PostScript and XML.
Paolo
I will go a bit more into the details about the InDesign workflow, just to be sure @jgm gets my point of view as clearly as possible. I will try to demonstrate that images should be normally linked in the ICML writer, and not embedded. Or, at least, there should be an option to make them linked.
InDesign (and this applies as well to QuarkXPress, Affinity Publisher or Viva Publisher) is a page layout program intended for creating brochures, leaflets, books, magazines, and anything that blends text and images and is intended for print or virtual print (PDF). There are people using it for writing novels or theses, but this is not the intended use, nor is it the best tool for that.
A typical workflow, in InDesign, is to have writers and visual artists produce their content, be it a narration, instructions, a series of illustrations, photos, software screenshots. The original content is usually created with dedicated tools, for example word processors, photo editor, CAD programs. These contributions are then assembled into an InDesign document. Several InDesign documents can be assembled into a Book, so that they get a common set of styles and formats.
In this workflow, images are linked. The writer and the page layout artist put a placeholder in the early version of the text or page layout document. The placeholder may be an early version of the illustration or screenshot, or a dummy photo with a size similar to the final one. They give the dummy placeholder a name and file path that is the same as the final image.
In something like VS Code, dragging an image file into a text document automatically adds its path. If you want, you can also see a preview. Word processors vary, but some of them can also link images and only show a preview of them. As far as I know, they usually convert complex external files (like TIFF or PDF) into simpler bitmap data, flattening layers and effects.
When the final image is ready, it replaces the dummy placeholder in the linked images folder. InDesign automatically updates it by reading the saved file path contained in the page layout document, with the text and page layout remaining the same as they were with the temporary dummy image.
This couldn't happen with embedded images, since InDesign wouldn't know where to find them outside of its document. Updating would mean linking or importing them again.
In my view, if you use Pandoc, you are after a markdown workflow; otherwise, you could simply import an RTF or DOCX file, including the embedded images it contains. A markdown workflow is essentially based on a separation between elements – text, external resources, style appearance. This is to facilitate reuse and confluence of contributions.
For this reason, I think the ICML writer should allow, or even privilege, linked images. Without linked images, images would have to be reimported after loading the ICML file into InDesign. This would have to be repeated each time the ICML document is generated again.
Paolo
I'd be fine to change to use linked images, unless people have objections we haven't considered. Can you give an example of how the emitted code should change?
Dear John, thank you very much for your willingness to implement this change. I'm attaching the original InDesign (INDD), the exported full document (IDML), and the exported snippet (ICML) from a very simple document, only containing a frame, a paragraph, and a linked image. I'm also attaching the linked image.
I guess the interesting code is this one, where text and the rectangle containing the linked image are declared:
I guess the most useful thing would be a diff between pandoc's current output and the output you desire. I know nothing about the format so I'd need very precise instructions.
Oh my, I'm the one who knows nothing about the needed code. What I can say is that I made sure I had the latest version of Pandoc (3.1.6.1), and tried to reconvert an MD file generated by Scrivener.
The result is what I desired, but didn't expect: the linked images were actually linked (image from Affinity Publisher):
The ICML code generated by Pandoc indeed includes the file path:
```
<Link Self="ueb" LinkResourceURI="file:/Users/[...]/Documents/Markdown/mmd-from-scriv.md/images/CC3.png" />
</Image>
```
The odd thing is that after importing the ICML file into InDesign CS6, the Links pane doesn't allow for relinking the images. The pane shows the file path, but the Relink command is not available. Maybe this is caused by the $ID/Embedded instruction included in the generated ICML file. It looks like a linked image has the file type (PNG, JPEG…) instead of the "Embedded" declaration.
After exporting the InDesign file to an IDML file, Affinity Publisher can see the images as linked ones, and allows for relinking. The further export seems to fix the issue.
This discussion seems related to this issue, so maybe you can gather some useful info from it? There is a snippet of code that might contain an example similar to the one we are after:
How to extract image from IDML file using IDMLlib
I must say that at the moment, for reasons that I'm not able to explain, the ICML writer is doing what I think it should do – preserving the links to the original image files.
Paolo
(While doing my tests, I found that the writer doesn't allow relative paths, like "/images/image.png", but it requires the full absolute path. This can probably be worked around in InDesign/Publisher, but it may be an inconvenience. Should I open another issue to report this?).
/images/image.png is an absolute path. Maybe you want ./images/image.png or images/image.png?
Sorry, you are perfectly right. Yes, I would like something like ./images/image.png or images/image.png, relative to the position of the MD file to be converted. Apparently, at the moment they are not recognized by the writer (or by Pandoc as a whole?).
Relative paths are interpreted relative to the working directory (which might be different from the directory containing the markdown file). But see the documentation on the rebase_relative_paths extension.
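To make the difference concrete, here is a rough sketch of what rebasing a relative path means: the image path is re-expressed relative to the working directory by prefixing the directory of the source file, while absolute paths and URLs are left alone. This is an illustration of the idea only, not pandoc's implementation; the function name rebase and the example file docs/chapter.md are hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def rebase(src: str, source_file: str) -> str:
    """Rebase a relative image path onto the directory of its source file.

    Hypothetical helper illustrating the idea behind rebasing; it is
    not pandoc's code.
    """
    # Leave empty paths, fragments, absolute paths and URLs unchanged.
    if not src or src.startswith("#") or src.startswith("/") or urlsplit(src).scheme:
        return src
    return str(PurePosixPath(source_file).parent / src)
```

With the working directory one level above docs/, rebase("images/image.png", "docs/chapter.md") gives docs/images/image.png, whereas /images/image.png is returned unchanged because it is absolute.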
Relative paths are interpreted relative to the working directory
I've read a bit about this issue, and I understand it's a complex task that I have to study more in depth.
As for the $ID/Embedded declaration replaced with the file format, it didn't work in my tests. I tried with $ID/PNG without changing anything else, and there is no difference in how InDesign CS6 behaved.
The odd thing is that even the exported IDML file still results in images that can't be relinked in InDesign CS6, while they can be relinked in Affinity Publisher 2. Probably, the age difference shows.
Just to be sure: the official reference to IDML/ICML is still available via GitHub:
It is very interesting how InDesign looks for the location of the linked files:
InDesign first looks for the link in the folder containing the IDML file. If the file cannot be located in the folder, InDesign searches for the file by using the file path to the IDML document. If the file is still not found, InDesign goes up one level in the IDML document’s path, and tries again. Finally, InDesign looks for the link in the folders that have been specified by the user when updating file links during the current InDesign session.
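The lookup order quoted above can be written down as an ordered list of candidate locations. This is a loose sketch and an interpretation, not Adobe's algorithm: in particular, the second step ("by using the file path to the IDML document") is read here as resolving the stored link path against the IDML folder, which is an assumption. The function name candidate_locations and the paths in the usage note are hypothetical.

```python
from pathlib import PurePosixPath


def candidate_locations(idml_path, link_path, session_folders=()):
    """List places to look for a linked file, in the quoted order.

    Hypothetical helper; step 2 is an assumed reading of the spec text.
    """
    idml_dir = PurePosixPath(idml_path).parent
    link = PurePosixPath(link_path)
    candidates = [
        idml_dir / link.name,         # 1. folder containing the IDML file
        idml_dir / link,              # 2. assumed: link path resolved against the IDML folder
        idml_dir.parent / link.name,  # 3. one level up from the IDML folder
    ]
    # 4. folders the user pointed to while relinking in this session
    candidates += [PurePosixPath(folder) / link.name for folder in session_folders]
    return candidates
```

For /work/book/doc.idml and a stored link images/CC3.png, the first three candidates would be /work/book/CC3.png, /work/book/images/CC3.png and /work/CC3.png.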
Are there other Pandoc export targets where the file has "links" to other local files? The reason we preferred embedded images was because we send the ICML to a server and wanted an "all in one" format. If we move to linked images (which does sound nice) I'd like to understand if there's a standard I should be using to re-assemble the required file system on the other side.
Also, since my use case is HTML -> ICML, I'll say that I think there are three distinct things to care about:
<img src='data:image/jpeg;base64...> - in this case the image is embedded in the HTML. To me, embedding in the ICML makes sense and preserves the intention.
<img src='./example.jpg'> - here the image references a relative file path. Doing the same via a link in ICML makes sense!
<img src='https://example.com/foo.jpg'> - not sure what makes sense here, since I don't think it's Pandoc's job to download images.
I assume that the three cases above have already been well handled by Pandoc for converting HTML -> any other format that supports images. I'm probably showing my Pandoc ignorance!
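The three cases above could be told apart by the URI scheme of the src attribute. The sketch below is only an illustration of that distinction; the function name classify_image_src and the labels "embed", "link" and "remote" are hypothetical, and nothing here reflects how pandoc actually decides.

```python
from urllib.parse import urlsplit


def classify_image_src(src: str) -> str:
    """Sort an <img> src into one of the three cases from the list.

    Hypothetical helper: the returned labels are made up for this example.
    """
    scheme = urlsplit(src).scheme.lower()
    if scheme == "data":
        return "embed"   # image data already inline: keep it embedded
    if scheme in ("http", "https"):
        return "remote"  # remote resource: open question in the thread
    return "link"        # local path or file: URI, keep as a link
```

The "remote" case is deliberately left without a policy, since the thread does not settle what should happen there.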
Are there other Pandoc export targets where the file has "links" to other local files?
I'm not a Pandoc erudite, but I would believe that the following ones are formats that are written with images as links to separate resources:
Markdown, HTML, ePub, LaTeX, OPML, MediaWiki markup.
Paolo
I've just made a test with a big file, compiled by Scrivener to MMD and converted by Pandoc to ICML. Links to the original image files are preserved.
Paolo
Describe your proposed improvement and the problem it solves.
Currently pandoc uses <Link> to embed images in ICML files: https://github.com/jgm/pandoc/blob/415550a36a9f5cfb412a812836b835d12ec12cb8/src/Text/Pandoc/Writers/ICML.hs#L633
The ICML looks like this:
However, it's often desirable to embed images directly using the <Contents> tag, which would look like this:
It would be great if there was a flag or an HTML data-* attribute we could set to have Pandoc choose the latter format.
Describe alternatives you've considered.
Currently we are post-processing the ICML output by Pandoc to achieve this.
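A post-processing step of the kind mentioned above might look roughly like the following. This is a sketch under assumptions, since the actual markup was lost from this thread: it assumes the input has <Link> elements whose LinkResourceURI is a base64 data URI, and that the desired form is a <Contents> element holding the base64 payload. The function name embed_data_uri_links is hypothetical.

```python
import xml.etree.ElementTree as ET


def embed_data_uri_links(icml_text: str) -> str:
    """Replace <Link> elements carrying a base64 data URI with <Contents>.

    Hypothetical helper; the element shapes are assumed, not taken
    from pandoc's source.
    """
    root = ET.fromstring(icml_text)
    # Snapshot the elements first so replacing children is safe.
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            uri = child.get("LinkResourceURI", "")
            if child.tag != "Link" or not uri.startswith("data:"):
                continue
            header, sep, payload = uri.partition(",")
            if not sep or not header.endswith(";base64"):
                continue  # not a base64 data URI: leave the link alone
            contents = ET.Element("Contents")
            contents.text = payload
            contents.tail = child.tail
            parent[index] = contents
    return ET.tostring(root, encoding="unicode")
```

Links with a file: URI are left untouched. One known limitation of this sketch is that ElementTree drops processing instructions that precede the root element, so a real ICML file would need them restored after the rewrite.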