cran-task-views / ctv

CRAN Task View Initiative

Include images in CTVs #29

Closed basille closed 2 years ago

basille commented 2 years ago

This is more of a reminder than a request per se. It would be good if the CTV architecture could properly handle the inclusion of images in the Markdown file. For instance, in the Tracking CTV, we have a figure that sets out the framework of the CTV, and it is currently included as a base64 string that is not very convenient. Handling PNGs hosted in the same repository as the CTV would be much easier.

Something to keep in mind related to this: This should also handle alt-text properly when added.

zeileis commented 2 years ago

I agree that generating the Base 64 string is somewhat inconvenient but working with the task view afterwards should not be negatively affected by it. Or why does it bother you?

The advantage of the Base 64 string is that the .md file is completely self-contained. Having said that, it is now also possible to host things in the central repository.

Regarding the alt text, I don't know what the problem is. This can be easily included in the markdown file (in the square brackets).

basille commented 2 years ago

Perfect for the alt text indeed! I should have been more careful in looking at it — no issue there.

As for the base64 string: for one, it adds an unnecessary step when working on the image, and it makes it hard to track which version was really converted. It really lacks transparency. It is also very inconvenient to work with a line of 170k characters in the Markdown file. At least Emacs hates it, which makes editing the Markdown file very uncomfortable altogether (it becomes very sluggish when the cursor is near the base64 string). I hear you about the self-contained file, but to me that should not affect the user (us). Is portability/shareability an issue at all in the CTV workflow? (On a side note, GitHub is not able to read/show the base64 image and displays the alt text only, which is another reason to use PNGs here.)

But it's good to hear that we can host things in the repo. How would that work in practice? A simple link to a PNG (for example) with relative paths from the root of the repository?

zeileis commented 2 years ago

Portability is an issue because the .md files are converted to .html in a different place than the repository and we need to make sure that they can be accessed from everywhere. I'm not sure, for example, how easy it is in, say, China to view images from GitHub. So it would be preferable to have the images directly in every task view.

The rendering on GitHub is a non-issue in my opinion, because the rendered version on CRAN is easily available (and can be generated locally for testing as well).

The argument I can understand is that editing is not so convenient. But that will depend a lot on the editing tool you use. Possibly you could include the code for generating the graphic in the .md file?

basille commented 2 years ago

> Portability is an issue because the .md files are converted to .html in a different place than the repository and we need to make sure that they can be accessed from everywhere.

Maybe that's what I'm missing: I fail to see the whole CTV workflow, so it is hard for me to assess what's possible, complicated or simply impossible. That said, if you can access the .md in a repo, you should be able to access the image file in the same repo, right?

> The rendering on GitHub is a non-issue in my opinion, because the rendered version on CRAN is easily available (and can be generated locally for testing as well).

I disagree here, but that was actually not an argument, just a side note.

> The argument I can understand is that editing is not so convenient. But that will depend a lot on the editing tool you use.

I checked with RStudio, which is not so happy with it either. I can guess that a lot of users are using RStudio and will find it inconvenient too.

> Possibly you could include the code for generating the graphic in the .md file?

Not sure that would work here: this is an SVG diagram made with Inkscape, which can possibly be exported as a PNG. Or maybe you're saying we could include the whole SVG within the Markdown file? That would be cumbersome too. Generally speaking, I know we're a bit of an exception with a figure in the Tracking CTV, but I think CTVs in general would benefit from being more visually attractive (graphic summary, workflow, etc.) and from including any kind of images.

Back to what you said earlier, could you clarify what you meant with:

> Having said that, it is now also possible to host things in the central repository.

I'm still trying to grasp how things work behind the scenes so I know better what can be done.

zeileis commented 2 years ago

I think the simplest solution for you is to create the Base 64 encoding on the fly - as opposed to manually hard-coding it. You can do this by embedding the SVG image as follows:

```{r, include = FALSE}
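# Download the SVG from the task view repository into a temporary directory
tdir <- tempfile()
dir.create(tdir)
svg <- file.path(tdir, "workflow.svg")
download.file("https://raw.githubusercontent.com/cran-task-views/Tracking/main/img/workflow.svg", svg, quiet = TRUE)
# Convert the file to a Base 64 data URI for use in the image link
svg <- xfun::base64_uri(svg)
# Remove the temporary directory (recursive = TRUE is required for directories)
unlink(tdir, recursive = TRUE)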
```

![Workflow diagram](`r svg`){width="500"}

This downloads the workflow.svg file from your task view repository into a temporary directory, converts it to Base 64, cleans up the temporary directory, and embeds the Base 64 into the markdown.

In the future, we could also add a function in ctv that does exactly the tasks above. Ideally this should be enhanced, though, by adding a check that the image source is trusted/stable in some way. But given that your view is the only one with a graphic for now, I think it should be tolerable to include the code in the .Rmd.
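A minimal sketch of what such a helper might look like, under stated assumptions: `embed_image_uri()` is a hypothetical name and is not part of ctv, and the "trusted source" rule used here (a fixed URL prefix pointing at the cran-task-views organization) is only one possible interpretation of the check mentioned above.

```r
# Hypothetical helper (NOT part of ctv): download an image and return it as a
# Base 64 data URI, refusing sources outside a trusted URL prefix.
embed_image_uri <- function(url,
                            trusted_prefix = "https://raw.githubusercontent.com/cran-task-views/") {
  # Assumed trust rule: only accept files hosted under the given prefix.
  if (!startsWith(url, trusted_prefix)) {
    stop("Untrusted image source: ", url)
  }
  # Keep the file extension so that the MIME type can be inferred from it.
  img <- tempfile(fileext = paste0(".", tools::file_ext(url)))
  # Delete the temporary file when the function exits, even on error.
  on.exit(unlink(img), add = TRUE)
  download.file(url, img, quiet = TRUE, mode = "wb")
  xfun::base64_uri(img)
}
```

With a helper along these lines, the hidden chunk would only need to assign `svg <- embed_image_uri(...)` with the image URL, and the image line itself would stay unchanged.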

basille commented 2 years ago

That's perfect, thanks @zeileis! This is a lot easier to handle and edit, and we don't have to worry about the version of the figure anymore.

The last thing I don't quite understand is how the alt text, and regular text in the same paragraph as the image, are handled. For instance, with simple alt text, e.g.:

![Alt text](`r svg`){width="500"}

The result is a <figure> element with a figcaption tag in it:

<figure><img src="data:image/svg+xml;base64,PD94bWwgdmVyc2lvbj0iMS4…" alt="Alt text" width="500">
    <figcaption aria-hidden="true">Alt text</figcaption>
</figure>

[Screenshot, 2022-06-09 10-47-13]

The result is the figure with the alt text displayed just below as the caption — but that does not seem to be the purpose of the alt text, which is for cases when the figure cannot be displayed or read.

When attaching some text in the same paragraph, e.g.:

![Alt text](`r svg`){width="500"}
Regular text

The result is a <p> element with no figure or figcaption tag this time:

<p><img src="data:image/svg+xml;base64,PD94bWwgdmVyc2lvbj0iMS4…" alt="Alt text" width="500"> Regular text</p>

[Screenshot, 2022-06-09 10-46-48]

And the text is attached next to the image.

While the latter seems OK to me (semantically), would it be possible to not use the alt text for a caption? Maybe a solution (although not semantically correct either) would be to hijack the title tag when it is explicitly provided, such as:

![Alt text](`r svg` "Title that can be used as a caption"){width="500"}

zeileis commented 2 years ago

See the pandoc documentation for the ways to handle this. The ctv package doesn't add any functionality here, it just calls the standard HTML5 conversion from pandoc. One trick that is sometimes used is to put a backslash \ in the line preceding the image. Then it is embedded inline without the <figure>.

basille commented 2 years ago

Great, the \ trick works perfectly! Thanks!

Of course it makes sense that ctv does not handle that directly… For the record, this does indeed come from pandoc, specifically from the implicit_figures extension, which turns the alt text into the figure caption. Such a broken approach to alt text IMHO (I mean, that is NOT what alt text is for!)… Glad that it can be disabled, even if it is a trick.
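To see the effect of implicit_figures outside of ctv, pandoc can be called directly on a one-line Markdown file. The following is a minimal sketch for local experimentation only, assuming a pandoc executable is available on the PATH. `workflow.png` is a placeholder file name and `to_html()` is a hypothetical helper; none of this reflects how ctv itself invokes pandoc.

```r
# Sketch: compare pandoc's HTML5 output with and without implicit_figures.
# Assumes pandoc is on the PATH; "workflow.png" is only a placeholder name.
md <- tempfile(fileext = ".md")
writeLines('![Alt text](workflow.png){width="500"}', md)

# Hypothetical helper: convert the file using the given input format string.
to_html <- function(from) {
  system2("pandoc", c("--from", from, "--to", "html5", shQuote(md)),
          stdout = TRUE)
}

if (nzchar(Sys.which("pandoc"))) {
  # Default pandoc Markdown: a lone image becomes a <figure> with a caption.
  with_figure <- to_html("markdown")
  # Extension disabled: the image stays a plain <img> inside a <p>.
  inline_only <- to_html("markdown-implicit_figures")
  print(with_figure)
  print(inline_only)
}
unlink(md)
```

The backslash trick reaches the same inline result without changing any pandoc option, which is why it is the practical choice inside a task view file.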

zeileis commented 2 years ago

👍