jgm / djot

A light markup language
https://djot.net
MIT License

Captions #28

Open jgm opened 2 years ago

jgm commented 2 years ago

Tables can have captions. There should be a way to attach a caption to a pipe table. But captions are more general than that: other things can also have them (code blocks, figures, maybe equations). So perhaps we should have a more generic syntax for attaching a caption? Captions can, in general, contain inline formatting, and perhaps they should be allowed to contain block formatting. (Multi-paragraph captions can be seen.) It would also be nice to provide a way to include a "short caption," which could be used in a list of figures.

matklad commented 1 year ago

AsciiDoctor allows "titles" on any kind of block, using the `.This is a title` syntax:

https://docs.asciidoctor.org/asciidoc/latest/blocks/add-title/

It typically is rendered above (as a title), rather than below (as a caption)

I've found this quite convenient, though I am not sure there's proper semantic HTML for this kind of thing

chrisjsewell commented 1 year ago

Definitely interested in standardising this! I guess in terms of HTML5 we are looking to cover both:

 <table>
  <caption>My caption</caption>
  ...
</table> 
 <figure>
  ...
  <figcaption>My caption</figcaption>
</figure> 

> It typically is rendered above (as a title), rather than below (as a caption)

Maybe it can't really go above, because this "space" is used for block attributes?

I could foresee something like @bpj suggests in #87 (changing how captions are encapsulated):

{#id1 .class1}
!!!
![Alt text](path/to/image.ext)
!!![This is my (optional?) caption.

That can contain multiple blocks?
]

{#id2 .class2}
| a | b |
| - | - |
| c | d |[This is my table caption.

That can contain multiple blocks?
]

To go to:

 <figure id="id1" class="class1">
    <img src="path/to/image.ext" alt="Alt text" />
    <figcaption>
        <p>This is my (optional?) caption.</p>
        <p>That can contain multiple blocks?</p>
    </figcaption>
</figure>

<table id="id2" class="class2">
    <caption>
        <p>This is my table caption.</p>
        <p>That can contain multiple blocks?</p>
    </caption>
    <tr>
        <th>a</th>
        <th>b</th>
    </tr>
    <tr>
        <td>c</td>
        <td>d</td>
    </tr>
</table>

bpj commented 1 year ago

@chrisjsewell I don't think caption syntax should be attached to the end of (fenced) blocks like that; they should go inside the fences and be separated from the thing they caption with at least one line break. But the brackets are a good idea. Captions could start with `:[` to clearly distinguish them from definition lists, spans and links:

{.table}
| a | b |
|---|---|
| c | d |
:[Line 1 of caption.

Para 2 of caption.]

chrisjsewell commented 1 year ago

> Captions could start with `:[` to clearly distinguish them from definition lists, spans and links

Sounds good 👍

> I don't think caption syntax should be attached to the end of (fenced) blocks like that; they should go inside the fences and be separated from the thing they caption with at least one line break

Yeh I was just trying to think about how to make it consistent across tables and figures, which is why I put it after but indeed I guess this would be good:

{#id1 .class1}
!!!
![Alt text](path/to/image.ext)

:[Line 1 of caption.

Para 2 of caption.]
!!!

matklad commented 1 year ago

To suggest another color: captions are a bit like a single-item definition list, so we can use something like

:: A mountain sunset
![](https://www.flickr.com/photos/javh/5448336655)

That is:

doc
  para
    image destination="https://www.flickr.com/photos/javh/5448336655"
      caption
        str text="A mountain sunset"

Bigger example:

:: This is a caption for the following code block

   This paragraph is indented, so it is still part of the caption

captioned code block


Subsequent paragraph, it is not part of the `<figure>`. 
You can use a wrapping `:::` (a div) to caption more than one block

chrisjsewell commented 1 year ago

> captions are a bit like a single-item definition list

@matklad what if you want a caption for multiple items? In particular, I'm thinking of how you can create subfigures in LaTeX, which is super helpful for technical writing: https://www.overleaf.com/learn/latex/How_to_Write_a_Thesis_in_LaTeX_(Part_3)%3A_Figures%2C_Subfigures_and_Tables#Subfigures

chrisjsewell commented 1 year ago

Note something somewhat adjacent to this is admonitions with titles: https://talk.commonmark.org/t/feature-request-admonitions-in-commonmark/3619, such as in https://squidfunk.github.io/mkdocs-material/reference/admonitions/#changing-the-title:

!!! note "Phasellus posuere in sem ut cursus"

    Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nulla et euismod
    nulla. Curabitur feugiat, tortor non consequat finibus, justo purus auctor
    massa, nec semper lorem quam in massa.

It would maybe be nice to cover this in a caption syntax?

matklad commented 1 year ago

> what if you want a caption for multiple items?

Group them with `:::`. To translate the LaTeX example:

:: Three simple graphs
:::

:: $`y=x`
![][graph1]

:: $`y=3sinx`
![][graph2]

:: $`y=5/x`
![][graph3]

:::

matklad commented 1 year ago

I see that we now support ^ caption syntax. Might want to document that on djot.net?

Omikhleia commented 1 year ago

> we now support ^ caption syntax

Djot generalized attributes to all elements. In the same vein, wouldn't it be interesting to consistently generalize captions to all block elements?

As in:

```
Some code that does something
```
^ Some legend for that code

Currently, captions/legends not attached to a table happen to be silently dropped.

bpj commented 1 year ago

Thinking again about this, prompted by the recent discussion in #31, I wonder if a good syntax for both short and long captions might be

```foo
Some element.
```
^ [This is the short (one-line) caption.]
^ This is the long caption.
  it may have multiple indented lines.

1.  A short caption comes before any long caption on its own line. I’m divided on whether it should be allowed to contain any line breaks in the source but lean towards allowing it.

2.  It is enclosed in `[...]` without any directly following `( [ {`, so that for example

    ^ [foo][bar] baz.

    is a long caption containing a reference link because of the `[` following the first `]` (and the text following the second `]`).

3.  It may contain verbatim which in turn may contain unbalanced square brackets, and elements with balanced square brackets as part of their syntax but unbalanced literal square brackets would have to be backslash escaped, IOW the same rules as for link texts and span content. The short caption equivalent to the example in point 2 would be

    ^ [[foo][bar] baz.]

4.  I think allowing ignorable leading and trailing whitespace inside the square brackets might be a good idea to enhance readability:

    ^ [ [foo][bar] baz. ]

5.  The long caption if any must start on the immediately following line, with its own leading ^.

Hopefully this won’t be too hard to parse. I guess the code is already keeping track of unclosed brackets in several places.
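To make the bracket rules above concrete, here is a minimal Python sketch of how a single `^` line might be classified as a short or long caption. It is only an illustration of the proposal, not how any djot implementation works: the function names `split_bracketed` and `classify_caption` are hypothetical, verbatim spans are not handled, and "nothing may follow the closing bracket" is used as a stricter stand-in for "no directly following `( [ {`".

```python
def split_bracketed(text):
    """Return (inner, rest) if text starts with a balanced [...] group, else None.

    Backslash-escaped characters are skipped, so an escaped bracket does not
    count towards nesting. Verbatim spans are NOT handled (simplifying assumption).
    """
    if not text.startswith("["):
        return None
    depth = 0
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            i += 2  # skip the backslash and the escaped character
            continue
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth == 0:
                return text[1:i], text[i + 1:]
        i += 1
    return None  # unbalanced brackets


def classify_caption(line):
    """Classify one '^ ' line as ('short', text) or ('long', text); None if not a caption."""
    if not line.startswith("^ "):
        return None
    body = line[2:].strip()
    group = split_bracketed(body)
    if group is not None:
        inner, rest = group
        # Short caption: the bracket group spans the whole line.
        if rest.strip() == "":
            return ("short", inner.strip())  # ignorable inner whitespace (point 4)
    return ("long", body)
```

With these assumptions, `^ [foo][bar] baz.` is classified as a long caption, while `^ [[foo][bar] baz.]` and `^ [ [foo][bar] baz. ]` both yield the short caption `[foo][bar] baz.`.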

Omikhleia commented 1 year ago

I don't get the short vs. long caption distinction and the need of brackets.

bpj commented 1 year ago

Short captions are for inclusion in the ToC and other places where the full caption may be too long, or where you need a single-line or otherwise different version. Pandoc’s new Figure element supports it and I think @jgm mentioned short captions here or in the other thread.

As for the brackets I was perhaps subconsciously influenced by LaTeX, where the short caption is an optional argument!😁 Certainly the short caption may be distinguished by some other means, e.g.

```foo
element
```
^^ Short caption on this line.
^ Long caption starts here
  and continues here.

Omikhleia commented 1 year ago

@bpj Thanks for the clarification. It makes sense and is an interesting take.

I do like the ^^ vs. ^ syntax proposal!

In addition, I also think we may need a provision for supporting paragraphs in (at least long) captions. Unless I'm mistaken, this is not currently possible with ^ on tables, but a (long) caption should allow them. I had such a case in a book, where the first "short" paragraph of the caption would have been fine for inclusion in the ToC, with the whole caption content appearing only in the text flow. This is an alternative way to look at the issue (i.e. a solution the writer/renderer might select if there's only a long caption made of several paragraphs).

bpj commented 1 year ago

@Omikhleia since multiline (long) captions have to be indented, it may be a bug that you can't have multiple paragraphs. Speaking with my editor hat on multiline captions is bad practice, but I think it should be allowed for the 100th case! I guess the main problem as it stands is that the HTML caption element doesn't take block content.

Omikhleia commented 1 year ago

@bpj

> Speaking with my editor hat on multiline captions is bad practice (...).

I dunno, maybe it's a "legend" (semantically part of the figure, though) rather than a true "caption"? For the sake of clarity, I mean this kind of thing:

image

(Hastily forged, based on an original I cannot disclose, and which would have been in French anyway)

Maybe it's bad practice, but the author wants that kind of long legend in many places ^^

jgm commented 1 year ago

Even simpler: if there are two consecutive captions, the first one is used as the short caption. (Or even: the shorter one is used!) Then we don't need ^^ and ^.
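A minimal Python sketch of that rule, under stated assumptions: the `Caption` type, the function name `assign_captions`, and the `prefer_shorter` flag are all hypothetical, and rejecting more than two consecutive captions is a choice made here because the proposal does not cover that case.

```python
from typing import List, NamedTuple, Optional


class Caption(NamedTuple):
    short: Optional[str]  # None when only one caption was given
    long: str


def assign_captions(captions: List[str], prefer_shorter: bool = False) -> Optional[Caption]:
    """Map consecutive caption texts to (short, long).

    Default: with two captions, the first one is the short caption.
    With prefer_shorter=True: whichever of the two is shorter is the short one.
    """
    if not captions:
        return None
    if len(captions) == 1:
        return Caption(short=None, long=captions[0])
    if len(captions) > 2:
        # Not covered by the proposal; rejecting is an assumption of this sketch.
        raise ValueError("more than two consecutive captions")
    first, second = captions
    if prefer_shorter and len(first) > len(second):
        first, second = second, first
    return Caption(short=first, long=second)
```

Either variant leaves a single caption untouched, so existing documents with one `^` line would keep their current meaning.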

On the idea of allowing captions on every block element: I don't know. Certainly, code blocks sometimes have captions. I had thought of handling this with a generic "figure" syntax that can contain code blocks, images, and other things. But an alternative would be to say that any block element can have a caption attached, and when this happens, it turns into a figure (unless it's a table, which already has a place for a caption).
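As a rough illustration of the second alternative, the following Python sketch wraps any captioned block in a `<figure>` and puts the caption inside the element only for tables. The function name `render_with_caption` and the `kind` strings are hypothetical, and the sketch assumes the table's inner HTML (rows only) is passed in without the outer `<table>` tags.

```python
from html import escape


def render_with_caption(kind: str, inner_html: str, caption: str) -> str:
    """Render an already-rendered block together with a plain-text caption.

    kind == "table": inner_html is the table's rows, and the caption goes
    into <caption> as the first child of <table>.
    Any other kind: the block is wrapped in <figure> with a <figcaption>.
    """
    cap = escape(caption)  # the caption is treated as plain text in this sketch
    if kind == "table":
        return f"<table>\n<caption>{cap}</caption>\n{inner_html}\n</table>"
    return f"<figure>\n{inner_html}\n<figcaption>{cap}</figcaption>\n</figure>"
```

A real renderer would render the caption's inline (or block) content instead of escaping it as plain text.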

Omikhleia commented 1 year ago

Whether it "turns" into a figure or is wrapped in a "generic" figure, the writer/renderer could still need a way to distinguish what was intended semantically, so that it can process each kind differently. For example, code blocks would probably be captioned "Listing N" and get an entry in a "list of listings" in a ToC, as opposed to regular figures ("Figure N", list of figures) and tables ("Table N", list of tables).
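A small Python sketch of what that distinction could buy a renderer, assuming the AST keeps the kind of the captioned block. The kind names (`"table"`, `"code_block"`, `"image"`) and the function `number_captioned` are hypothetical.

```python
from collections import Counter

# Hypothetical mapping from block kind to label; anything else is a "Figure".
LABELS = {"table": "Table", "code_block": "Listing"}


def number_captioned(blocks):
    """Given (kind, caption) pairs in document order, return numbered labels.

    Blocks whose caption is None are skipped. Each label has its own
    counter, so listings, tables and figures are numbered independently.
    """
    counters = Counter()
    entries = []
    for kind, caption in blocks:
        if caption is None:
            continue
        label = LABELS.get(kind, "Figure")
        counters[label] += 1
        entries.append(f"{label} {counters[label]}: {caption}")
    return entries
```

The same per-label grouping could feed separate "list of figures", "list of tables" and "list of listings" sections.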

bpj commented 1 year ago

> Speaking with my editor hat on multiline captions is bad practice (...).

Sorry about the confusion! I meant to write "multi-paragraph" rather than "multiline". A most unfortunate misediting because it ended up saying the opposite of what I meant: multiline captions are usually regarded as normal, multi-paragraph ones aren't (although I do think that the exceptional case should be allowed for!)