gettalong / kramdown

kramdown is a fast, pure Ruby Markdown superset converter, using a strict syntax definition and supporting several common extensions.
http://kramdown.gettalong.org

figure element #48

Closed by iNerdier 5 years ago

iNerdier commented 11 years ago

I’ve been writing with kramdown's Markdown parser lately and I’ve come across two problems that seem to be unsolvable in the current release.

One is that there seems to be no way to stop kramdown from wrapping image links in <p> tags. This is often fine, but when paired with something like

p + p {
    text-indent: 1em;
}

the results are less than happy: images inherit unwanted text styling and go all over the shop.

Markdown is smart enough not to add <p> tags to existing block-level elements; could the same method be used to allow specifying stand-alone images?

If this is unfeasible or undesirable, could it be allowed via an option (say :image_block => true)?

The second problem is attempting to use HTML5's new <figure> and <figcaption> elements. According to Can I Use, all major browsers now support them, and they’re certainly a great help in writing articles with images that need citations or have direct references to text.

Searching Google to see if anyone has attempted it, I found https://github.com/michelf/php-markdown/wiki/HTML5-update which seems to be a pretty useful starting point for working out a syntax for them.

Would either of these problems be something we might be able to do something about?

gettalong commented 11 years ago

You can already avoid paragraph tags for images like this:

this is 

{::nomarkdown}
<img test="test" />
{:/}

this is

results in

<p>this is </p>

<img test="test" />

<p>this is</p>

And you can already use <figure> and <figcaption>:

<figure markdown="1">
<figcaption>
test
</figcaption>
![test](img.jpg)
</figure>

gives you

<figure>
  <figcaption>
test
</figcaption>
  <p><img src="img.jpg" alt="test" /></p>
</figure>

I currently have no intention of providing extra syntax for HTML5-specific tags like <figure>, <article>, ...

AvverbioPronome commented 7 years ago

I'm probably late to this, but I like a lot how Pandoc handles this: if an image is in a paragraph on its own, it becomes a figure, and its alt-text is used as caption.
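
The rule described here (a paragraph that holds nothing but an image becomes a figure) can be sketched in a few lines of Ruby. This is only an illustration: `standalone_image?` and the regular expression are hypothetical, not part of kramdown or Pandoc, and the pattern deliberately ignores titles, reference-style images and attribute lists.

```ruby
# Hypothetical helper for illustration only; neither kramdown nor Pandoc
# exposes this method. Matches a block that is exactly one inline image
# of the simple form ![alt](src).
STANDALONE_IMAGE = /\A!\[(?<alt>[^\]]*)\]\((?<src>[^)\s]+)\)\z/

def standalone_image?(paragraph)
  STANDALONE_IMAGE.match?(paragraph.strip)
end

puts standalone_image?("![A cat](cat.jpg)")                  # true
puts standalone_image?("Look at ![A cat](cat.jpg) please.")  # false
```

A real implementation would work on the parsed element tree rather than on the raw text, which is what makes the "only child of a paragraph" check reliable.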

RoyiAvital commented 6 years ago

Any chance of having syntax for figures?

Something like using the alt-text as caption, as suggested by @9peppe?

Thank You.

gettalong commented 6 years ago

I can imagine providing a new option for changing the behaviour of how paragraphs with only an image are converted, similar to how the LaTeX converter currently does it - see https://github.com/gettalong/kramdown/blob/master/lib/kramdown/converter/latex.rb#L71

@9peppe / @RoyiAvital What should the output for the following kramdown fragment be if the option to convert standalone images is activated?

This is a para with an ![image](some.jpg).

![standalone image](some.jpg)

Note that the alternative text of the image cannot contain markup, it is just text.

AvverbioPronome commented 6 years ago

I'd say

<p>This is a para with an <img alt="image" src="some.jpg" /></p>

<figure id="someid">
  <img alt="standalone image" src="some.jpg" />
  <figcaption>standalone image</figcaption>
</figure>

RoyiAvital commented 6 years ago

3 Remarks (Though I'm not sure they are doable):

  1. Could we have a flag of some kind to choose whether to use a figure or a regular image element?
    Maybe something like ![standalone image figure="true"](some.jpg).
  2. Currently kramdown allows adding HTML attributes using ![standalone image](some.jpg){:class="center-img"}. This should be kept.
  3. Pandoc has a reference system using \tag. Is there an option for doing so here as well?

gettalong commented 6 years ago

@9peppe If the image had an ID assigned, should it then be on the figure tag or on the img tag?

@RoyiAvital

ad 1) No. If the new option to process a paragraph with a single item being an image is activated, the output is changed. The only way this would be doable would be via the option value itself.

ad 2) The parser is not touched, only the output is changed.

ad 3) I don't know what you mean. If you set an id on the image, you can use that id later.

AvverbioPronome commented 6 years ago

I don't know. I think it will depend on individual use cases. But if we apply it to figure, then we can use css selectors like #id img.

gettalong commented 6 years ago

That's what I was thinking, i.e. class and id attributes are transferred to the figure tag, all others stay with the image.
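
As a rough sketch of that split, assuming attributes arrive as a plain Hash with string keys: `split_image_attributes` and `FIGURE_ATTRIBUTES` are hypothetical names used only for illustration, and kramdown's actual converter may organise this differently.

```ruby
# Hypothetical helper illustrating the proposed split; not kramdown's API.
# Attributes named here move to the <figure>, everything else stays on <img>.
FIGURE_ATTRIBUTES = %w[id class].freeze

def split_image_attributes(attrs)
  figure_attrs, img_attrs = attrs.partition { |name, _value| FIGURE_ATTRIBUTES.include?(name) }
  [figure_attrs.to_h, img_attrs.to_h]
end

figure_attrs, img_attrs = split_image_attributes(
  'id' => 'fig1', 'class' => 'center-img', 'src' => 'some.jpg', 'alt' => 'standalone image'
)
p figure_attrs # holds id and class
p img_attrs    # holds src and alt
```

With the attributes separated like this, a selector such as #fig1 img still reaches the image through the figure.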

RoyiAvital commented 6 years ago

@gettalong , First thank you for the answer and the openness. Could you explain:

ad 1) No. If the new option to process a paragraph with a single item being an image is activated, the output is changed. The only way this would be doable would be via the option value itself.

Thank You.

gettalong commented 6 years ago

@RoyiAvital I'm against changing the kramdown parser. Therefore the way to specify how a paragraph with only an image should be rendered is via the new option that changes how the converter works. E.g. if the option's value is "figure", then a figure tag is rendered; if it is "image", then just the image without enclosing paragraph tags is rendered.
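
A minimal sketch of such a converter-side switch, under stated assumptions: `render_image_paragraph` and its `mode` parameter are hypothetical stand-ins for the proposed option (which had no final name at this point in the thread), the "paragraph" default mirrors the existing behaviour, and `src` and `alt` are assumed to be HTML-escaped already.

```ruby
# Hypothetical sketch of the proposed option; not kramdown's converter code.
# Assumes src and alt are already HTML-escaped.
def render_image_paragraph(src, alt, mode = 'paragraph')
  img = %(<img alt="#{alt}" src="#{src}" />)
  case mode
  when 'figure'
    # Wrap the image in a figure and reuse the alt text as the caption.
    "<figure>\n  #{img}\n  <figcaption>#{alt}</figcaption>\n</figure>"
  when 'image'
    # Bare image, no enclosing paragraph tags.
    img
  else
    # Existing behaviour: the image stays inside a paragraph.
    "<p>#{img}</p>"
  end
end

puts render_image_paragraph('some.jpg', 'standalone image', 'figure')
puts render_image_paragraph('some.jpg', 'standalone image', 'image')
puts render_image_paragraph('some.jpg', 'standalone image')
```

Because only the output step changes, the same parsed document can be rendered in any of the three ways without touching the parser.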

RoyiAvital commented 6 years ago

I see. How could someone who uses GitHub Pages control this?

gettalong commented 6 years ago

I don't really know GitHub Pages nor use it, but since it is based on Jekyll I guess it provides the ability to define kramdown options, and through this facility one could also set the new option.

mb21 commented 6 years ago

In pandoc, when its implicit figure extension is enabled, you can append a newline to mark individual images as non-figures.

About where to put the class and id, there's an issue open about that.

gettalong commented 6 years ago

So, to sum up:

![standalone image](some.jpg){:#id .class}

will be rendered into

<figure id="id" class="class">
  <img alt="standalone image" src="some.jpg" />
  <figcaption>standalone image</figcaption>
</figure>

iff the new option 'standalone_image' is true. If it is not (the default value for the option), then the current behaviour doesn't change.

@9peppe @RoyiAvital Any comments?

RoyiAvital commented 6 years ago

@gettalong , It looks great!

2 edge cases I thought about. What happens when:

![standalone image](some.jpg)

Could we have:

<figure>
  <img alt="standalone image" src="some.jpg" />
  <figcaption>standalone image</figcaption>
</figure>

Or even further, for:

![standalone image](some.jpg){:#id}

Having:

<figure id="id">
  <img alt="standalone image" src="some.jpg" />
  <figcaption>standalone image</figcaption>
</figure>

RoyiAvital commented 6 years ago

@gettalong , Any update on that?

gettalong commented 6 years ago

I'm rather busy right now, so this change together with a new release will probably be in August.

gettalong commented 5 years ago

@RoyiAvital I have implemented this now with a slight change of functionality: If you want to have a standalone image, use the special standalone IAL reference on the image, i.e.

![standalone image](some.jpg){:standalone}

Faegy commented 5 years ago

@gettalong Any commit to reference in order to track this feature's availability?

gettalong commented 5 years ago

Will be pushed shortly!

AvverbioPronome commented 5 years ago

use the special standalone IAL reference on the image, i.e.

I don't understand this. Why are we introducing a behavior not seen in any markdown parser? (a big part of my initial request had to do with using the same source with multiple parsers, even if I did not make that clear)

gettalong commented 5 years ago

To make it possible to have both worlds. This new functionality is backwards compatible, so no problem. And if you use another library that supports kramdown, it will also just work, using the default output of that library.

AvverbioPronome commented 5 years ago

Yeah, but it does not make sense. img is an inline element, not a block-level one. A standalone img should be treated as a block-level element, so I think it should be in either p or figure, depending on the chosen doctype. We will never have both worlds in the same document.

gettalong commented 5 years ago

Actually, <img /> is an inline and a block tag. So in Markdown one has to choose and in case of kramdown, <img /> is treated as an inline tag. This means that internally an image is always wrapped inside a paragraph.

Now, with these changes one can choose how a standalone image is shown: either as a paragraph with an image (default) or as a figure element. This has nothing to do with a doctype.

AvverbioPronome commented 5 years ago

Actually, img was an inline element, now it's replaced content and shown inline by default. But I don't want to fight over this.

Doctype matters because there is no figure element before html5. So the best option would imho be <p><img /></p> if the parser knows it's producing html4 (and by default, too, it's ok), but <figure><img /></figure> if the parser knows it's producing html5. Better?

gettalong commented 5 years ago

The doctype is not known by kramdown. It just outputs valid HTML. With the current implementation, the user can choose how to display standalone images regardless of the doctype, on a case-by-case basis.

mb21 commented 5 years ago

So either you use the global standalone_image option, or you can use {:standalone} on an image-by-image basis, correct? If so, that seems sensible to me (although I would be mostly interested in the global option, and I might have called it something like figure).

gettalong commented 5 years ago

There is no global standalone_image option, currently. As I mentioned before I decided to go with the IAL reference. If a global option is still needed, this would have to wait for another release.

pmpinto commented 5 years ago

@RoyiAvital I have implemented this now with a slight change of functionality: If you want to have a standalone image, use the special standalone IAL reference on the image, ie.

![standalone image](some.jpg){:standalone}

@gettalong Was this ever released? I'm on v1.14 and I'm still seeing the img inside a p with the following:

![placeholder image](https://via.placeholder.com/1920x1080){:standalone}

gettalong commented 5 years ago

@pmpinto See https://kramdown.gettalong.org/converter/html.html#standalone-images and the release notes for 2.0.0

pmpinto commented 5 years ago

@pmpinto See https://kramdown.gettalong.org/converter/html.html#standalone-images and the release notes for 2.0.0

For some reason I was on Jekyll 3.8.6. 4.0.0 comes with kramdown 2.1.0 already. Thanks!
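
A quick way to rule out this kind of version mismatch is to compare version strings with RubyGems' own comparison class. The sketch below assumes, based on the release notes pointed to above, that 2.0.0 is the first kramdown release that understands {:standalone}; `supports_standalone?` and the constant are hypothetical names.

```ruby
require 'rubygems' # normally loaded by default; Gem::Version lives here

# Assumption: {:standalone} first shipped in kramdown 2.0.0, as the
# release notes referenced in this thread suggest.
STANDALONE_MIN_VERSION = Gem::Version.new('2.0.0')

# Hypothetical helper, not part of kramdown.
def supports_standalone?(kramdown_version)
  Gem::Version.new(kramdown_version) >= STANDALONE_MIN_VERSION
end

puts supports_standalone?('1.14.0') # false
puts supports_standalone?('1.17.0') # false
puts supports_standalone?('2.1.0')  # true
```

In a project that has kramdown installed, the running version should be available as Kramdown::VERSION after require 'kramdown', and bundle info kramdown shows which version a Jekyll site resolves to.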

RoyiAvital commented 5 years ago

@pmpinto , I think you can also do something similar using the include trick in Jekyll.

This is a temporary solution.

For instance, GitHub Pages still uses Jekyll 3.8.5 and kramdown 1.17.

codingthat commented 4 years ago

A more flexible solution would be better, in my opinion. When alt and figcaption are equal, this means people who actually need the accessibility feature of alt have to pay the penalty of sifting through (or hear, as the case may be) the exact same text twice. As far as I'm aware, alt should describe the picture in detail for those who cannot see it, whereas a caption is more of an additional commentary for everyone, whether they can see it or not—for example, it might cite the source the image was borrowed from. So they should be separately settable.

AvverbioPronome commented 4 years ago

When alt and figcaption are equal, this means people who actually need the accessibility feature of alt have to pay the penalty of sifting through (or hear, as the case may be) the exact same text twice.

True, but flexibility is not the point of markdown, human-readability is.

I guess the way to address your concern would be to just leave alt empty when figcaption is identical. (nb: 'empty' doesn't mean "remove the attribute," that would make invalid html).

For flexibility, there's always raw html :/
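
The idea of emptying alt when it merely repeats the caption can be sketched as follows. `figure_with_caption` is a hypothetical helper, not something kramdown provides, and it assumes all three strings are already HTML-escaped; the attribute is kept but left empty, as noted above.

```ruby
# Hypothetical helper, not kramdown's behaviour. Assumes src, alt and
# caption are already HTML-escaped.
def figure_with_caption(src, alt, caption)
  # Keep the alt attribute but leave it empty when it would only repeat
  # the caption, so a screen reader does not announce the same text twice.
  effective_alt = alt.strip == caption.strip ? '' : alt
  <<~HTML.chomp
    <figure>
      <img alt="#{effective_alt}" src="#{src}" />
      <figcaption>#{caption}</figcaption>
    </figure>
  HTML
end

puts figure_with_caption('some.jpg', 'standalone image', 'standalone image')
puts figure_with_caption('some.jpg', 'A red bicycle against a wall', 'Photo: example archive')
```

This only removes the duplication; it does not give authors a way to set the two texts separately in kramdown syntax.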

codingthat commented 4 years ago

Well...sort of. It addresses the double-content issue. But actually both sets of content have different purposes, and this would simply eliminate the possibility of people who can't see the graphic (whether due to vision issues or a poor Internet connection) understanding what it shows. (And if you include such descriptions in captions, well, that becomes its own double-content issue. A picture's worth a thousand words sometimes, and people who can see the image won't want those specific thousand words.)

Indeed, we currently use raw HTML in this case, which certainly makes it less readable. I wonder if there could be a special attribute in this case, to allow for separate setting of alt and figcaption that still leaves it useful for the consumers of the rendered result, while maintaining Kramdown's additional human-readability.

AvverbioPronome commented 4 years ago

Also realize that alt is not longdesc (that's not implemented in any browser at all)

mrshannonyoung commented 4 years ago

Well...sort of. It addresses the double-content issue. But actually both sets of content have different purposes, and this would simply eliminate the possibility of people who can't see the graphic (whether due to vision issues or a poor Internet connection) understanding what it shows. (And if you include such descriptions in captions, well, that becomes its own double-content issue. A picture's worth a thousand words sometimes, and people who can see the image won't want those specific thousand words.)

Indeed, we currently use raw HTML in this case, which certainly makes it less readable. I wonder if there could be a special attribute in this case, to allow for separate setting of alt and figcaption that still leaves it useful for the consumers of the rendered result, while maintaining Kramdown's additional human-readability.

This is the way forward. There does need to be a clear distinction between the alternate text for the image and the figcaption if they are different, for accessibility reasons.

This is a quote from a blind user which goes against the current behaviour:

Copying the text of the figcaption into the alt attribute, or any shortened version, is almost always useless: the screen reader will read twice the same or almost the same information, and it's worth absolutely nothing https://stackoverflow.com/a/58468470/3337722

And as @AvverbioPronome said:

True, but flexibility is not the point of markdown, human-readability is.

If the point of human readability extends to its appearance on a webpage, then I think this issue needs looking at again, @gettalong.