arobase-che / remark-attr

Remark plugin to add support for custom attributes

Fenced code? #1

Closed Hypercubed closed 5 years ago

Hypercubed commented 6 years ago

Any reason this does not or cannot work with fenced code... for example:

```html{ style="..." }
<p>Hello</p>
```

arobase-che commented 6 years ago

Hi \o

I just didn't implement it. It's a new syntax, and it needs a complete parser for fenced code.

However, I should be able to implement this syntax:

```html
<p>Hello !</p>
```
{ style="..." }

That is more consistent with the syntax of the other non-inline (block) elements.

As for the first syntax, someone should discuss it with the creator of remark.

Hypercubed commented 6 years ago

Are you saying you think the fenced code syntax should be:

```html
<p>Hello !</p>
```
{ style="..." }

vs:

```html { style="..." }
<p>Hello !</p>
```

Any strong reason for one over the other?

I'm currently using a custom remark plugin to handle:

```html 'class="..."'
<p>Hello !</p>
```

I'd prefer to use remark-attr.

arobase-che commented 6 years ago

> Are you saying you think the fenced code syntax should be:

Yes, you got my point :)

> Any strong reason for one over the other?

Strong, maybe not, but there are at least some reasons.

In markdown, there are two types of elements: block and span. This plugin adds a bracketed custom attribute syntax that is placed at the end of the original syntax: on the same line for a span element, and on the next line for a block element.

# Here a title
{ type="block" }

*Please, chance it.*{ author="malcolm" type="span" }
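The placement rule can be sketched in a few lines of TypeScript. This is only an illustration of the convention described above, not remark-attr's actual implementation: the `Block` type, the `ATTR_LINE` pattern, and the `attachBlockAttrs` function are all hypothetical names, and the sketch treats every non-blank line as its own block.

```typescript
// Hypothetical sketch of the "block attributes go on the next line" rule.
// It is NOT how remark-attr is implemented; it only illustrates the convention.
interface Block {
  text: string;         // the block's own source line
  attrs: string | null; // raw attribute text taken from the following line, if any
}

// A line counts as a block-attribute line when it is nothing but `{ ... }`.
const ATTR_LINE = /^\{(.*)\}\s*$/;

function attachBlockAttrs(lines: string[]): Block[] {
  const blocks: Block[] = [];
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i];
    if (line.trim() === "") continue; // blank lines only separate blocks
    const next = i + 1 < lines.length ? ATTR_LINE.exec(lines[i + 1]) : null;
    if (next !== null) {
      blocks.push({ text: line, attrs: next[1].trim() });
      i++; // the attribute line has been consumed
    } else {
      // Span attributes stay inline on the same line, so nothing is attached here.
      blocks.push({ text: line, attrs: null });
    }
  }
  return blocks;
}
```

Fed the two example lines above, the heading would pick up `type="block"` from the line that follows it, while the emphasised span keeps its attributes inline.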

Another reason is that I don't know whether the syntax:

~~~lang { attr="val" }

is in use. I know that Pandoc uses something similar, but as far as I know it looks like this:

```{ lang="c" attr="val"}

Pandoc is also compatible with:

~~~lang

But can the two be mixed? I don't know.

And finally, there is the info string syntax:

~~~lang attr=val attr2=val

The info string is used in GitHub Flavored Markdown and CommonMark, so I think it's a common syntax. However, remark doesn't support it at the moment. Maybe I should make a PR about it.

Mixing the info string syntax with the Pandoc one would look ugly:

```lang attr=val attr2='val' { attr='otherVal' attr3=yes }

And I would prefer not to.

That's my point. I'm interested in yours. :)

Edit: Bad English :woman_facepalming:

Hypercubed commented 6 years ago

Hmm... a couple of comments:

I would prefer the attrs to come after the language, purely for readability.

~~~lang { attr="val" }

The Pandoc syntax is not very useful since it will interfere with syntax-aware editors.

It looks like the info string in GFM is pretty much only used for the language.

arobase-che commented 6 years ago

Hummm ...

OK. I'll put it on my to-do list. I can't guarantee anything, but I will try something.

Hypercubed commented 6 years ago

I'm willing and able to contribute... as long as we have a good idea of the syntax.

arobase-che commented 6 years ago

:+1:

The syntax?

~~~ lang { attr="val" }

The one you talked about.

Here '{ attr="val" }' is an info string; that's the trick. The info string can be parsed by md-attr-parser. As it's written, it will handle both 'attr="val"' and '{ attr="val" }', which is pretty cool. So info strings should be supported by remark first. I opened an issue about that: remarkjs/remark#342. It's easy to implement, but it has to be merged first.

So both syntaxes will be supported by the plugin:

~~~haskell { start_line=2 }
f x = 2*x
~~~

and

~~~haskell start_line=2
f x = 2*x
~~~

Plus fallback support for suffixed custom attributes on blocks, much like kramdown does. That is a short commit to this plugin.
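To make the "both forms" point concrete, here is a minimal sketch of a parser that accepts the braced and the bare form alike. It is an assumption-laden illustration, not md-attr-parser's real API: `parseAttrs` is a hypothetical name, and the sketch only understands `key=value` pairs with bare, single-quoted, or double-quoted values.

```typescript
// Hypothetical helper for illustration; md-attr-parser's real API may differ.
function parseAttrs(raw: string): Record<string, string> {
  let text = raw.trim();
  // Accept the braced form by dropping one pair of surrounding braces.
  if (text.charAt(0) === "{" && text.charAt(text.length - 1) === "}") {
    text = text.slice(1, -1);
  }
  const attrs: Record<string, string> = {};
  // key="double quoted" | key='single quoted' | key=bare
  const pair = /([^\s=]+)=(?:"([^"]*)"|'([^']*)'|(\S+))/g;
  let m: RegExpExecArray | null;
  while ((m = pair.exec(text)) !== null) {
    // Exactly one of the three value groups is defined for each match.
    const value = m[2] !== undefined ? m[2] : m[3] !== undefined ? m[3] : m[4];
    attrs[m[1]] = value;
  }
  return attrs;
}
```

With this sketch, `parseAttrs('{ start_line=2 }')` and `parseAttrs('start_line=2')` produce the same object, which is the property the two haskell examples rely on.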

The hardest thing will be the configuration.

What do you think?

Hypercubed commented 6 years ago

I'm really not sure what support is needed from remark itself. Currently, for fenced code like ~~~haskell { start_line=2 }, the entire string after the triple tick is interpreted as the language (so here node.lang = "haskell { start_line=2 }"). I believe we would need nothing more than to extract the attr portion from the language string (aka the info string). Am I oversimplifying this? I'm sure you have been working with remark for a lot longer than I have.
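The extraction step proposed here could look roughly like the sketch below. It assumes only what the comment states, namely that the whole info string arrives in `node.lang`; the `InfoParts` type and the `splitInfoString` function are hypothetical names introduced for illustration.

```typescript
// Hypothetical sketch: split a full info string into language and attribute text.
interface InfoParts {
  lang: string | null; // leading language word, or null when there is none
  meta: string;        // everything after the language, left unparsed
}

function splitInfoString(info: string): InfoParts {
  // Language = leading run of characters that are neither whitespace nor `{`,
  // so both `haskell { ... }` and `html{ ... }` are split at the right place.
  const m = /^([^\s{]*)\s*(.*)$/.exec(info.trim());
  if (m === null) return { lang: null, meta: "" };
  return { lang: m[1] === "" ? null : m[1], meta: m[2] };
}
```

A plugin taking this approach would then overwrite `node.lang` with the `lang` part and hand the `meta` part to an attribute parser.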

arobase-che commented 6 years ago

Haha, that's true. But it looks like a bug to me.

In fact, right now, fenced code like ~~~haskell { start_line=2 } will break other plugins like remark-highlight.js. So we can't exploit the fact that remark misparses lang. But as we can see, implementing info string support in remark will be easy.

Hypercubed commented 6 years ago

I could be wrong, but it appears to me that in this example, ~~~haskell { start_line=2 }, the info string is "haskell { start_line=2 }" and remark is placing the entire info string in the lang property (maybe poorly named in that case). So the bug is in remark-highlight.js, which doesn't correctly parse the info string.

From the GFM spec:

> The first word of the info string is typically used to specify the language of the code sample, and rendered in the class attribute of the code tag.

arobase-che commented 6 years ago

Oh !

You are definitely right! ... By the end of the weekend ... :wink:

Thanks :)

Hypercubed commented 6 years ago

Looking at the CommonMark spec and the CommonMark demo also leads me to believe that the entire string is the info string:

https://spec.commonmark.org/dingus/?text=%60%60%60this%20is%20all%20info%20string%0Asdfasdf%0A%60%60%60

Notice that the generated AST is <code_block info="this is all info string">

arobase-che commented 5 years ago

Hmm, there are now some new tests for this feature. Also, the upcoming version of remark is already supported. Only the syntax:

~~~lang { attr="val" }

is supported. Not this one:

~~~{ lang attr="val" }
~~~

Nor this one:

{lang attr="qsdqsd"}

The reason is to avoid breaking compatibility with other markdown engines that only support the lang attribute.
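The compatibility argument can be illustrated with a small sketch. It assumes a hypothetical engine that follows the "first word of the info string" convention quoted from the GFM spec earlier in the thread and ignores everything else; `langSeenByPlainEngine` is an invented name, not a function from any real engine.

```typescript
// Hypothetical model of an engine that only understands the language:
// it takes the first whitespace-delimited word of the info string.
function langSeenByPlainEngine(info: string): string {
  const words = info.trim().split(/\s+/);
  return words[0];
}

// Supported form: the language is still the first word.
const supported = langSeenByPlainEngine('lang { attr="val" }');

// Unsupported form: the first word is the opening brace, so the language is lost.
const unsupported = langSeenByPlainEngine('{ lang attr="val" }');
```

Under that assumption, `supported` is "lang" while `unsupported` is "{", which is why only the form with the language first degrades gracefully.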