syntax-tree / mdast

Markdown Abstract Syntax Tree format
https://unifiedjs.com
1.1k stars 45 forks source link

Add `info` field to `Code` block #22

Closed arobase-che closed 6 years ago

arobase-che commented 6 years ago

Hi :)

It's just an update of the Code's description. The main modification is the info string tag.

Double check my english please. :pray:

:two_hearts:

arobase-che commented 6 years ago

:)

No reason, it's as you wish

@Hypercubed Yeah I was thinking something along those lines, a code.infoString being the whole thing, lang being the first “word”? It’s breaking, but more fixing.

So i take infoString, in first place i taked "info-string" and it's definitively ugly.

Hypercubed commented 6 years ago

I would think that since the common mark spec refers to it as the info string the property should be infoString... but it doesn't bother me either way.

arobase-che commented 6 years ago

For the language, it's "lang" so for the info string, the tag "info" doesn't seen weird.

For the request, it's done (for the attribute's name too btw)

wooorm commented 6 years ago

Yeah I think I prefer info over infoString. Feels a bit weird to add the string prefix, whether commonmark does it or not!

wooorm commented 6 years ago

Say we have this:

```js no-highlight
fn()

And we have the node for that code:

```js
{
  type: 'code',
  info: 'js no-highlight',
  lang: 'js',
  value: 'fn()'
}

...what if someone changed node.lang = 'c' or something else?

arobase-che commented 6 years ago

I think that info should be the whole string. The main reason is that the definition of info string from GMF and CommonMark is the whole string. An other is that it's common to specify the language as the first word but it's not specified.

Spec :

The first word of the info string is typically used to specify the language

But as you notify, it can lead to inconsistent states.

PS: Oups ! I miss clicked on close and comment ... >_<"

wooorm commented 6 years ago

I’d rather not specify something that could create inconsistencies, and IMO it makes sense to see:

~~~ ruby startline=3 $%@#$
~~~

to

{
  type: 'code',
  lang: 'ruby',
  info: 'startline=3 $%@#$'
}

...note, btw, I don’t think commonmark specifies what a “word” is in:

The first word of the info string is typically used to specify the language of the code sample, and rendered in the class attribute of the code tag. However, this spec does not mandate any particular treatment of the info string.

arobase-che commented 6 years ago

:( Ok

So it's more complicated. Now it need specifications :cry:

So ... here we go !

~~~js info

`lang='js' info='info'`

~~~~md
~~~js      info    
Trailing spaces before and after the info string are skipped

`lang='js' info='info'`

~~~~md
~~~js
No info string is null


`lang='js' info=null`

Is it ok like that ?

Edit: Easy spec, i don't want it to be too complicated. Can't handle complicated things.
wooorm commented 6 years ago

Yeah I think this is fine! I don’t believe we should care about spacing between lang and info. I believe those could be stripped away. We may need to work out what a “word” means though. What do you think?

@ChristianMurphy Any thoughts?

ChristianMurphy commented 6 years ago
~~~js      info    
Trailing spaces before and after the info string are skipped
~~~

So it would trim external space but not internal? E.G.

~~~ js    something    that  has      random  spacing
~~~

would parse to

{
  "info": "something    that  has      random  spacing"
}

rather than

{
  "info": "something that has random spacing"
}

?

lang='js' info=null Is it ok like that?

Sounds good

arobase-che commented 6 years ago

Actually here, we don't really care about words but the most important is that it can handle programming language names.

That to Wikipedia, programming languages list.

In the code i defined it by every char until a white space (\s) or a curly bracket (but we may add every kind of bracket).

Yeap, it will trim extrernal but not internal spaces. Or maybe no trim at all ? The choice is yours.

wooorm commented 6 years ago

@ChristianMurphy Yup, that’s what I’m thinking of! @arobase-che I feel like white-space should indeed be the border!

arobase-che commented 6 years ago

@wooorm ? Border ? An example please ? Only white-space you mean ?

Triming internal spaces can be tricky.

~~~js value="   i    want   it    spaced    "
wooorm commented 6 years ago

@arobase-che I mean the “border” between lang and info. And I believe internal white-space should be kept as is. Just the “border” should be removed.

arobase-che commented 6 years ago

Ok so :

~~~ golang info

`lang='golang' info='info'`

But : 

`lang='go{info}' info=null`

It become a problem with that : 

 :(

Sorry if I don't understand you. We keep all characters until the first white-space (white-space excluded, so the white-space is the border no ?). The rest of the text is the info attributes and we trim it out (so the border is removed ?)

I believe too that internal white-space should be kept.
ChristianMurphy commented 6 years ago
~~~ go( this is a comment )
~~~

Which spec supports comments in the info string? Commonmark makes no mention of it. https://spec.commonmark.org/0.28/#example-112 Does one of the other standards support it?

wooorm commented 6 years ago

Yes, I believe that:

```js something


Should turn into: `"js", "something"`, `"js", "something"`, `"go(", "something)"`, `"go(", "s·om····ethi···ng)"`
arobase-che commented 6 years ago

@ChristianMurphy None i think. It's about the user's expected result i think. Some may want to use Kramdown Attribute lists.

No language has brackets in its name. So why make a error consciously ?

I will code it in any case but ... :s

ChristianMurphy commented 6 years ago

None i think. It's about the user's expected result i think

Maybe it could be an option or a plugin that could be enabled. As a user of remark myself, I would find output that doesn't match commonmark unexpected.

arobase-che commented 6 years ago

Ok, makes sense. Even if i think that CommonMark is wrong.

See you soon people ^^

arobase-che commented 6 years ago

I updated the description and the PR on remark. ^^

But travis still buging :cry: remarkjs/remark#345

ChristianMurphy commented 6 years ago

It probably started during the minor Travis CI outage. Restarted bugging travis instance, it passed this time.

arobase-che commented 6 years ago

\o/

Thank you :smile: