burodepeper / language-markdown

Add support for Markdown to Atom (including Github flavored, Markdown Extra, CriticMark, YAML/TOML front-matter, and R Markdown), and smart behavior to lists.
https://atom.io/packages/language-markdown
MIT License
119 stars 296 forks

Setext headings not supported #192

Open ferenczy opened 7 years ago

ferenczy commented 7 years ago

Hi guys,

according to the README, this package should support the CommonMark specification, but I found that it doesn't recognize setext headings. According to the Developer Tools, such a heading is enclosed in a span with only the classes syntax--text and syntax--md; there is no syntax--heading class.

Would it be possible to implement it, please?

It only needs to support H1 and H2 headings, since those are the only levels setext syntax defines:

This is H1 heading
==================

This is H2 heading
------------------

GitHub also renders it properly:

This is H1 heading

This is H2 heading

Thank you.

ferenczy commented 7 years ago

OK, I just found in the FAQ that setext headers are not supported. Still, would it be possible to implement them, please, or is there a technical reason against it?

burodepeper commented 7 years ago

Ah good, you found it ;)

I'm afraid the way Atom currently parses grammars doesn't allow setext headers to be detected. There might be a possibility for a hacky workaround (or a custom package), but it's not feasible for me to invest the time needed for that.

If you're (a little) familiar with JavaScript and regular expressions, and don't mind getting your hands dirty, I'd be happy to point you in the right direction so you can give it a go yourself.

ferenczy commented 7 years ago

Thank you for your message, @burodepeper. Sure, I'm definitely interested in trying to do it myself, so it would be nice if you could give me some details, please.

burodepeper commented 7 years ago

I hope the following is enough to get you started.

You can add the patterns for matching setext headings to https://github.com/burodepeper/language-markdown/blob/master/grammars/repositories/blocks/headings.cson

Every time you update the grammar, you'll have to compile it and reload Atom to have the changes take effect. You'll also have to work in dev-mode. You can find details about those things here: https://github.com/burodepeper/language-markdown/blob/master/CONTRIBUTING.md

Use the following page as a reference for the regexp engine: https://manual.macromates.com/en/regular_expressions

Atom's implementation (or library or whatever) of the regexp engine has an important feature disabled: you can't match a newline character (or something along those lines). This is where you'll run into difficulties. You have two ways of matching a pattern: 1) a simple pattern with match, or 2) a pattern with a begin and an optional (!) end. If you browse through the grammar files, you'll see what kinds of things are matched with each of the two options.

The problem that you'll have to solve is matching a line of text, immediately followed by a line of dashes (or another allowed separator). You'd normally do that by searching for a pattern like (text) (newline character) (separator), and that is where the problem lies: you can't match that required newline character.
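The limitation described above can be illustrated outside Atom. The following is a minimal Python sketch, not Atom's actual tokenizer: it assumes that the grammar engine effectively sees one line at a time, and the name `SETEXT` and the sample document are made up for the illustration. The same (text)(newline)(separator) pattern finds both headings when it is run over the whole document, and finds nothing when it is run line by line.

```python
import re

# Hypothetical pattern for illustration: a line of text, a newline,
# then an underline consisting only of '=' or only of '-'.
SETEXT = re.compile(
    r"^(?P<text>[^\n]+)\n(?P<underline>=+|-+)[ \t]*$",
    re.MULTILINE,
)

document = (
    "This is H1 heading\n"
    "==================\n"
    "\n"
    "This is H2 heading\n"
    "------------------\n"
)

# Run over the whole document: the newline between the text and the
# underline is visible to the pattern, so both headings are found.
whole_document_matches = [m.group("text") for m in SETEXT.finditer(document)]

# Run over one line at a time (the assumed situation inside a grammar):
# no single line contains a newline, so the pattern can never match.
line_by_line_matches = [
    line for line in document.splitlines() if SETEXT.search(line)
]

print(whole_document_matches)
print(line_by_line_matches)
```

The pattern is deliberately simplified and does not follow every CommonMark rule for setext headings; it only shows why the required newline is the sticking point.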

The problem with using begin and end is that end is optional, and a valid begin pattern for a setext heading matches basically every line of normal text in a Markdown document, so the rule would open on nearly every paragraph.

So, that's about the scope and context of what you're dealing with. Maybe there's a programmatic way of doing it. Maybe they've enabled some new flag in the regexp engine since I last looked into it; there has been work on making the end non-optional, but I believe that effort was never finished. In any case, good luck!
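As a rough idea of what a "programmatic way" could look like, here is a minimal Python sketch that scans the lines of a document with one line of lookahead instead of relying on a single regular expression. It is only an illustration under stated assumptions: `find_setext_headings` and `UNDERLINE` are hypothetical names, the CommonMark rules are heavily simplified (no multi-line heading text, no awareness of lists, block quotes, or code blocks), and how to feed the result back into Atom's highlighting is not covered.

```python
import re
from typing import List, Tuple

# Hypothetical helper pattern: an underline made only of '=' or only of '-',
# indented by at most three spaces, with optional trailing whitespace.
UNDERLINE = re.compile(r"^ {0,3}(=+|-+)[ \t]*$")


def find_setext_headings(lines: List[str]) -> List[Tuple[int, int, str]]:
    """Return (line_index, level, text) for each text line that is
    immediately followed by a setext underline."""
    headings = []
    for index in range(len(lines) - 1):
        text, candidate = lines[index], lines[index + 1]
        if not text.strip():
            continue  # a blank line cannot be heading text
        if UNDERLINE.match(text):
            continue  # the text line itself looks like an underline
        match = UNDERLINE.match(candidate)
        if match:
            # '=' underlines mark H1, '-' underlines mark H2.
            level = 1 if match.group(1).startswith("=") else 2
            headings.append((index, level, text.strip()))
    return headings


sample = [
    "This is H1 heading",
    "==================",
    "",
    "This is H2 heading",
    "------------------",
    "plain paragraph",
]
found = find_setext_headings(sample)
print(found)
```

Looking at pairs of lines sidesteps the newline problem entirely, because the decision is made in code rather than inside a single pattern; the cost is that it lives outside the grammar file and would need its own integration.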