asciidoctor / asciidoctor-extensions-lab

A lab for testing and demonstrating Asciidoctor extensions. Please do not use this code in production. If you want to use one of these extensions in your application, create a new project, import the code, and distribute it as a RubyGem. You can then request to make it a top-level project under the Asciidoctor organization.

Create lint extension #6

Closed ggrossetie closed 8 months ago

ggrossetie commented 9 years ago

That way writers could ensure that their documents respect conventions and best practices. @mojavelinux I'm sure you already have a list of best practices? :smile:

Then it would be great to let users configure rules with a simple config file.

mojavelinux commented 9 years ago

We are absolutely on the same page about this. This topic has come up many times in the past, to which I've replied, "yes, we should definitely have a validator extension that does this". I'm so glad you filed the issue, because I keep forgetting to do it.

We're working on bringing over a set of best practices from documents we created for the authors of NFJS, the Magazine. Currently, that content lives here: https://github.com/nofluffjuststuff/nfjsmag-docs/blob/master/author-writing-style-and-syntax-guide.adoc

ggrossetie commented 9 years ago

I'm so glad you filed the issue, because I keep forgetting to do it.

Anytime! If you need some more work, I can think of many more things!! Just kidding, I will gladly help :wink: I missed the extension API part in the 0.1.4 release, but I see a lot of possibilities and it's a pleasure to prototype things...

sanmai-NL commented 8 years ago

I'm quite interested in this as well. A range of conventions could be checked, and the useful checks range from simple to technically difficult. E.g. checking whether a document is rendered with certain attributes set would be less difficult than finding incorrectly marked-up hyperlinks, and finding unavailable hyperlinks would be easier than finding suboptimal heading structure.

Since documentation is normally part of a software release engineering process and should meet the concomitant QA criteria, it is natural to run documentation linters automatically alongside code linters.

sanmai-NL commented 8 years ago

@mojavelinux: what would be the way to go about programming such a linter? Would it best use an extension point?

mojavelinux commented 8 years ago

@sanmai-NL Very likely a hybrid approach will be needed. To check the content, such as broken links, I might parse the document to the AST and examine it from there (a treeprocessor). Since inline parsing occurs during conversion, it might be necessary to look at the final output (a postprocessor). To check syntax, we're likely going to need to look at the raw text (a preprocessor). However, the preprocessor is tricky since it isn't aware of the document structure and lacks a lot of context.

So it really depends on what you want to focus on. My recommendation is to set some goals and then to try to find ways to accomplish those goals. Little by little, we'll have a linter framework.

mojavelinux commented 5 years ago

I've started work on a plugin for textlint so we can leverage the validation ecosystem already available. It just requires creating a new parser for AsciiDoc that can be hooked into that framework. You can find an early prototype here: https://github.com/opendevise/textlint-plugin-asciidoc This will converge with the work @Mogztter has been doing on a new inline parser for AsciiDoc at https://github.com/Mogztter/asciidoctor-inline-parser. It's still early days, but things are starting to move.

mojavelinux commented 5 years ago

As I have stated elsewhere, after doing some analysis, I came to realize that it's not feasible or even sensible for the parser that handles conversion to also provide a framework for validation. These are very different concerns and should be handled with discrete tools. The reason is, a validator needs to know stuff about the document that the converter doesn't care about. And it needs to seek for things that the converter doesn't need or want to look for. So the validator would make the converter do a lot of extra work just for the purpose of validation. And the opposite is true as well. It's better that these tools evolve independently.

I'll give one example to show how I came to this conclusion. Consider the case of a missing blank line between a paragraph and a list. The converter will happily treat this as a single paragraph. But to the writer, this is an error: the list is not being recognized. Yet technically, nothing is wrong with the document; maybe that is what the author intended. So a validator can use a rule to look for this type of syntax blunder, understand the intention, and suggest that a blank line is missing. That's just not something the converter should ever worry about. The converter is going to assume the document is correct and should be converted as is. If these tools are allowed to evolve independently, we can do a lot of these types of (optional) checks.
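To make that missing-blank-line case concrete, here is a raw-text sketch of such a validator rule. The `missing_blank_line_warnings` helper and its regex are illustrative, not an Asciidoctor API: it flags a line that looks like a list item but directly follows a non-blank, non-list line, which the converter would silently fold into the preceding paragraph.

```ruby
# Hypothetical raw-text lint rule: a line matching a list-item marker
# (*, -, or . bullets, or a numbered item like '1.') that immediately
# follows a non-blank, non-list line is probably a list the author
# forgot to separate with a blank line.
LIST_ITEM = /^\s*(?:[*\-.]+|\d+\.)\s+\S/

def missing_blank_line_warnings lines
  warnings = []
  lines.each_with_index do |line, idx|
    next unless line =~ LIST_ITEM
    prev = idx.zero? ? '' : lines[idx - 1]
    next if prev.strip.empty? || prev =~ LIST_ITEM
    warnings << (idx + 1) # 1-based line number of the suspect list item
  end
  warnings
end

# The first item of this list is glued to the paragraph, so the rule
# reports line 2; a correctly separated list produces no warnings.
lines = ['A paragraph.', '* item one', '* item two']
puts missing_blank_line_warnings(lines).inspect
```

Because this rule only needs the raw lines, it could run as a preprocessor-style pass without parsing at all, which is exactly why a linter wants information the converter never collects.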

sanmai-NL commented 5 years ago

Checking can be done on the concrete syntax tree, whereas you are talking about the abstract syntax tree. Is there a grammar for Asciidoc(tor)? The linting tool can be implemented separately. Also, a more general linting tool could be made to operate on the HTML representation of technical documents.

mojavelinux commented 5 years ago

Checking can be done on the concrete syntax tree, whereas you are talking about the abstract syntax tree.

Because that's where I think it needs to be checked to be any good.

The linting tool can be implemented separately.

Exactly what I'm saying. But it will still need a parser implementation, which is what the textlint extension is for.

Is there a grammar for Asciidoc(tor)?

Not yet. The effort is picking up momentum though.

a more general linting tool could be made to operate on the HTML representation of technical documents.

Such tools already exist. See vale for one example. The problem I have with that approach is that there's no visibility into where the problem stems from in the AsciiDoc source, so it doesn't scale to large documentation sets.

silopolis commented 1 year ago

A tool like markdownlint would be awesome!

mojavelinux commented 8 months ago

A lint extension is not feasible. It needs to be a separate parser. The Asciidoctor parser is focused on parsing a valid AsciiDoc document and therefore doesn't capture all the information needed for a linter. I have linked to prototypes above. We could focus on developing those further, though they have a dependency on the AsciiDoc Language project producing a formal grammar on which that parsing can be based...and that work is still in progress.