IBM / zopeneditor-about

IBM Z Open Editor: File issues here!
https://ibm.github.io/zopeneditor-about
Apache License 2.0

COBOL first - Add support for fenced code block markdown syntax highlighter #428

Open FALLAI-Denis opened 1 month ago

FALLAI-Denis commented 1 month ago

Description of the enhancement requested

Need, use case: create documentation on COBOL programming using markdown files.

The syntax of the markdown language, .md files, allows you to declare fenced blocks of code, and to associate a programming language with them. In this case the block of lines is highlighted using the associated textmate grammar.

This programming language still needs to be recognized by the markdown language interpreter. For VS Code, this interpreter appears to be the natively included vscode-markdown-tm-grammar extension. Unfortunately, COBOL is not handled there (although SQL, for example, is).

I tried to apply the solution given here: How to add custom language syntax highlighter to markdown code block in VSCode? and here: vscode-fenced-code-block-grammar-injection-example

I was partially successful in achieving the desired result.

What I did, first in package.json and then in the injection grammar file cobol-markdown-injection.json:

...
        "languages": [
            {
                "id": "cobol",
                "aliases": [
                    "COBOL"
                ],
                "extensions": [
                    ".cbl",
                    ".cpy",
                    ".cob",
                    ".copy",
                    ".copybook",
                    ".cobol",
                    ".cobcopy"
                ],
                "configuration": "./cobol-textmate/cobol-language-configuration.json"
            },
            {
                "id": "cobol-markdown-injection"
            },
...
        "grammars": [
            {
                "language": "cobol",
                "scopeName": "source.cobol",
                "path": "./cobol-textmate/cobol.json"
            },
            {
                "language": "cobol-markdown-injection",
                "scopeName": "markdown.cobol.codeblock",
                "path": "./cobol-textmate/cobol-markdown-injection.json",
                "injectTo": [
                    "text.html.markdown"
                ],
                "embeddedLanguages": {
                    "meta.embedded.block.cobol": "cobol"
                }
            },
...
{
    "fileTypes": [],
    "injectionSelector": "L:text.html.markdown",
    "patterns": [
        {
            "include": "#fenced-cobol-code-block"
        }
    ],
    "repository": {
        "fenced-cobol-code-block": {
            "begin": "(^|\\G)(\\s*)(\\`{3,}|~{3,})\\s*(?i:(cobol)(\\s+[^`~]*)?$)",
            "name": "markup.fenced_code.block.markdown",
            "end": "(^|\\G)(\\2|\\s{0,3})(\\3)\\s*$",
            "beginCaptures": {
                "3": {
                    "name": "punctuation.definition.markdown"
                },
                "4": {
                    "name": "fenced_code.block.language.markdown"
                },
                "5": {
                    "name": "fenced_code.block.language.attributes.markdown"
                }
            },
            "endCaptures": {
                "3": {
                    "name": "punctuation.definition.markdown"
                }
            },
            "patterns": [
                {
                    "begin": "(^|\\G)(\\s*)(.*)",
                    "while": "(^|\\G)(?!\\s*([`~]{3,})\\s*$)",
                    "contentName": "meta.embedded.block.cobol",
                    "patterns": [
                        {
                            "include": "source.cobol"
                        }
                    ]
                }
            ]
        }
    },
    "scopeName": "markdown.cobol.codeblock"
}
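
The "begin" pattern of the injection grammar can be checked outside VS Code with a rough JavaScript equivalent. This is only a sketch: the TextMate engine uses Oniguruma syntax, so the inline `(?i:...)` group is replaced here by the `i` flag and `\G` is dropped. The names `cobolFenceBegin` and `opensCobolFence` are hypothetical and not part of the extension.

```typescript
// JavaScript approximation of the grammar's "begin" pattern.
// Assumptions: the `i` flag stands in for (?i:...), and \G is omitted
// because JavaScript regular expressions have no equivalent.
const cobolFenceBegin = /^(\s*)(`{3,}|~{3,})\s*(cobol)(\s+[^`~]*)?$/i;

// Hypothetical helper: true when a line opens a COBOL fenced code block.
function opensCobolFence(line: string): boolean {
    return cobolFenceBegin.test(line);
}

// Build the backtick fence programmatically to keep this sample readable.
const backtickFence = "`".repeat(3);

console.log(opensCobolFence(backtickFence + "cobol")); // true
console.log(opensCobolFence("~~~ COBOL linenos"));     // true (case-insensitive, attributes allowed)
console.log(opensCobolFence(backtickFence + "sql"));   // false
```

Only the language name decides whether the block is claimed by this grammar; anything after it on the fence line is captured as attributes.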

Information returned by the tool Inspect Editor Tokens and Scopes:

Same for a SQL code block, whose highlighting is native to VS Code:

Something is still missing to render the syntax highlighting in preview mode, which remains the goal.

Thanks.

phaumer commented 1 month ago

Very nice idea. Thanks.

FALLAI-Denis commented 1 month ago

Thanks @phaumer,

Regarding syntax highlighting in preview mode, according to my research it is managed by the highlight.js component. This component natively supports a number of languages, but not COBOL.

There are highlight.js plugins for COBOL:

A new language can be registered for highlight.js by a call to its registerLanguage() function.

highlight.js is instantiated locally by VS Code's internal markdown-language-features extension (markdownEngine.ts file, getMarkdownOptions function), so there is no way to register a new language at that point.

To add a new language to the Markdown preview, an extension has to contribute to the markdown.markdownItPlugins extension point and, from its activate function, return an object whose extendMarkdownIt function returns the updated Markdown-it instance, in principle one that uses a Markdown-it plugin dedicated to this language. No such Markdown-it plugin exists for COBOL.

Instead, I overrode the md.options.highlight function: if the requested language is COBOL, it calls the highlight function of a highlight.js instance internal to the extension; otherwise it calls the md.options.highlight function that was originally declared. In the extension I load only highlight.js/lib/core, which avoids loading all the default languages, and register only the COBOL plugin (which will have to be replaced by a more compatible one). Even so, the highlight.js/lib/core import seems to pull in a fair amount of unnecessary code, but the size remains reasonable. I also had to declare a local function to stand in for md.options.highlight when it is null, which in principle should not happen.

Which gives:

extension.ts:

import * as vscode from 'vscode';
import type MarkdownIt from 'markdown-it';
import hljs from 'highlight.js/lib/core';
import hljsCOBOL from 'highlightjs-enterprisecobol';
hljs.registerLanguage('cobol', hljsCOBOL);

export function activate(context: vscode.ExtensionContext) {
    //console.log('Congratulations, your extension "markdown-cobol" is now active!');
    return {
        extendMarkdownIt(md: MarkdownIt) {
            const originalHighlight = md.options.highlight || noOriginalHighlight;
            md.options.highlight = (str: string, lang: string, attrs: string): string => {
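                if (lang && lang.match(/\bcobol\b/i)) {
                    // Use the registered name rather than lang, which may differ (e.g. 'x-cobol' also passes the test above)
                    return hljs.highlight(str, {language: 'cobol', ignoreIllegals: true}).value;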
                }
                return originalHighlight(str, lang, attrs);
            };
            return md;
        }
    };
};

// Should never be called... required for compilation purposes because md.options.highlight is declared optional
function noOriginalHighlight(str: string, lang: string, attrs: string): string {
    // Escape '&' first so the entities produced below are not escaped a second time
    const preProcess = (str: string) =>
        str.replace(/&/g, '&amp;')
           .replace(/</g, '&lt;')
           .replace(/>/g, '&gt;');
    return `<pre>${preProcess(str)}</pre>`;
}

export function deactivate() { }
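
The override inside extendMarkdownIt does not itself depend on VS Code or highlight.js, so the routing can be tried in isolation. Below is a minimal sketch of that idea, not the extension's actual code: `withCobolHighlight`, `fakeCobol` and `fakeOriginal` are hypothetical names, and the two fake highlighters merely stand in for hljs.highlight(...).value and the original md.options.highlight.

```typescript
// Shape of markdown-it's `highlight` option, restated locally so the sketch has no dependencies.
type Highlight = (str: string, lang: string, attrs: string) => string;

// Hypothetical helper: wraps an existing highlighter and routes only COBOL blocks to `cobolHighlight`.
function withCobolHighlight(original: Highlight, cobolHighlight: (code: string) => string): Highlight {
    return (str, lang, attrs) =>
        lang && /\bcobol\b/i.test(lang) ? cobolHighlight(str) : original(str, lang, attrs);
}

// Stand-ins (assumptions) for the real COBOL highlighter and the original highlight function.
const fakeCobol = (code: string): string => `<span class="cobol">${code}</span>`;
const fakeOriginal: Highlight = (str, lang) => `<code class="${lang}">${str}</code>`;

const highlight = withCobolHighlight(fakeOriginal, fakeCobol);
console.log(highlight("STOP RUN.", "COBOL", "")); // <span class="cobol">STOP RUN.</span>
console.log(highlight("SELECT 1", "sql", ""));    // <code class="sql">SELECT 1</code>
```

In the extension, the same helper could presumably be wired as md.options.highlight = withCobolHighlight(md.options.highlight || noOriginalHighlight, ...), with the second argument calling hljs.highlight for the 'cobol' language.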

package.json:

  ...
  "contributes": {
    "languages": [
      {
        "id": "cobol-markdown-injection"
      }
    ],
    "grammars": [
      {
        "language": "cobol-markdown-injection",
        "scopeName": "markdown.cobol.codeblock",
        "path": "./syntaxes/cobol-markdown-injection.json",
        "injectTo": [
          "text.html.markdown"
        ],
        "embeddedLanguages": {
          "meta.embedded.block.cobol": "cobol"
        }
      }
    ],
    "markdown.markdownItPlugins": true
  },
  ...
  "devDependencies": {
    "@types/markdown-it": "^14.1.1",
    "highlight.js": "^11.10.0",
    ...
  },
  "dependencies": {
    "highlightjs-enterprisecobol": "^1.0.5"
  }

The rendering of the syntax coloring is not identical to the editor's, and there are some errors (Program-Id). I think we should create a more sophisticated highlight.js COBOL plugin which uses the options of the ZOE textmate grammar. But in the meantime “it (almost) does the job”!

Try it! markdown-cobol-0.0.2.vsix.zip