[FEATURE] Add Configurable Language Syntax

mvextensions / mvbasic

MultiValue Basic extension for Visual Studio Code

MIT License


[FEATURE] Add Configurable Language Syntax #14

Open itsxallwater opened 5 years ago

itsxallwater commented 5 years ago

Is your feature request related to a problem? Please describe.

Currently, server\src\server.ts hard-codes the language syntax file to use with this line:

filePath = filePath+"/../Syntaxes/MvLanguage.json";

With the requests for platform specific language syntax, it would be efficient to tie the selected syntax to the user's/workspace's configured platform.

Additionally, the package.json file informs on the grammar files to use for things like syntax highlighting, using the file client\Syntaxes\MVON.tmLanguage.json. The appropriate tmLanguage.json file would need to also be selected in conjunction with the MvLanguage.json file.

Describe the solution you'd like

An acceptable pull request for this issue should abstract the selection of a language syntax to the platform specified within the extension's configuration settings and package.json. Bonus points for also allowing a user specified syntax as a dynamic overload, though currently that is not required.
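A minimal sketch of the lookup this request describes, assuming the configured platform arrives as a plain string. Only MvLanguage.json is named in this request; jBASELanguage.json comes up later in the thread, and QMLanguage.json, the `LANGUAGE_FILES` table, and `resolveLanguageFile` are hypothetical names used for illustration.

```typescript
// Hypothetical table from configured platform to language definition file.
// QMLanguage.json is a placeholder name, not a file confirmed by this issue.
const LANGUAGE_FILES = new Map<string, string>([
    ["MVON", "MvLanguage.json"],
    ["jBASE", "jBASELanguage.json"],
    ["OpenQM", "QMLanguage.json"],
]);

// Fall back to the file that server.ts currently hard-codes.
const DEFAULT_LANGUAGE_FILE = "MvLanguage.json";

// Resolve the definition file for a platform. A non-empty userOverride wins,
// which covers the optional "user specified syntax" case.
function resolveLanguageFile(
    syntaxDir: string,
    languageType: string | undefined,
    userOverride?: string
): string {
    if (userOverride !== undefined && userOverride.trim() !== "") {
        return userOverride;
    }
    const mapped = languageType !== undefined ? LANGUAGE_FILES.get(languageType) : undefined;
    return `${syntaxDir}/${mapped ?? DEFAULT_LANGUAGE_FILE}`;
}
```

An unknown or missing platform falls back to the current default, so existing installs would keep their behavior.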

itsxallwater commented 4 years ago

Branch 14-add-configurable-language-syntax has a working version of this that will load the language definition file depending on the language type selected in the Extension's configuration. Pictured below is the setting, with some messages visible in the output channel for the MV Basic Server.

Additional picture to show some sample jBASE code with the jBASE definition loaded up:

TODOs:

Dynamically load syntax highlighting
Dynamically load snippets

If you'd like to run with this version you can clone the code, target branch 14-add-configurable-language-syntax, press F5 to build and debug and go to town!

CC: @dthiot, @GrantHart, @ianmcgowan, @andrewcole50, @CharlesBarouch, @JeffTeter

itsxallwater commented 4 years ago

If you're keen to check out these updates in advance, I've got a copy of the installer posted to https://zumasyscorp-my.sharepoint.com/:u:/g/personal/mikew_zumasys_com/EVSwBfkqFetKnRkGBNfFXXMBvVL_ITUsM7wYKeqHrBAArw?e=itOorL

itsxallwater commented 4 years ago

Aggregating some notes here for the next chunk of work to sort out the dynamic syntax highlighting.

https://github.com/Microsoft/vscode/issues/68647
https://github.com/microsoft/vscode/issues/53885
https://stackoverflow.com/questions/56107128/dynamic-syntax-highlighting-in-vscode
https://github.com/Microsoft/vscode/issues/585
https://github.com/microsoft/vscode/issues/77133

itsxallwater commented 4 years ago

This is partially addressed with the release of version 2.0.4 and the addition of the Language Type setting as referenced in the docs here: https://github.com/mvextensions/mvbasic/blob/1590634f06c9ce853a8660ea66fd3f2de22a4fe9/doc/Extension%20Guide.md#L274

Keeping the issue open as work is still necessary to complete dynamic syntax highlighting.

andrewcole50 commented 3 years ago

FWIW, the more I look into the language configurations the more it seems like this project is "hacking" the TextMate schema in order to obtain multiple languages in one. It seems to me like it may almost be a better practice to have the bare bones MVON and structure development as this extension while forking for other flavors (i.e. jBASE-BASIC, d3-BASIC, UniData-BASIC, etc.) with the forks only overwriting the Syntaxes directory.

A good example of this is the snippets functionality. I cannot find in my research (and I mean...TextMate docs are horrible) how to configure different snippets based on settings. Now, this could be bc you should only need one snippets.json file for the language, or it's out there and I can't find it. I've poked around a few times since I made them a couple of years ago and always come up empty on how to implement them in accordance with this project's goals.

kpowick commented 3 years ago

it seems like this project is "hacking" the TextMate schema in order to obtain multiple languages in one.

I would say this is true. Language plugins for VS Code are to support a single language, not multiple versions or different languages. I don't know if we'll ever have the option with the single mvExtensions plugin to indicate which flavour of MV one is working on and have the linter, highlighter, and snippets work for only that selected language.

A good example of this is the snippets functionality. I can not find in my research (and I mean...TextMate docs are horrible) how to configure different snippets based on settings.

I just came across this proof-of-concept that may contain some useful ideas for snippets.

https://github.com/vscode-plugins/multilanguagehelloworld

Anyway, I think I agree with the idea of a MV "core" or framework from which all mvExtension flavours are built. -- mvExD3, mvExJB, mvExQM, mvExUn, etc. Unless, of course, there really is a way for a single extension to support multiple language syntaxes, snippets, etc.

andrewcole50 commented 3 years ago

I just came across this proof-of-concept that may contain some useful ideas for snippets.

https://github.com/vscode-plugins/multilanguagehelloworld

So while promising, it looks like it's fully language dependent according to the docs. https://code.visualstudio.com/api/references/contribution-points#contributes.snippets

BUT, that leads me to another option which I found from the above link. It looks like this extension can define multiple languages per the documentation here: https://code.visualstudio.com/docs/languages/identifiers

Anyway, I think I agree with the idea of a MV "core" or framework from which all mvExtension flavours are built. -- mvExD3, mvExJB, mvExQM, mvExUn, etc. Unless, of course, there really is a way for a single extension to support multiple language syntaxes, snippets, etc.

With the above comments in mind, maybe rather than a setting indicating which flavor of MV you are using, the extension could just define multiple different languages? I believe we're up to about six currently (JB, QM, d3, UV, UD, MVON). This only seems to affect things contained in the Syntaxes directory and not the linter itself, but there may be a way to either define a different linter based on language or, from the linter itself, determine which language is being used and implement different methods based on that.

itsxallwater commented 3 years ago

Quick preface here that, while I've had this assigned to me, I haven't had a chance to do anything further with it yet so if one of you feels like you have a path to getting this accomplished, I'm happy to not get in the way.

If I were going to sit down and start working on it, though, I planned to pick things up with:

@itsxallwater I found a workaround using setDecorations method, although it requires much more work. I need to "paint" keywords of a language that is created in a runtime. Also, grammar changes should repaint a document, here is the example how it works.

Here are some implementation details if you find this helpful.

Originally posted by @danixeee in https://github.com/microsoft/vscode/issues/68647#issuecomment-561411907

We aren't quite creating the language at runtime, but we are in theory deciding on the language then, so it seemed like a possible solution.

Since then, the VS Code Extension API has been updated to support Semantic Highlighting. Relevant quote:

Semantic tokenization allows language servers to provide additional token information based on the language server's knowledge on how to resolve symbols in the context of a project. Themes can opt-in to use semantic tokens to improve and refine the syntax highlighting from grammars. The editor applies the highlighting from semantic tokens on top of the highlighting from grammars.

In this case, we're not quite a theme, though we could act like one. Our base grammar definition could be distilled to the subset of each flavor's grammar that is universal across the flavors, and semantic tokenization could be used to inject the flavor-specific nuances.
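One way to picture that distillation, as a sketch only: treat each flavor's grammar as a flat keyword list, keep the intersection as the base, and leave the remainder per flavor for semantic tokenization. `splitGrammars` and `GrammarSplit` are hypothetical names, and real .tmLanguage.json files are nested patterns rather than flat lists, so this shows the set logic and not a parser.

```typescript
interface GrammarSplit {
    base: string[];                // keywords every flavor shares
    extras: Map<string, string[]>; // flavor name -> keywords only that flavor needs on top of the base
}

// Split per-flavor keyword lists into a universal base and flavor-specific extras.
function splitGrammars(flavors: Map<string, string[]>): GrammarSplit {
    const sets = Array.from(flavors.values()).map(list => new Set(list));
    const [firstSet] = sets;
    const candidates = firstSet ? Array.from(firstSet) : [];

    // A keyword belongs in the base only if every flavor defines it.
    const base = candidates.filter(keyword => sets.every(set => set.has(keyword)));
    const baseSet = new Set(base);

    // Whatever is left over for a flavor would be supplied by semantic tokens.
    const extras = new Map<string, string[]>();
    flavors.forEach((list, flavor) => {
        extras.set(flavor, Array.from(new Set(list)).filter(keyword => !baseSet.has(keyword)));
    });
    return { base, extras };
}
```

For instance, if jBASE listed FOR, NEXT, and MAKETIMESTAMP while D3 listed only FOR and NEXT, the base would hold FOR and NEXT and jBASE's extras would hold MAKETIMESTAMP.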

None of this settles the issue for Snippets, but honestly we may need to give them another look anyway. As it stands we have a lot of overlap between the language definitions and the snippet definitions, and while each provides different value, it may confuse the IntelliSense module a bit. Here's an example of FOR:

[Image: snippets]

kpowick commented 3 years ago

DOTADIW - "Do One Thing and Do It Well", comes to mind. And maybe the KISS principle too.

My concern is that you'll end up with a monolithic extension that has a myriad of corner cases to code around that make understanding and participation in development more difficult than necessary. It could also make bug-hunting troublesome.

If each mvBASIC language extension focused on a particular language "flavour", one could code against it knowing that their changes could not potentially introduce bugs into another mv language module.

Yes, there would likely be duplicated code, but some of that could be alleviated with a base/core module from which each language project is derived.

The end user would also be able to define their settings that tell MSVC which language (mvExtension XX) to use based on the OS folder(s) in use for editing. Most of us probably do this now because mvBASIC programs follow no prescribed naming standard, and are unlikely to do so in the future.

Maybe supporting multiple languages in a single extension is easier today. I also researched a while back and found no satisfactory approach, let alone meaningful examples. Everything seemed to be a bit of a hack with the disclaimer, "it should be possible".

itsxallwater commented 3 years ago

I'm open to it. Note in https://github.com/mvextensions/mvbasic/issues/120 that we're talking about the same concept of splitting the extension up to simplify, but in that issue focusing more on the various connectors available (MV Gateway, AccuTerm and now Linkar).

TL;DR yes, continuing to separate considerations here into a series of extensions with a shared base may be do-able and acceptable. Especially if we take care to get auto-builds and publishing setup :)

andrewcole50 commented 3 years ago

So I did some digging and I think we can actually accomplish this without splitting up the languages. I believe we're just not utilizing our server TypeScript to its fullest extent. Our current setup has the server checking the language setting and pulling a language definition JSON file (jBASELanguage.json, etc.) to then load into IntelliSense. Well, that same method should be able to insert auto completions as well (a la snippets). If it works the way I'm understanding, then we just need to define our language JSON files in more detail and load them into IntelliSense slightly differently in TS.
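A rough sketch of that idea, assuming each language definition entry can carry an optional snippet body. The `LanguageEntry` shape and `toCompletionItems` are hypothetical, not the extension's actual definition format; the numeric constants are the values the Language Server Protocol assigns to `CompletionItemKind` and `InsertTextFormat`, copied locally so the sketch has no dependencies.

```typescript
// Values defined by the Language Server Protocol, duplicated here on purpose.
const LSP_KIND = { Keyword: 14, Snippet: 15 } as const;
const LSP_INSERT_FORMAT = { PlainText: 1, Snippet: 2 } as const;

// Hypothetical shape of one entry in a language definition file.
interface LanguageEntry {
    key: string;      // keyword or function name, e.g. "FOR"
    detail: string;   // short description shown in IntelliSense
    snippet?: string; // optional body using snippet syntax such as ${1:I}
}

// Plain-object stand-in for the fields of an LSP CompletionItem used here.
interface CompletionItemLike {
    label: string;
    kind: number;
    detail: string;
    insertText: string;
    insertTextFormat: number;
}

// Turn definition entries into completion items. Entries with a snippet body
// are sent in snippet format so tab stops and placeholders expand in the editor.
function toCompletionItems(entries: LanguageEntry[]): CompletionItemLike[] {
    return entries.map(entry => {
        const hasSnippet = entry.snippet !== undefined;
        return {
            label: entry.key,
            kind: hasSnippet ? LSP_KIND.Snippet : LSP_KIND.Keyword,
            detail: entry.detail,
            insertText: entry.snippet ?? entry.key,
            insertTextFormat: hasSnippet ? LSP_INSERT_FORMAT.Snippet : LSP_INSERT_FORMAT.PlainText,
        };
    });
}
```

With something like this, one definition file per flavor could drive both keyword suggestions and snippet-style completions, which is the consolidation described above.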

I'll try to work on a proof of concept this week.

andrewcole50 commented 3 years ago

The latest PR on here #148 needs testing. In theory that should work but I can't seem to get my builds to work. Not just for this PR but when I build the extension it never seems to pick up changes made in server.ts. According to the documentation, the code should work unless I goofed somewhere. The PR also accomplishes a structured naming schema for language definitions. Included is a json file of orphaned snippets from my script to convert them into language files. If the typescript code works I can go through and clean those all up.

itsxallwater commented 3 years ago

I've seen the process of building and running from code behave oddly when the extension is still installed via the store. Perhaps uninstall that before building and running? In particular, I got tripped up one time when it kept mounting the installed extension's code to the debugger rather than the source code I was modifying and compiling. It presented the same way--was like my changes were nowhere to be found, until I realized what was going on.

andrewcole50 commented 3 years ago

@itsxallwater pretty sure that's not it. I've done it in both my Windows 10 and Ubuntu VMs with no luck while my host VSCode was shutdown. I'm guessing it's something with node that I don't have right shrug

itsxallwater commented 3 years ago

What command are you using to build and run?

andrewcole50 commented 3 years ago

@itsxallwater

With a fresh clone, run "npm install" with no errors. Then try to run either "Launch Client" followed by "Attach to Server" or the prebuilt package to run both. At this point I get an error that looks like:

Activating extension 'mvextensions.mvbasic' failed: Cannot find module 'c:\src\mvbasic\client\out\extension'
Require stack:
- c:\Users\andrewcole\AppData\Local\Programs\Microsoft VS Code\resources\app\out\vs\loader.js
- c:\Users\andrewcole\AppData\Local\Programs\Microsoft VS Code\resources\app\out\bootstrap-amd.js
- c:\Users\andrewcole\AppData\Local\Programs\Microsoft VS Code\resources\app\out\bootstrap-fork.js.

So if I try a compile to create that out dir I get:

PS C:\src\mvbasic> npm run compile
Debugger attached.

> mvbasic@2.1.2 compile C:\src\mvbasic
> tsc -b

Debugger attached.
client/node_modules/sync-request/lib/FormData.d.ts:4:21 - error TS2304: Cannot find name 'Blob'.

4     value: string | Blob | Buffer;
                      ~~~~

client/node_modules/sync-request/lib/FormData.d.ts:9:41 - error TS2304: Cannot find name 'Blob'.

9     append(key: string, value: string | Blob | Buffer, fileName?: string): void;
                                          ~~~~

Found 2 errors.

Waiting for the debugger to disconnect...
npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! mvbasic@2.1.2 compile: `tsc -b`
npm ERR! Exit status 1
npm ERR!
npm ERR! Failed at the mvbasic@2.1.2 compile script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

npm ERR! A complete log of this run can be found in:
npm ERR!     C:\Users\andrewcole\AppData\Roaming\npm-cache\_logs\2021-03-09T18_21_53_349Z-debug.log
Waiting for the debugger to disconnect...
PS C:\src\mvbasic>

Eventually in my testing I ran prepublish and webpack and after one of them VSCode shows that I'm connected to the server but it sure doesn't seem like the server.ts is running/responding (No linting, snippets, keyword suggestions, etc).

Disclaimer: I know almost nothing about node. I know javascript and by extension I can figure out typescript but node is something I've never delved into.

kpowick commented 3 years ago

I work on a Mac, but have experienced the same problem where your changes don't seem to be picked up between debug sessions. One thing I do that seems to help is to delete the contents of ../server/out and ../client/out before launching a debug session. This seems to ensure a rebuild is completed, and it's easy to verify.

andrewcole50 commented 3 years ago

I work on a Mac, but have experienced the same problem where your changes don't seem to be picked up between debug sessions. One thing I do that seems to help is to delete the contents of ../server/out and ../client/out before launching a debug session. This seems to ensure a rebuild is completed, and it's easy to verify.

If I try on my Mac I don't even get that far haha. With the firewall off I get this (both before and after clearing out the out dirs):

Error processing attach: Error: Could not connect to debug target at http://localhost:6009: Promise was canceled
at e (/Applications/Visual Studio Code.app/Contents/Resources/app/extensions/ms-vscode.js-debug/src/extension.js:1:110481)
at async t (/Applications/Visual Studio Code.app/Contents/Resources/app/extensions/ms-vscode.js-debug/src/extension.js:59:49891)
at async P.launch (/Applications/Visual Studio Code.app/Contents/Resources/app/extensions/ms-vscode.js-debug/src/extension.js:1:131583)
at async t.Binder.captureLaunch (/Applications/Visual Studio Code.app/Contents/Resources/app/extensions/ms-vscode.js-debug/src/extension.js:59:126573)
at async t.Binder._launch (/Applications/Visual Studio Code.app/Contents/Resources/app/extensions/ms-vscode.js-debug/src/extension.js:59:126124)
at async Promise.all (index 5)
at async t.Binder._boot (/Applications/Visual Studio Code.app/Contents/Resources/app/extensions/ms-vscode.js-debug/src/extension.js:59:125384)
at async t.default._onMessage (/Applications/Visual Studio Code.app/Contents/Resources/app/extensions/ms-vscode.js-debug/src/extension.js:1:83964)

I normally try not to use my Mac for playing with this extension bc I rely on VS Code and the "stock" extension for my normal work in jBASE. But I figured I'd try it on my Mac to see.

kpowick commented 3 years ago

I don't really follow that output or the reasons for it. Working on the MV extensions project has just "worked" for me. I've also never had to play around with my firewall settings.

Did you initially set up the project according to the steps in the Development Intro guide? https://github.com/mvextensions/mvbasic/blob/main/doc/DeveloperIntro.md

I may just try a new install on a Windows VM for testing and see if anything weird happens.

itsxallwater commented 3 years ago

Relevant bit from the thread in the PR: https://github.com/mvextensions/mvbasic/pull/148#issuecomment-796433410

TL;DR I think that branch was based off main not develop and it was missing a key client/tsconfig.json change to resolve that compilation issue about Blobs with one of our dependencies. Once I patched that in I was able to run @andrewcole50's changes with great success.

I think that leaves dynamic highlighting as the lone to do here.

itsxallwater commented 3 years ago

So I have a POC of dynamic highlighting done via Semantic Highlighting. It's a little bit of a bummer because the implementation has to be inside of the language extension and not the language server. As a result, it also means we'll need to tell people to flip this configuration setting to enable it:

  "editor.semanticTokenColorCustomizations": {
    "enabled": true
  }

Check it out. MAKETIMESTAMP is a jBASE BASIC function that's pretty unique, and in my POC I'm flagging that for semantic highlighting specifically as a function. If you peek the Microsoft guide we basically have all of the token categories available to us here so the world is our oyster.

[Image: issue-14]

The big bummer, as I see it, is that this isn't offloaded to the language server to deal with. That would be convenient because we're already parsing the document there. Semantic Highlighting being done extension side means that our final implementation is going to require that we also parse the document on that side, albeit only to flag specific tokens that will require special highlighting.

I'm not going to push this code quite yet unless someone wants to run with things, but the gist of it (again, in extension.ts) is:

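import * as vscode from 'vscode';

// Lookup tables from a legend entry to its index, filled in while the legend is built.
const tokenTypes = new Map<string, number>();
const tokenModifiers = new Map<string, number>();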

const legend = (function () {
    const tokenTypesLegend = [
        'function'
    ];
    tokenTypesLegend.forEach((tokenType, index) => tokenTypes.set(tokenType, index));

    const tokenModifiersLegend = [
        'MAKETIMESTAMP'
    ];
    tokenModifiersLegend.forEach((tokenModifier, index) => tokenModifiers.set(tokenModifier, index));

    return new vscode.SemanticTokensLegend(tokenTypesLegend, tokenModifiersLegend);
})();

...

interface IParsedToken {
    line: number;
    startCharacter: number;
    length: number;
    tokenType: string;
    tokenModifiers: string[];
}

class DocumentSemanticTokensProvider implements vscode.DocumentSemanticTokensProvider {
    async provideDocumentSemanticTokens(document: vscode.TextDocument): Promise<vscode.SemanticTokens> {
        const allTokens = this._parseText(document.getText());
        const builder = new vscode.SemanticTokensBuilder();
        allTokens.forEach((token) => {
            builder.push(token.line, token.startCharacter, token.length, this._encodeTokenType(token.tokenType), this._encodeTokenModifiers(token.tokenModifiers));
        });
        return builder.build();
    }

    private _encodeTokenType(tokenType: string): number {
        if (tokenTypes.has(tokenType)) {
            return tokenTypes.get(tokenType)!;
        } else if (tokenType === 'notInLegend') {
            return tokenTypes.size + 2;
        }
        return 0;
    }

    private _encodeTokenModifiers(strTokenModifiers: string[]): number {
        let result = 0;
        for (let i = 0; i < strTokenModifiers.length; i++) {
            const tokenModifier = strTokenModifiers[i];
            if (tokenModifiers.has(tokenModifier)) {
                result = result | (1 << tokenModifiers.get(tokenModifier)!);
            } else if (tokenModifier === 'notInLegend') {
                result = result | (1 << tokenModifiers.size + 2);
            }
        }
        return result;
    }

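    // Scan each line for whitespace-delimited tokens and keep the ones flagged as functions.
    private _parseText(text: string): IParsedToken[] {
        const r: IParsedToken[] = [];
        const lines = text.split(/\r\n|\r|\n/);
        for (let i = 0; i < lines.length; i++) {
            const line = lines[i];
            // Match with a regex so each token carries its own offset; indexOf
            // would always report the first occurrence of a repeated token.
            const tokenPattern = /\S+/g;
            let match: RegExpExecArray | null;
            while ((match = tokenPattern.exec(line)) !== null) {
                const tokenData = this._parseTextToken(match[0]);
                if (tokenData.tokenType === "function") {
                    r.push({
                        line: i,
                        startCharacter: match.index,
                        length: match[0].length,
                        tokenType: tokenData.tokenType,
                        tokenModifiers: tokenData.tokenModifiers
                    });
                }
            }
        }
        return r;
    }

    // Classify a single token; only MAKETIMESTAMP is recognized in this POC.
    private _parseTextToken(text: string): { tokenType: string; tokenModifiers: string[]; } {
        const result: { tokenType: string; tokenModifiers: string[]; } = { tokenType: "", tokenModifiers: [] };
        if (text === "MAKETIMESTAMP") {
            result.tokenType = "function";
            result.tokenModifiers.push(text);
        }
        return result;
    }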
}

As I see it, the to do list to implement this in earnest:

1. Compare all of our *.tmLanguage.json files to create one base mvbasic.tmLanguage.json that only includes keywords, constants, classes, etc. that are shared across all flavors.
2. Take the flavor-specific differentials and build out a second dataset in JSON a la:


[
  {
    "languageType": "jBASE",
    "tokens": [
      {
        "token": "MAKETIMESTAMP",
        "type": "function"
      },
      ...
    ]
  },
  {
    "languageType": "D3",
    "tokens": [
      ...
    ]
  },
  ...
]
3. Modify the code above to use the current `MVBasic.languageType` setting to parse the document for tokens that are listed for that particular language type in the JSON file from above, binding the designated types accordingly.

I don't think the code will be terrible here. I used Microsoft's example from https://github.com/microsoft/vscode-extension-samples/tree/main/semantic-tokens-sample to whack this together. I think the bulk of the work is going to be on consolidating down to a new baseline `.tmLanguage.json` file and creating this new JSON file with the language/flavor specific stuff.
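Step 3 could look roughly like the sketch below, assuming the JSON dataset is an array of objects with `languageType` and `tokens` as in the example above. `findFlavorTokens` and the interface names are hypothetical, and the identifier regex is a guess at what counts as a word in MV BASIC. Comments and string literals are not skipped here, which a real implementation would need to handle.

```typescript
interface FlavorToken { token: string; type: string; }
interface FlavorTokenSet { languageType: string; tokens: FlavorToken[]; }
interface FoundToken { line: number; startCharacter: number; length: number; tokenType: string; }

// Find every token in the document that the dataset lists for the active
// language type, recording where it sits and which type to bind to it.
function findFlavorTokens(text: string, languageType: string, dataset: FlavorTokenSet[]): FoundToken[] {
    // Collect token -> type for the selected flavor only.
    const types = new Map<string, string>();
    for (const set of dataset) {
        if (set.languageType === languageType) {
            set.tokens.forEach(entry => types.set(entry.token, entry.type));
        }
    }

    const found: FoundToken[] = [];
    text.split(/\r\n|\r|\n/).forEach((line, lineNumber) => {
        // Assumed identifier shape: letters, digits, dots, underscores, dollar signs.
        const word = /[A-Za-z_$][A-Za-z0-9_.$]*/g;
        let match: RegExpExecArray | null;
        while ((match = word.exec(line)) !== null) {
            const tokenType = types.get(match[0]);
            if (tokenType !== undefined) {
                found.push({ line: lineNumber, startCharacter: match.index, length: match[0].length, tokenType });
            }
        }
    });
    return found;
}
```

Each result carries the line, start character, length, and type that `SemanticTokensBuilder.push` expects once the type is encoded against the legend, so it could feed the provider shown earlier in place of `_parseText`.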

andrewcole50 commented 3 years ago

Re #148

Wanted to just roll the convo into this since it's relevant to the bigger picture. I closed the other PRs since all of the previous changes are rolled into #148. I see that the snippets worked partially. I wasn't sure if the $vars or the \n and \t would work as it did in a vanilla snippets file. I'm going to try and fix my compile issue and see if I can fix that.

But knowing that the snippets can be inserted via language files I can start adding in the orphaned snippets. I mostly left them out bc they require me manually reviewing and adding them. I don't have a problem doing that, but I wanted to make sure the concept worked before putting the time in.

Re @itsxallwater's latest post

The semantic highlighting (which seems weird to me bc those same "classes" are available in TextMate but VSC treats them different? But that's neither here nor there) brings up an interesting thought. For semantic highlighting you need JSON language definitions. I would propose those get rolled into the coming LANG_{flavor}.json files OR as you show above, roll all of those into one big definitions file a la:

{
    "jBASE": {
        "Type": "jBASE PickBASIC",
        "Keywords": [ snippets array ],
        "tokens": [ tokens array ]
    }, 
    "OpenQM": { ...},
    etc..
}

Lastly, the idea of keeping only overlapping keywords, functions, etc. brings me to a related question. The definitions in *.tmLanguage.json probably need to be more thought out as far as token classes. I see a slight mixture of keywords and functions in some "keyword.control". In your above example...MAKETIMESTAMP should already be highlighted according to what I'm looking at. I think in one of my closed PRs I mentioned that the tmLanguage files define mv functions and statements as "support.class", whereas I think they should be "support.function" for proper highlighting. I know the theme I use (Darcula) does not highlight "support.class", and when I change that I get proper highlighting.

itsxallwater commented 3 years ago

Couple thoughts:

I think the preferred path is via the TextMate language definition file but this Semantic Highlighting stuff available to the extension side allows users (or themes!) to customize/overload. That's helpful for our purposes because there's no way to dynamically re-bind the .tmLanguage.json file association right now since it is specified in the package.json and at that point, we don't know what languageType the user wants. I love the idea of trying to bring this info into the LANG_{flavor}.json files FWIW.
I agree with you that the .tmLanguage.json side needs some clean-up related to the correct keyword/token -> class/type associations. In my example above, MAKETIMESTAMP would only have been highlighted if I had swapped out the mvon.tmLanguage.json file for the jBASE.tmLanguage.json file manually on my system. That's essentially the problem we'll be solving here, that there's no programmatic way to flip between those .tmLanguage.json files at runtime when the user changes their languageType setting. Using this Semantic Highlighting feature we can define one .tmLanguage.json file to cover the universal bits and then inject the flavor-specific additions with Semantic Highlighting.

One other consideration here--it's not lost on me this feature won't come into play for a lot of users. For people working at a company running flavor X, they are not really liable to ever change their languageType unless their company considers converting to another flavor. That user group isn't really our target audience for this change; rather, the people who will appreciate that are the consultants, vendors, etc. that over the course of a single day may touch several different flavors. For that group it becomes really important that the highlighting moves with the setting change without needing to manipulate the extension files behind the scenes.