jupyrdf / graph-lsp

Jupyter-lsp extensions for graph langugages.
BSD 3-Clause "New" or "Revised" License
3 stars 1 forks source link

Fragment URIs trigger comments in syntax highlighting #8

Open zwelz3 opened 2 years ago

zwelz3 commented 2 years ago

When using fragmens in URIs, the lsp does not identify this and treats it as a comment. If you can point me to where this happens in the code, I can probably add a naive regex to prevent this from occuring.

Example: image

dfreeman06 commented 2 years ago

We are using the stardog language servers which in uses the TurtleParser from https://github.com/stardog-union/millan/tree/master/src/turtle. My guess for the regex change would be somewhere in the tokens.ts.

nrbgt commented 2 years ago

the first line almost certainly isn't part of the upstream grammar, and might need further refining... though i don't recall where it belongs: the interface between the two languages is part is handled by jupyterlab-lsp, so might actually be coming from python or treated as a bash fragment

nrbgt commented 2 years ago

it's possible to provide a replacer function, but it's semi-difficult to anticipate what it should be for the general case.

it almost seems like making the whole thing a comment would be the most consistent.

zwelz3 commented 2 years ago

it almost seems like making the whole thing a comment would be the most consistent.

It makes all content after the fragment a comment. Would you consider this the best behavior (based on your quoted comment)?

dfreeman06 commented 2 years ago

Thanks @nrbgt! I did not read @zwelz3's original issue closely enough.

nrbgt commented 2 years ago

consider this the best behavior

with syntax like magics that don't belong to any grammer, other than what your kernel happens to understand at that moment, it's hard to quantify what is "best," even if what we have is "wrong." So maybe "least worst behavior" is what we're looking for.

all content after the fragment

as in the content of the cell as well? that would be strictly wrong.

a comment.

this was an example: point is it should probably just all be one thing, if neither the python nor turtle grammars can parse it.

The heavy bat would be to create a new sub-mode or multiplexing mode.

An interesting upstream change would be to add more definition of mode details (e.g. a set of codemirror tokens for some of the capture groups) for an extractor... but that's actually pretty challenging for the general case.

zwelz3 commented 2 years ago

Ahh ok. The specifics of syntax magics are unknown to me. It only makes the remainder of the line a comment. I think we can close this unless there is a decent consensus that the change "belongs" in graph-lsp.

nrbgt commented 2 years ago

i mean, i'll take a look... but this is a general class of problem

nrbgt commented 2 years ago

So by defining some new mimetypes, modes, etc. it's relatively easy to make the whole line one token type:

Screenshot from 2022-04-06 12-41-26

Making it good is another story... especially as we don't actually ship a real magic.

The code for the above.

export async function installTurtle(_CodeMirror: any) {
  await Mode.ensure('turtle');

  _CodeMirror.defineMode(
    'iturtle',
    (config: CodeMirror.EditorConfiguration, modeOptions?: any) => {
      const mode = _CodeMirror.getMode(config, { ...modeOptions, name: 'turtle' });
      const token = (stream: any, state: any) => {
        if (stream.sol() && stream.peek() === '%') {
          stream.match(/^%%(ttl|turtle)/);
          stream.skipToEnd();
          return 'header';
        }
        return mode.token(stream, state);
      }
      return { ...mode, token };
    },
    'turtle'
  );

  _CodeMirror.defineMIME('text/x-iturtle', 'iturtle');
  _CodeMirror.modeInfo.push({
    ext: ['ittl'],
    mime: 'text/x-iturtle',
    mode: 'iturtle',
    name: 'iturtle',
  });
}
nrbgt commented 2 years ago

Here's a WIP branch off #7: https://github.com/jupyrdf/graph-lsp/compare/master...nrbgt:fancy-magic-highlighting?expand=1