tree-sitter-grammars / tree-sitter-markdown

Markdown grammar for tree-sitter
MIT License

Text in square brackets conceals like a link #56

Closed storm closed 1 year ago

storm commented 2 years ago

Text contained in square brackets is rendered as a link, despite the absence of a link destination in parentheses.

Code example

This [link] renders like a link when it should not.

Expected behavior: When no link destination is provided, square brackets render as text.

Actual behavior: When no link destination is provided, square brackets render as a link.

MDeiml commented 2 years ago

This is unfortunately not possible with tree-sitter, as it would require scanning the entire document for link reference definitions for every link. The block structure (including link reference definitions) is in fact scanned before any links are parsed, for technical reasons, but for other technical reasons that information cannot be reused later. I can get into details if you want.

Other solutions would be:

  1. Request a feature in tree-sitter that would change that. In particular, this would require passing parameters to injected languages.
  2. Escape the first bracket, e.g. \[link]
  3. Write a custom predicate to filter out links with no corresponding link reference definition. This is probably the preferred solution, though the output could theoretically be wrong in some edge cases. It would also require writing code for your specific use case, probably an editor plugin or a contribution to nvim-treesitter if you're using Neovim.

So I'd suggest 3, but that means solving this outside this repo. I'd be happy to help though.
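
Outside of any editor, the check behind option 3 could look roughly like the following. This is a minimal Python sketch, not nvim-treesitter code: the names `normalize_label`, `defined_labels` and `is_resolvable_shortcut` are hypothetical, and the regular expression only approximates CommonMark's rules for link reference definitions. It ignores multi-line definitions, titles, escapes inside labels, and definitions inside code blocks, which is exactly where the edge cases mentioned above would come from.

```python
import re

# Approximation of a link reference definition: up to three spaces of
# indentation, "[label]:", then a non-empty destination.
# Simplifying assumption: one definition per line, no escapes inside the label.
DEFINITION_RE = re.compile(r"^ {0,3}\[([^\[\]]+)\]:[ \t]*\S+", re.MULTILINE)


def normalize_label(label: str) -> str:
    """Case-fold and collapse whitespace, since label matching is case-insensitive."""
    return " ".join(label.split()).casefold()


def defined_labels(document: str) -> set[str]:
    """Collect the normalized labels of every definition found in the document."""
    return {normalize_label(m.group(1)) for m in DEFINITION_RE.finditer(document)}


def is_resolvable_shortcut(label: str, document: str) -> bool:
    """Return True if a shortcut link [label] has a matching definition."""
    return normalize_label(label) in defined_labels(document)


with_definition = "This [link] is real.\n\n[link]: http://example.com\n"
without_definition = "This [link] renders like a link when it should not.\n"

print(is_resolvable_shortcut("link", with_definition))     # True
print(is_resolvable_shortcut("LINK", with_definition))     # True
print(is_resolvable_shortcut("link", without_definition))  # False
```

An editor plugin would run a check like this over the whole buffer and drop the link highlight for every shortcut link whose label is not in the set.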

storm commented 2 years ago

Thank you, for myself I think I can work around it. However, wouldn't it be desirable to make this work without any customization as it is not conforming to the CommonMark spec?

MDeiml commented 2 years ago

It would be desirable yes. But it's not really possible with the tree-sitter parsing framework.

Basically a node in the output of the parser can only depend on things that came before it. Think of the following situation

[link]
^ parser is currently here

... Rest of the file

Now there are two things that could be going on here. Either [link] does have a link reference definition somewhere, like

[link]

[link]: http://example.com

in which case it should be rendered as a link. Or it doesn't, like

[link]

Some more text
<end of file>

in which case it should be rendered like the literal text "[link]".

But the parser can't know without reading the rest of the file. It "has to" output a node for it "right now" though. So the conservative guess is that the user either wrote a link reference definition somewhere later in the file or plans to add one later.
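
The dilemma can be made concrete with a small Python sketch. It is only an illustration, not how tree-sitter works internally: the helper `has_definition` is hypothetical and uses a simplified regular expression for link reference definitions. The two documents share the same prefix up to the parser position, yet the correct answer differs, so it cannot be computed from the prefix alone.

```python
import re


def has_definition(label: str, text: str) -> bool:
    """Hypothetical helper: does `text` contain a line like "[label]: destination"?"""
    pattern = r"^ {0,3}\[" + re.escape(label) + r"\]:[ \t]*\S+"
    return re.search(pattern, text, re.MULTILINE | re.IGNORECASE) is not None


prefix = "[link]\n"  # everything the parser has seen so far

doc_a = prefix + "\n[link]: http://example.com\n"  # definition comes later
doc_b = prefix + "\nSome more text\n"              # no definition anywhere

# The prefix is identical in both documents ...
assert doc_a.startswith(prefix) and doc_b.startswith(prefix)

# ... and on its own it contains no definition.
print(has_definition("link", prefix))  # False

# Only the full document tells the two cases apart.
print(has_definition("link", doc_a))   # True
print(has_definition("link", doc_b))   # False
```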

eddiebergman commented 1 year ago

I realize not much can be done, but I just wanted to add that this is a problem when using a Python language server and parsing its markdown output with Neovim, for example when showing definitions in a floating window.

In Python, it's common to have types in a docstring, e.g.

def hello_to(people: list[str]) -> None:
    """
    Parameters
    ----------
    people: list[str]
        People to say hi to
    """
    for person in people:
        print(f"hi {person}")

I use jedi-language-server which returns the following markdown rendering. (I think markdown rendering from lsps is the default?)

```python
def hello_to(people: list[str]) -> None:
```

Parameters

people: list[str]    <- Problematic line
    People to say hi to

Full name: test.hello_to

For Python inline documentation, I would guesstimate that you would almost always rather not have this treated as a link, as you showed.

---

I'm not sure anything can be done to trivially solve this and I was considering implementing your proposed option 3. in some hacky fashion for my own dotfiles.

I could also forward this issue to `jedi-language-server` and see if they would add an option to escape square brackets in documentation when rendering markdown. The author is quite keen on keeping `jedi-language-server` as lean as possible so I don't imagine there being much drive to implement it.
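
Such an option could amount to little more than backslash-escaping brackets before the docstring is emitted as markdown. The following is a minimal Python sketch with a hypothetical function name, `escape_brackets`; it is not part of `jedi-language-server`'s API, and it assumes the text is prose outside of code fences (inside a fenced code block the backslashes would show up literally).

```python
import re

# Match "[" or "]" that is not already preceded by a backslash.
# Simplification: a bracket following an escaped backslash ("\\[") is also skipped.
UNESCAPED_BRACKET_RE = re.compile(r"(?<!\\)([\[\]])")


def escape_brackets(text: str) -> str:
    """Backslash-escape square brackets so markdown does not treat them as links."""
    return UNESCAPED_BRACKET_RE.sub(r"\\\1", text)


print(escape_brackets("people: list[str]"))  # people: list\[str\]
```

Escaping only the opening bracket, as in option 2 above, would be enough to stop the link from being recognized; escaping both is simply more symmetric.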

~~Lastly, if there is some method to disable concealing specifically for links or the `[` `]` chars, I would also be able to create some easy hack to do so when rendering these LSP signature helps. However, I'm really not familiar enough with how nvim and treesitter interact with respect to concealment and how to do this, so any pointers would be appreciated :)~~

Update:
I managed to solve this by doing `:TSEditQuery highlights markdown_inline` and commenting out the block at the bottom called `shortcut_link` which specifies the conceal.

Otherwise, it's beautiful output for markdown files and amazing work!

Best,
Eddie

MDeiml commented 1 year ago

Hey Eddie, thanks for also sharing the solution to your problem for everybody who encounters this in the future.

MDeiml commented 1 year ago

Just wanted to add to this that it's probably (haven't tried it) possible to change the highlighting of injected markdown (like here) by adding

local parser_config = require "nvim-treesitter.parsers".get_parser_configs()
parser_config.markdown_for_python = {
  install_info = {
    url = "https://github.com/MDeiml/tree-sitter-markdown",
    location = "tree-sitter-markdown",
    files = { "src/parser.c", "src/scanner.cc" },
    branch = "split_parser",
  },
}

parser_config.markdown_inline_for_python = {
  install_info = {
    url = "https://github.com/MDeiml/tree-sitter-markdown",
    location = "tree-sitter-markdown-inline",
    files = { "src/parser.c", "src/scanner.cc" },
    branch = "split_parser",
  },
}

and then editing the injections.scm of python and of the "new" markdown_for_python respectively.

Otherwise as I said there is no way to change the parser to only detect shortcut links that also have a link definition, so I'm going to close this for now.