dandavison / delta

A syntax-highlighting pager for git, diff, grep, and blame output
https://dandavison.github.io/delta/
MIT License
21.82k stars 364 forks source link

🐛 `is out of bounds of` due to newline characters in ripgrep output #1487

Open tbm opened 1 year ago

tbm commented 1 year ago

I am trying ripgrep in combination with delta based on your recent post in the ripgrep thread.

I get this:

thread 'main' panicked at 'byte index 63 is out of bounds of `https://en.wikipedia.org/w/api.php?action=help&modules=query

`', src/handlers/grep.rs:358:38
stack backtrace:

Reproduce:

cat x | delta

x

{"type":"match","data":{"path":{"text":"TODO"},"lines":{"text":"https://en.wikipedia.org/w/api.php?action=help&modules=query\n\n\n"},"line_number":4,"absolute_offset":148,"submatches":[{"match":{"text":"\n\n\n"},"start":60,"end":63}]}}

If I encode the & in the URL, it works.

I'm not sure if the bug is in delta or grip.

This is with delta 0.16.5.

dandavison commented 1 year ago

Thanks @tbm! This must be a delta bug. Your reproduction example is very helpful.

ippsav commented 1 year ago

I've been tinkering with this issue for a couple of minutes, i ll be making a pr for this soon to handle this issue

ippsav commented 1 year ago

@dandavison pr ready for review ! #1496

dandavison commented 1 year ago

@tbm @ippsav what's an example of making rg product output with multiple trailing newlines like this:

{"text":"https://en.wikipedia.org/w/api.php?action=help&modules=query\n\n\n"}

Does it require using the --multiline flag (-U)?

For example with input file

1

2

I can reproduce this:

rg --multiline --json '1\n\n' /tmp/in

and that crashes delta.

@ippsav is it usage of --multiline that your fix is addressing? I'm really sorry I've been so slow here; I didn't find this issue easy to think about and was low on time.

tbm commented 1 year ago

Does it require using the --multiline flag (-U)?

Sorry, I forgot to mention it. Yes, it requires -U.

Example:

65569:tbm@jirafa: ~/tmp/wiktionary] rg -U "\n\n"  xt
hread 'main' panicked at 'byte index 47 is out of bounds of `https://dumps.wikimedia.org/backup-index.html
`', src/handlers/grep.rs:358:38
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
x 
zsh: exit 101
65570:tbm@jirafa: ~/tmp/wiktionary] cat x

https://dumps.wikimedia.org/backup-index.html
dandavison commented 1 year ago

Thanks!

If I encode the & in the URL, it works.

Hm, I'm finding that your example crashes delta even with the two ? deleted; is it possible that when you experimented with encoding the & you also changed something about the presence of (escaped) newline characters in the rg output?

tbm commented 1 year ago

I think you're right. See the latest example I just posted. There the file contains a URL but not one that needs to be escaped.

So I think the title is wrong. It's related to newlines.

dandavison commented 1 year ago

No worries! That didn't lead to my confusion -- I'd already learned from @ippsav that it was \n rather than the ? but I was still finding the bug confusing to understand!

dandavison commented 1 year ago

For my future reference: a simple reproduction of the bug is

echo bar | rg -U --json 'bar\n'
{
  "type": "match",
  "data": {
    "path": {
      "text": "<stdin>"
    },
    "lines": {
      "text": "bar\n"
    },
    "line_number": 1,
    "absolute_offset": 0,
    "submatches": [
      {
        "match": {
          "text": "bar\n"
        },
        "start": 0,
        "end": 4
      }
    ]
  }
}
image

More complex examples arguably aren't highlighted correctly by delta, e.g.

echo 'bar\nbar\n' | rg -U --json 'bar\n'
{
  "type": "match",
  "data": {
    "path": {
      "text": "<stdin>"
    },
    "lines": {
      "text": "bar\nbar\n"
    },
    "line_number": 1,
    "absolute_offset": 0,
    "submatches": [
      {
        "match": {
          "text": "bar\n"
        },
        "start": 0,
        "end": 4
      },
      {
        "match": {
          "text": "bar\n"
        },
        "start": 4,
        "end": 8
      }
    ]
  }
}
image

(These are on branch https://github.com/dandavison/delta/pull/1496)