mvdan / sh

A shell parser, formatter, and interpreter with bash support; includes shfmt
https://pkg.go.dev/mvdan.cc/sh/v3
BSD 3-Clause "New" or "Revised" License
7.35k stars 348 forks source link

Sequential Backslashes are removed during interpretation. #1106

Closed cfogrady closed 3 weeks ago

cfogrady commented 1 month ago

Sequential Backslashes seem to be reduced to a single backslash. Is there a purposeful reason behind this or am I correct in believing this to be a bug?

Example: gosh -c 'echo "hello \\\\\\ world"' yields hello \ world

Running the same thing in bash: bash -c 'echo "hello \\\\\\ world"' yields hello \\\ world matching my expectations.

In code:

func TestMvdanShellPreservesSequentialBackslashes(t *testing.T) {
    parser := syntax.NewParser(syntax.Variant(syntax.LangBash))
    prog, _ := parser.Parse(strings.NewReader(`echo "Hello \\\\\\ World"`), "$0")
    writer := &bytes.Buffer{}
    bash, _ := interp.New(interp.StdIO(nil, writer, writer))
    bash.Run(context.Background(), prog)
    want := `Hello \\\ World`
    if writer.String() != want {
        t.Errorf(`echo "Hello \\\\\\ World" = %v, want %v`, writer.String(), want)
    }
}

I believe this is an issue in the interp as running a syntax.DebugPrint on prog yields:

Parts: []syntax.WordPart (len = 1) {
.  .  .  .  .  .  .  .  .  0: *syntax.Lit {
.  .  .  .  .  .  .  .  .  .  ValuePos: 1:7
.  .  .  .  .  .  .  .  .  .  ValueEnd: 1:25
.  .  .  .  .  .  .  .  .  .  Value: "Hello \\\\\\\\\\\\ World"
.  .  .  .  .  .  .  .  .  }
.  .  .  .  .  .  .  .  }
mvdan commented 1 month ago

Thanks for reporting, this does seem like a bug.

Out of curiosity, how did you find this? Are you using the interpreter via the Go API?

cfogrady commented 1 month ago

Yup. Using the interp Go API for some shell scripts with gnarly arguments including some regexps. Some of the regexps were off after going through my program and after debugging, I managed to localize the issue to the library as above.

cfogrady commented 1 month ago

Further investigation seems to indicate this is a problem specific to double quoted literals. gosh -c "echo 'hello \\\\\\ world'" yields what I would expect: hello \\\ world.

cfogrady commented 1 month ago

I'll prep a PR after lunch. Offending code seems to be: https://github.com/mvdan/sh/blob/392da98b03853d889822bc2f62d61f4f92f37fa4/expand/expand.go#L505C4-L518C5

Changing to the below seems to fix it and still pass all the test cases I could think of as relevant.

buf := cfg.strBuilder()
for i := 0; i < len(s); i++ {
    b := s[i]
    if b == '\\' && i+1 < len(s) {
        switch s[i+1] {
        case '"', '\\', '$', '`': // special chars
            i++
            b = s[i] // write the special char, skipping the backslash
        }
    }
    buf.WriteByte(b)
}
cfogrady commented 4 weeks ago

Got side tracked by some other issues Friday afternoon. PR to fix the backslash issue: https://github.com/mvdan/sh/pull/1107