mvdan / sh

A shell parser, formatter, and interpreter with bash support; includes shfmt
https://pkg.go.dev/mvdan.cc/sh/v3
BSD 3-Clause "New" or "Revised" License
7.16k stars 339 forks source link

shell: issues with shell.Fields and backslashes on Windows #356

Closed andreynering closed 5 years ago

andreynering commented 5 years ago

Hey there!

Two bugs regarding backslashes on Windows were caught on https://github.com/go-task/task/pull/170.

package main

import (
    "fmt"

    "mvdan.cc/sh/shell"
)

func main() {
    fields, err := shell.Fields("cmd\\task", nil)
    if err != nil {
        panic(err)
    }
    fmt.Println(fields)
}

This outputs cmdtask instead of the expected cmd/task on Windows.

Maybe this is by design, but in this case this should be documented, and it should probably output cmd\task instead of cmdtask.

package main

import (
    "fmt"

    "mvdan.cc/sh/shell"
)

func main() {
    fields, err := shell.Fields("~", nil)
    if err != nil {
        panic(err)
    }
    fmt.Println(fields)
}

This outputs C:UsersAndrey instead of the expected C:/Users/Andrey.

mvdan commented 5 years ago

Thanks for opening this issue. It's hard for me to debug issues on Windows, as I don't have a machine running that system, so it might take me a while to figure this out. Help welcome.

Also, have you encountered the same issues without using GitBash? These layers of abstractions could be changing the environment enough to break some things. For example, I've seen Bash on Windows treat environment variables as case sensitive, while they're case insensitive on Windows itself.

andreynering commented 5 years ago

For example, I've seen Bash on Windows treat environment variables as case sensitive, while they're case insensitive on Windows itself.

That's surprising. 🤔

Have you encountered the same issues without using GitBash? These layers of abstractions could be changing the environment enough to break some things.

I haven't tried it outside GitBash, but I think this specific bug probably happen due Gosh itself, maybe by only searching / os path while expanding while ignoring \, or something like that.

mikynov commented 5 years ago

I've just tried GitBash 2.20.1.windows.1 and CMD 10.0.16299.904 and there is no difference. cmd\task expanded to cmdtask ~ expanded to C:Usersmikynov Thanks for taking care about this issue.

mvdan commented 5 years ago

Apologies for the slowness; I've been able to reproduce this on Linux via Wine.

The tricky part is that tons of tests fail on Wine, because it's a barebones Windows setup. No bash, no coreutils, et cetera. Up until now, we did rely on those, as Windows on Travis does have them.

So right now I'm working on making all tests properly pass on Wine. Once that's done, it's going to be trivial for me to debug simple Windows issues. See https://twitter.com/mvdan_/status/1096176519746109440

In any case; I'll probably have a commit cherry-picked into the stable branch sometime soon.

andreynering commented 5 years ago

Hi @mvdan,

I just did some quick debugging and found a if block that removes backslashes: https://github.com/mvdan/sh/blob/ba273b7c3288fe6934817035be95a491436f0169/expand/expand.go#L514-L525

Commenting this block makes both of the above scripts work. The path is returned with backslashes instead of forward slashes, though. Not sure if that is the expected behavior.

This block is probably there for a reason, though. I just don't know why. 🙂 I also didn't tried on Mac/Linux to check if any test fail.

In theory we could just use filepath.ToSlash(p) if we're sure it's a path, but I suppose that could break other strings that could have characters like \n.

mvdan commented 5 years ago

Yes, I too found the piece of code causing the bug. And it is indeed on purpose. That's also part of why this fix is tricky.

What this code does is get rid of escape characters. For example, on bash:

$ echo foo\bar
foobar

We have two options. The first is to keep things as is, and tell users to do shell.Fields("cmd\\\\task", nil) when they really do mean the Windows path separator.

The second is to not support backslash escapes on Windows. That would fix the cases like the one in this bug, but it would also mean that the shell would behave differently depending on the operating system. For example, on bash:

$ echo foo\$0
foo$0

The way we currently implement this, it always prints the same as bash. If we made the change to fix the case in the original comment, should we print foo\$0, or foo\gosh? If we do the former, path separators can affect the syntax, which is not ideal. If we do the latter, we'd have to modify the syntax package to have a "windows mode" so that backslashes are a regular character, and don't escape the following character.

I don't see a clear winner. No solution is going to be perfect, as we're retrofitting a Unix shell on a Windows system. I just need to figure out which solution is going to be the simplest for the user, and the easiest for me to maintain. As it is now, I think the current approach is quite simple.

andreynering commented 5 years ago

@mvdan It would be interesting to check what Git Bash on Windows does.

I think I'd prefer option 1, i.e. just document that backslashes on shell.* functions are not supported and ask users to use filepath.ToSlash if the variable is supposed to be a path.

Having to disallow backslash escapes on Windows wouldn't be nice.

andreynering commented 5 years ago

FYI: Git Bash have the same behavior as this lib currently, i.e., it requires the use of / and \s are just removed from arguments.

That reinvorces my last comment: this lib should probably keep this specific bahevior, but have this documented.


~ returning a path without slashes is still a problem, though. And I think I have an idea on why it work on Git Bash.

Git Bash:

$ pwd
/c/Users/andre/

Gosh:

$ gosh -c pwd
C:\Users\andre

Git Bash treats all paths with forwards slashes internally, so it doesn't conflict with the behavior of backward slashes.

mvdan commented 5 years ago

~ returning a path without slashes is still a problem, though.

Completely agreed; this needs a fix one way or another.

I'm not sure if always using unix-y path separators is a good idea. I need to do some research.

mvdan commented 5 years ago

I think I'd prefer option 1, i.e. just document that backslashes on shell.* functions are not supported and ask users to use filepath.ToSlash if the variable is supposed to be a path.

Just to clarify, this is not what option 1 is. Backslashes are supported, but carrying their POSIX Shell syntax meaning. To write Windows paths, one simply needs to ensure that the backslashes aren't merely used to escape the following character:

$ echo C:\\foo\\bar
C:\foo\bar
$ echo 'C:\foo\bar'
C:\foo\bar

Of course, using forward slashes can be an option, but I presume some Windows programs will still expect backslashes as path separators.

mvdan commented 5 years ago

Tilde expansion should be fixed in v2 now, and I've added a short note at the top of the godoc; see https://godoc.org/mvdan.cc/sh/shell.

I backported the fix from v3 to v2, since it was reasonably small and safe. But I won't be tagging a v2 release right now. The vast majority of users use cmd/shfmt or syntax, and the two bugfixes in the master branch have only affected the interpreter on Windows. So I don't want to ping everyone (including package managers) to do a bugfix update when, realistically, only go-task would really benefit from the fixes.

So please pin the latest v2 commit from master for now :)

andreynering commented 5 years ago

@mvdan Thanks!