NixOS / nix

Nix, the purely functional package manager
https://nixos.org/
GNU Lesser General Public License v2.1

nix-shell shebang can't use node (or any language with `#` comments?) as an interpreter #2570

Closed nicknovitski closed 5 years ago

nicknovitski commented 5 years ago

The second shebang line is passed to the interpreter, and is not a valid comment in javascript.

test.sh:

#! /usr/bin/env nix-shell
#! nix-shell -i node -p nodejs

console.log("oh no")

$ ./test.sh
/Users/nick/Source/test.sh:2
#! nix-shell -i node -p nodejs
^

SyntaxError: Invalid or unexpected token
    at createScript (vm.js:80:10)
    at Object.runInThisContext (vm.js:139:10)
    at Module._compile (module.js:617:28)
    at Object.Module._extensions..js (module.js:664:10)
    at Module.load (module.js:566:32)
    at tryModuleLoad (module.js:506:12)
    at Function.Module._load (module.js:498:3)
    at Function.Module.runMain (module.js:694:10)
    at startup (bootstrap_node.js:204:16)
    at bootstrap_node.js:625:3

I was surprised that this was the case: neither shebang is really part of the script, so I wouldn't expect either one to be part of the input to the other interpreter.

grahamc commented 5 years ago

Unfortunately (and we'll see why later), the shebang is always passed to the interpreter. For example, here the script uses cat as its interpreter:

$ ./test.sh                                                                 
#!cat
hello!

$ cat ./test.sh 
#!cat
hello!

In fact, if we use the binary echo as the interpreter, the nature of how shebangs work is revealed to us:

$ cat ./test.sh                   
#!echo
hello!

$ ./test.sh 
./test.sh

When using a shebang, the name of the file containing the shebang is the first argument to the interpreter specified by the shebang. So when test.sh uses the shebang echo, the program echo ./test.sh is called. When test.sh used cat, cat ./test.sh is executed.
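The same behavior can be reproduced with an absolute interpreter path. This is a minimal sketch, assuming /usr/bin/env exists and an echo binary is on PATH; the temporary directory and file name are only for illustration:

```shell
#!/bin/sh
# Create a throwaway script whose "interpreter" is echo.
dir=$(mktemp -d)
script="$dir/test.sh"
printf '#!/usr/bin/env echo\nhello!\n' > "$script"
chmod +x "$script"

# Executing the script effectively runs: /usr/bin/env echo "$script"
# so the output is the script's own path, not its contents.
output=$("$script")
printf '%s\n' "$output"

rm -r "$dir"
```

The interpreter never sees "hello!" unless it chooses to open the file it was handed, which is exactly what cat does and echo does not.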

Now, the mystery to resolve is: How can this possibly be true when nodejs can be used in a shebang, but doesn't support the # comment character?

Well, the answer is a bit disappointing, and, indeed, bad news for Nix users like you and me: they special case it.

https://github.com/nodejs/node/blob/4dc10ac7d7ddd2cc52e84d1394f7e863d576109f/lib/internal/modules/cjs/helpers.js#L64-L96

In other words, I'm pretty sure there is nothing we can do to solve this.

(edited to remove an Actually.)

nicknovitski commented 5 years ago

Sure, but as far as the system is concerned, isn't the interpreter nix-shell? Why can't it choose what gets passed to the "real" interpreter?

And doesn't it currently do some weird special-case things for ruby and perl?

grahamc commented 5 years ago

The interpreter is nix-shell, you're right. And it does indeed special-case ruby and perl... but neither of those involves modifying the file being interpreted by ruby or perl. Nix just goes ahead and re-execs the intended interpreter with the file, shebang lines included.

In order for Nix to strip the shebang, it would need to either alter the file on disk (not really a viable option) or duplicate the file to a new temporary file, minus the #!nix-shell lines. This second option isn't viable, either, because many scripts depend upon their own name or location on disk.
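To make the trade-off concrete, here is a rough sketch of the temporary-copy approach. This is not something nix-shell does: the strip_nix_shell_lines helper is hypothetical, and its pattern for recognizing #!nix-shell lines is an assumption.

```shell
#!/bin/sh
# Hypothetical helper (not part of nix-shell): write a copy of a script
# without its "#! nix-shell" lines and print the copy's path.
strip_nix_shell_lines() {
    stripped=$(mktemp)
    grep -v '^#![[:space:]]*nix-shell' "$1" > "$stripped"
    printf '%s\n' "$stripped"
}

dir=$(mktemp -d)
cat > "$dir/whoami.sh" <<'EOF'
#! /usr/bin/env nix-shell
#! nix-shell -i sh -p hello
echo "$0"
EOF

copy=$(strip_nix_shell_lines "$dir/whoami.sh")
copy_body=$(cat "$copy")

# Run both with sh directly; each prints the name it was invoked under.
original_name=$(sh "$dir/whoami.sh")
copy_name=$(sh "$copy")
printf 'original: %s\ncopy:     %s\n' "$original_name" "$copy_name"

rm -r "$dir" "$copy"
```

The stripped copy no longer contains the line that would confuse the interpreter, but it reports a different "$0", so anything that locates sibling files relative to the script would break.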

nicknovitski commented 5 years ago

Ah, that last sentence, I hadn't really thought about that. Many scripts, perhaps, but many nix-shell scripts?...but no, either way, how could we universally make the change backwards compatible for all languages? I'm not even sure if it's possible in any.

Multiline shebangs aren't common, but are they outright non-standard? Would it be completely crazy to try to get a change merged to node to support them? If it's going to be a special case in someone's codebase it might as well be there.

grahamc commented 5 years ago

Yeah, basically multi-line shebangs are completely a nix invention. Maybe we could convince them to add them, though? Not sure :)

nicknovitski commented 5 years ago

I can't think of a way to address the problem with changes to nix, so I'll close this issue. Thank you for giving an explanation not only to me, but to the next person who tries to open a similar one.

Shados commented 5 years ago

@nicknovitski would it not be possible to change nix-shell to accept alternative second-line comment types, and use whatever comment type nodejs will accept on the second line? (// I guess?)

copumpkin commented 5 years ago

@grahamc @nicknovitski how about a --exec-temporary-stripped-copy flag to nix-shell that lets people opt into the copied semantics? If I'm writing the nix-shell shebang'd script and pass in that option, I should know that my filename won't stay the same. Basically, Nix can't assume that semantics won't change, but it can assume that if the author of the script says it's fine, it's indeed fine.

copumpkin commented 5 years ago

Or as @Shados says (sorry, I misunderstood earlier if you saw an earlier comment of mine), we could just expand what Nix accepts on the second line. To be as friendly as possible, if nix-shell is called as the interpreter, we could just strip \W+ from the beginning of the next line. That would cover all comment characters that I know of.

grahamc commented 5 years ago

You mean like \W.+? \W+ would strip the #! but then leave nix-shell for the interpreter to choke on. Unfortunately // is out because it is a Nix operator. Invalid Nix, but Nix nonetheless. \W is out because of <?php for PHP, and { for bash scripts. The truth is, we can't know that the second line's #!nix-shell is not semantically part of the script.

PS: nix-shell supports #!nix-shell lines anywhere in the file. For example, this runs bash and prints hello world to the screen:

#!/usr/bin/env nix-shell
#!nix-shell -i bash -p hello
print "hello"
print "world"

This one is deceptive, it runs python and prints hello world to the screen:

#!/usr/bin/env nix-shell
#!nix-shell -i bash -p hello
print "hello"
#!nix-shell -p python2 -i python2
print "world"
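The behavior above is consistent with nix-shell scanning every line of the file for #!nix-shell and collecting the arguments, so that a later -i overrides an earlier one. The sketch below only imitates that scan with sed; the pattern is an assumption for illustration, not nix-shell's actual implementation.

```shell
#!/bin/sh
script=$(mktemp)
cat > "$script" <<'EOF'
#!/usr/bin/env nix-shell
#!nix-shell -i bash -p hello
print "hello"
#!nix-shell -p python2 -i python2
print "world"
EOF

# Collect the arguments of every "#!nix-shell" line, wherever it appears,
# and join them onto one line in file order.
args=$(sed -n 's/^#![[:space:]]*nix-shell[[:space:]]\{1,\}//p' "$script" | paste -s -d ' ' -)
printf '%s\n' "$args"
# prints: -i bash -p hello -p python2 -i python2

rm "$script"
```

The first line is not picked up because /usr/bin/env follows the #! there, while the line buried in the middle of the script is.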

I only just put two-and-two together, but check this out:

$ cat test.sh 
#! /usr/bin/env nix-shell
/*
#! nix-shell -i node -p nodejs
*/
console.log("oh yes")

$ ./test.sh
oh yes

grahamc commented 5 years ago

Note: In the past I considered this behavior a bug, but seeing that it could be used to solve this problem, maybe it should be considered a feature?

grahamc commented 5 years ago

Another follow-up since @shlevy is laughing at me :) The reason I said we should maybe consider it a feature is that we should write a test for the behavior and ensure it continues working. Until now I considered it to mostly just accidentally work. If we're going to say, "the way to use a nix-shell line with nodejs is to wrap it in a comment" then we need to make sure that doesn't change later.

Ninlives commented 5 years ago

@grahamc This is really interesting, I hope this will be considered as a feature and be maintained :smile:

grahamc commented 5 years ago

I was fighting this problem with Erlang. Solved with:

#!/usr/bin/env nix-shell

-define(NOT_MY_SHEBANG, <<"
#!nix-shell -i escript -p erlang
">>).

nicknovitski commented 5 years ago

😮

$ cat nix-shell-node
#! /usr/bin/env nix-shell

`
#! nix-shell --packages nodejs-10_x -i node
`

console.log('oh my god')
$ nix-shell-node
oh my god

I assume this would work with any language with multiline strings, heredocs, etc.

nixos-discourse commented 3 years ago

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/shebang-and-nix-path/12498/1

kolpav commented 1 year ago

Regarding @grahamc's workaround above of wrapping the #! nix-shell line in a /* */ comment: it might have worked with older node versions, but the current behavior is:

$ cat test.sh
#! /usr/bin/env nix-shell
/*
#! nix-shell -i node -p nodejs
*/
console.log("oh yes")

$ ./test.sh
node:internal/errors:484
    ErrorCaptureStackTrace(err);
    ^

TypeError [ERR_UNKNOWN_FILE_EXTENSION]: Unknown file extension ".sh" for /home/kolpav/test.sh
    at new NodeError (node:internal/errors:393:5)
    at Object.getFileProtocolModuleFormat [as file:] (node:internal/modules/esm/get_format:79:11)
    at defaultGetFormat (node:internal/modules/esm/get_format:121:38)
    at defaultLoad (node:internal/modules/esm/load:81:20)
    at nextLoad (node:internal/modules/esm/loader:163:28)
    at ESMLoader.load (node:internal/modules/esm/loader:605:26)
    at ESMLoader.moduleProvider (node:internal/modules/esm/loader:457:22)
    at new ModuleJob (node:internal/modules/esm/module_job:63:26)
    at #createModuleJob (node:internal/modules/esm/loader:480:17)
    at ESMLoader.getModuleJob (node:internal/modules/esm/loader:434:34) {
  code: 'ERR_UNKNOWN_FILE_EXTENSION'
}

Node.js v18.12.1

Renaming to .js fixes the problem, but I would really like to keep the .sh extension. I guess node has started caring about file extensions since then.

EDIT: It's even better. Having this package.json with "type": "module" in the same directory breaks it :upside_down_face:

{
  "name": "foo",
  "type": "module"
}

thiagokokada commented 3 months ago

Got a similar issue with PHP and declare(strict_types=1) (that apparently needs to be the first line inside a PHP script). Workaround:

#!/usr/bin/env nix-shell
<?php
declare(strict_types=1);
#!nix-shell -i php -p php83

AndersonTorres commented 3 months ago

I believe I need something similar for Ruby!

lithp commented 2 months ago

@AndersonTorres ruby recognizes # for comments so it should work just fine:

$ cat ruby-polyglot.rb 
#!/usr/bin/env nix-shell
#! nix-shell -i ruby -p ruby

puts "Hello, world!"
$ ./ruby-polyglot.rb 
Hello, world!

lithp commented 2 months ago

Before finding this thread and its kind-of-crazy workaround, I found my own solution to this problem, which might be helpful to anyone using a language that has no form of multiline comment but does support reading code from stdin:

$ cat node-polyglot.js 
#!/usr/bin/env nix-shell
#! nix-shell -i bash -p nodejs

# some shenanigans,
#   nix-shell requires the second #! line
#   but it confuses nodejs

awk '/# JS-START/{flag=1;next} flag' "$0" | node - ; exit

# JS-START

let fact = n => {
    let result = 1
    while(n > 1) {
        result *= n
        n--
    }
    return result
}

console.log(fact(5));
$ ./node-polyglot.js 
120
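The awk one-liner is the part doing the work: it prints only the lines that follow the # JS-START marker. Here is a minimal sketch of just that step, with cat standing in for "node -" so it runs without Node.js installed; the sample input is made up for illustration.

```shell
#!/bin/sh
# Print every line after the marker; the marker line itself is skipped by `next`.
extracted=$(printf '%s\n' \
    'echo "shell part"' \
    '# JS-START' \
    'console.log(1)' \
    'console.log(2)' |
    awk '/# JS-START/{flag=1;next} flag' | cat)
printf '%s\n' "$extracted"
```

In the full script the pattern also matches the awk line itself, since that line contains the marker text. That appears harmless there, because only a blank line sits between it and the real marker.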