Should argv include runtime parts (binary/script/runtime args)?

CanadaHonk commented 8 months ago

argv is traditionally all arguments, which would include the runtime binary, the script being run, and runtime arguments ("runtime parts"). Should ours?

Existing examples:

process.argv (Node): includes runtime parts
Deno.args (Deno): does not include runtime parts (they can be obtained separately with Deno.execPath(), etc.)

lucacasonato commented 7 months ago

I would argue against including "runtime args" (args interpreted by the runtime) in the args exposed to users. My reasoning is that arg parsers are otherwise not portable across runtimes, because invocation may happen in different ways across runtimes. For example, in Node you exec with node <filename>, whereas in Deno you use deno run <filename>. If the arg list exposes the raw [deno, run, <filename>] and [node, <filename>], libraries need to be aware that they must start reading after the third arg if arg[0] == deno, and after the second arg for node.

This is significantly complicated for various reasons:

The first arg may not be called deno or node. It may be called deno.exe, or node-canary, or whatever really.
Because both Node and Deno permit flags in between the deno run / node and <filename>, tools need to know to ignore anything after node / deno run that starts with - or -- until they've reached the file path. This is then further complicated by args that do not require = to associate values (such as deno run --seed 123 <filename>). As a tool you'd need to be aware of all flags of deno run, their associativity rules, and reimplement this in your library. This is infeasible to expect from users, and also puts an undue burden on runtimes to not introduce new flags that don't require = to associate values.
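To make the second point concrete, here is a minimal sketch of the kind of heuristic a library would need if it were handed the raw argument list. The function name naiveUserArgs and the sample invocations are hypothetical, and this is not how any runtime or library actually does it; it only illustrates where the guessing breaks down.

```javascript
// Hypothetical heuristic: guess where user args start in a raw argv.
// Assumes the only possible subcommand is "run" and that every runtime
// flag starts with "-" and carries its value after "=".
function naiveUserArgs(rawArgv) {
  let i = 1; // skip the runtime binary, whatever it is called
  if (rawArgv[i] === "run") i++; // skip Deno's subcommand
  while (i < rawArgv.length && rawArgv[i].startsWith("-")) i++; // skip runtime flags
  return rawArgv.slice(i + 1); // skip what we believe is the script path
}

// Fine when runtime flags take no space-separated value:
const ok = naiveUserArgs(["node", "--inspect", "main.js", "--port", "8080"]);
// ok is ["--port", "8080"]

// Breaks for a flag with a space-separated value: "123" is mistaken for
// the script path, so the real script path leaks into the user args.
const wrong = naiveUserArgs(["deno", "run", "--seed", "123", "main.js", "--port", "8080"]);
// wrong is ["main.js", "--port", "8080"]
```

Fixing the second case would require the library to know that --seed consumes the next argument, which is exactly the per-runtime flag knowledge described above.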

I think exposed args should be an array of all string arguments intended to be passed to the user code. Runtimes should themselves define what "intended to be passed to the user code" means. In Deno and Node this would mean all flags after <filename>. We can add some examples to the spec to explain intention here.
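As an illustration of the intended result, the sketch below returns only the user-facing args on Node and Deno using their existing APIs (process.argv and Deno.args). The helper name userArgs and its env parameter are hypothetical; env exists only so the sketch can be exercised with stand-in objects, and nothing here is a proposed spec API.

```javascript
// Sketch: return only the args intended for user code.
// `env` defaults to globalThis; stand-in objects can be passed instead.
function userArgs(env = globalThis) {
  if (env.Deno && Array.isArray(env.Deno.args)) {
    return [...env.Deno.args]; // Deno.args already excludes runtime parts
  }
  if (env.process && Array.isArray(env.process.argv)) {
    return env.process.argv.slice(2); // drop runtime binary and script path
  }
  return [];
}

// node main.js --port 8080
const fromNode = userArgs({
  process: { argv: ["/usr/bin/node", "/app/main.js", "--port", "8080"] },
});

// deno run main.js --port 8080
const fromDeno = userArgs({ Deno: { args: ["--port", "8080"] } });

// fromNode and fromDeno are both ["--port", "8080"]
```

With args defined this way, an arg parser sees the same list no matter which runtime launched the script.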

CanadaHonk commented 7 months ago

I agree, although it could possibly also be useful to expose the raw argv or binary path or runtime args somewhere too, just not as the main args API, e.g. for checking whether a runtime flag was used. However, that could be left to each runtime to do however it wants (or already does?) without being standardized here.

lucacasonato commented 7 months ago

Yeah - I think exposing arg information passed to runtimes is better done in runtime-specific APIs that present this as structured data. Otherwise you run into the arg parsing problem anyway.

bakkot commented 7 months ago

Some prior discussion in the repo which eventually became node's util.parseArgs.

Fun fact: util.parseArgs actually does the "strip runtime parts" thing internally, since you don't actually want to parse those things in an arg parser. It just never ended up getting exposed to users.
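For reference, a small sketch of util.parseArgs in use. The option names and sample arguments are made up for illustration; args is passed explicitly here, and when it is omitted Node defaults to process.argv with the executable and script path removed.

```javascript
const { parseArgs } = require("node:util");

// Explicit `args` keeps the example independent of how it is launched.
const { values, positionals } = parseArgs({
  args: ["--verbose", "-o", "out.txt", "input.txt"],
  options: {
    verbose: { type: "boolean" },
    output: { type: "string", short: "o" },
  },
  allowPositionals: true,
});

// values.verbose is true, values.output is "out.txt"
// positionals is ["input.txt"]
```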

styfle commented 7 months ago

> I agree, although it could possibly also be useful to expose the raw argv or binary path

I think that the args API can be separate from the binary path API. For example, see process.execPath, which is commonly used when forking the process.

CanadaHonk commented 7 months ago

I agree. I put that in the explainer/readme but forgot to put it here. It seems best to leave aspects like that intentionally not standardized and allow implementers to handle them in their own APIs as they see fit.

paperdave commented 7 months ago

i think it was a silly choice to make process.argv contain the binary path to the runtime as well as the script path (it isn't always guaranteed to be set in some extreme edge cases).

i don't think there is any use case for a standard way to expose runtime args (aka process.execArgv) because these are platform-dependent, and branching on them is runtime-dependent code (the flags to a runtime should not be standardized).

it would be nice to have a way to get the name of the script being run (think help messages that explain a usage like "${arg0} [options]"). import.meta.url is standardized already. taking the basename of the url works in most situations, but one edge case here is if you have a file named a that is a symlink to b.js and contains a shebang like #!/usr/bin/env node, you will want to see "a", but import.meta.url will probably print out the path to b.js.

i don't know if this situation actually has a good solution right now. i notice bun build --compile sets process.argv0 to the desired value, but obviously this doesn't work outside of the bundled binary (it points to bun/node) or with shebangs.
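The basename approach mentioned above can be sketched as follows. The helper name scriptNameFromUrl and the sample path are hypothetical; in an ES module you would pass import.meta.url rather than a literal.

```javascript
// Sketch: derive a display name for usage messages from a module URL.
function scriptNameFromUrl(moduleUrl) {
  const { pathname } = new URL(moduleUrl);
  // Take the last path segment and undo percent-encoding (e.g. "%20").
  return decodeURIComponent(pathname.slice(pathname.lastIndexOf("/") + 1));
}

const name = scriptNameFromUrl("file:///home/me/project/b.js");
const usage = `${name} [options]`;
// usage is "b.js [options]"
```

If import.meta.url reports the symlink target, as the comment suspects it will, this prints b.js even when the user invoked the symlink a, which is the edge case described above.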

tmikov commented 6 months ago

The intuitive approach is to store the script name in argument zero, similar to Python. It follows the existing conventions. For use cases that don't need it, it is easy to ignore.
