Environment Variables - Githubissues

Tishj commented 1 year ago

Environment Variables come in two/three flavours:

exported variables, these become part of the environ and are thus forwarded to executables.
local variables, these are set only in the shell's environment, and are thus not forwarded to executables.
transient variables (not an official term) is a variation of local, only provided to the executable of the command it's set in.

The standard covers it here: https://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_05

I think the transient variables are just a variation of local, because the variable assignments are processed as the cmd_prefix they get executed in the same environment as the executable is.

This might be the section of the standard that covers the behavior of transient variables:

Variables with the export attribute, along with those explicitly exported for the duration of the command, shall be passed to the utility environment variables

export creates an exported variable. set creates a local variable key=val creates a local variable key=val my_executable creates a transient variable unset removes both local and exported variables

Tishj commented 1 year ago

Relevant pages: export set unset

mraasvel commented 1 year ago

The command execution environment describes what is passed to the command's environment. It differs for commands that are not builtins or functions. https://www.gnu.org/software/bash/manual/bash.html#Command-Execution-Environment

The behavior of transient variables is described in https://www.gnu.org/software/bash/manual/bash.html#Environment

In essence, variables exported to commands are only variables with the exported attribute, if a transient variable is present, it is only exported (for the duration of that command) when the command is NOT a builtin or function.

mraasvel commented 1 year ago

I think the easiest way to implement it is if the shell environment containing exported variables and local variables are one and the same. If not, that makes lookup for expansion (and any other variable lookup) much more complex than it needs to be. There simply needs to be metadata on the environment storing the attributes (i.e. export), when a command is executed it simply requests all environment variables that have the export attribute.

Transient variables can be added to the environment after forking, since this won't affect the environment of the parent process.

Tishj commented 1 year ago

Yea I agree The Environment interface should probably have an export method, which acts just like the built-in by the same name create the variable if it doesn't exist, then mark it as exported. Otherwise just mark it as exported

Tishj commented 1 year ago

In any case the very thin wrapper on top of the environ we have inside of the path-resolution branch is not enough

But my CoW fully managed implementation is possibly overkill, so we might want to find a compromise between the two

Using a unordered_map<string, VariableData> is probably a good idea. In the VariableData struct we can indicate if this is exported.

I guess that means we still store everything directly in environ, but mark which variables are marked as export. Then when we materialize we loop through our map and use getenv to get the char* to put into the char** we create for the envp of the execve call

K1ngmar commented 1 year ago

If we do this why still use environ, and not just store everything in the unordered_map directly? AFAIK environ is a unix extension and not in the c-standard. To me it would make more sense to just handle envrionment varaibles on our own.

Tishj commented 1 year ago

If we do this why still use environ, and not just store everything in the unordered_map directly? AFAIK environ is a unix extension and not in the c-standard. To me it would make more sense to just handle environment variables on our own.

I've already said my reason in a PM but I'll repeat them here: memory management

If we manage our own variables, we need to allocate for every variable instead of just calling getenv to get the pointer to managed memory.

EDIT: I was wrong 🙃 getenv only returns the value, not the key=value. through some pointer arithmetic we could get the start of the string, but that's kind of undefined behavior I think, so let's not

Tishj commented 1 year ago

But if we go down that route, then I just want to interject, I already wrote this on the shell_environment branch, granted it's a little overkill because it also caches

42-Ikole-Systems / TMK-SH

Environment Variables #31