balena-io-modules / balena-compose-parse

Parse docker-compose files into a general, usable and fully typed object
Apache License 2.0
6 stars 0 forks source link

Validation allows badly named environment vars #61

Open pipex opened 2 years ago

pipex commented 2 years ago

The schema defines the environment entry as list_or_dict, but it does not set any restrictions on the environment keys, which should only be set to /^[a-zA-Z_][a-zA-Z0-9_]*$/. This allows users to push composition to the cloud that will get rejected by the supervisor (silently before 12.11.15).

Should we also add that validation here?

PS: probably the same thing is true for labels

pipex commented 2 years ago

ping @pdcastro @robertgzr, what do you think about this?

jellyfish-bot commented 2 years ago

[pipex] This issue has attached support thread https://jel.ly.fish/c996cc34-31e7-4411-a09d-ec15498061a8

pdcastro commented 2 years ago

which should only be set to /^[a-zA-Z_][a-zA-Z0-9_]*$/

Why does the supervisor require that character set? What is the authoritative definition of "badly named"? :-) Does the "standard docker-compose tool" define a schema that restricts the character set?

In the past, I came across people stating that environment variable names cannot contain certain characters simply because some shells, like bash or Windows PowerShell or cmd.exe, impose their own restrictions, but that's just a limitation of those shells. If I recall correctly, processes running on an operating system (including docker containers) can set environment variable names with pretty much any characters other than ASCII zero/null (null-terminated C string), and pass those env vars around to subprocesses, without involvement of a shell. This also holds true to Docker containers. In the past I tested (can't find the examples now) that it was possible to pass environment variables to an app in a docker container with names that were not supported by bash, by not using bash.

If the supervisor enforces a certain character set, consider it may be a supervisor bug (insufficiently justified limitation?). For example, the set you mentioned (/^[a-zA-Z_][a-zA-Z0-9_]*$/) does not include the dot ('.') character, and Cameron (previous supervisor maintainer) in the past agreed that the supervisor should accept the '.' character, which was needed by some user applications:

services:
  gatewaybridge:
    image: arribada/lora-gateway-bridge
    environment:
      - INTEGRATION.MQTT.AUTH.GENERIC.SERVER=tcp://mosquitto:1883
...

The last issue was closed for unrelated reasons, but the dot character should certainly be allowed in environment variable names.

Having said all that:

pipex commented 2 years ago

Great points as always @pdcastro,

As far as I could find and from direct testing, there are no restrictions on docker-compose or docker for creating any asciii environment variables. I did find this information linking to an Open Group specification pointing out the following

Environment variable names used by the utilities in the Shell and Utilities volume of IEEE Std 1003.1-2001 consist solely of uppercase letters, digits, and the '_' (underscore) from the characters defined in Portable Character Set and do not begin with a digit. Other characters may be permitted by an implementation; applications shall tolerate the presence of such names.

Which is where I think the supervisor restriction comes from. From testing I can see this can cause problems on shell scripts but not on other languages

$ docker run --rm -e "A.B=hello" -ti alpine sh
/ # env
HOSTNAME=6c79ffa000b6
SHLVL=1
HOME=/root
A.B=hello
TERM=xterm
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/
/ # echo $A.B
.B
/ # echo ${A.B}
sh: syntax error: bad substitution
/ #
$ docker run --rm -e "A.B=hello" -ti balenalib/amd64-alpine-node node
Welcome to Node.js v17.0.1.
Type ".help" for more information.
> console.log(process.env['A.B'])
hello

I would be OK with removing the restriction as I think it lines up with the "enablement over control" ethos. My only concern is that we start to get comments on support with respect to the sh: syntax error: bad substitution message, but I suppose it they should be easy to solve.

Regarding docker labels, that I didn't mention before, but that are also syntax checked by the supervisor, it seems that there are some recommendations from Docker but are not enforced and I don't see much reason to enforce them in the supervisor either.

What do you think @cywang117 @20k-ultra should we remove the environment variable syntax checks on the supervisor?

I'll bring up this discussion on FD too

pipex commented 2 years ago

Fun fact, looks like even unicode key names are ok by docker

docker run --rm -e "😅=hello" -ti alpine sh
/ # env
😅=hello
HOSTNAME=51232640ff91
SHLVL=1
HOME=/root
TERM=xterm
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/
cywang117 commented 2 years ago

@pipex I think it makes sense that if Docker and docker-compose allow it, the Supervisor shouldn't restrict it to minimize user confusion around env vars as they relate to containers.

pdcastro commented 2 years ago

should we remove the environment variable syntax checks on the supervisor?

I suggest spending a bit of time digging through the git history to find out the earliest commits and pull requests where validation was introduced. There may not be matching github issues, but there'd be PRs and commit messages. My memory may betray me, but I seem to recall character validation being introduced in the supervisor for a case where the user would set a funny variable name that perhaps the balena API would refuse to save in the database, causing the supervisor to get in serious trouble (like a restart loop, poor error handling at the time) that required support agents or fleet ops to intervene. Chances are that, nowadays, the supervisor would error more gracefully in similar conditions, but it should be tested.

By the way, if we are going there (and yes I think we should), it should probably be a joint team effort to investigate and test multiple components and centralise a source of truth of some sort for schemas and validation (maybe this repo, or maybe a new npm module, or even the balena SDK?! - although the API would need to import it too, so probably not the SDK) Components such as:

I suspect each of these components will have some hardcoded validations based on assumptions such as what bash supports or some post that the developer found in StackOverflow, then separately amended over time based on bug reports against that component -- e.g. a bug report against the web dashboard causes the web dashboard to update its own hardcoded rules; a bug report against the supervisor causes the supervisor to update its own hardcoded rules and so on, causing components to get out of sync even if the starting point was similar.

pipex commented 2 years ago

I did actually. That check seems to be as old as the supervisor state engine and there doesn't seem to be a specific reference to the validation :(

https://github.com/balena-os/balena-supervisor/pull/482

As any misconfiguration will cause the supervisor to get into a loop, I think having a way to propagate the engine output to the dashboard might be a better solution than trying to catch specific errors early, at least from the supervisor side.

You are right that the cause of the check may be related to the API, I'll move this conversation to FD