Configure with env variables #350

Closed tboerger closed 3 years ago

To make configuration easier, especially in the world of systemd and Docker, we should make it possible to configure Gitea entirely with environment variables. To work around the variables required within the SSH shell, we can generate an environment file for it automatically on application start. That way we can get rid of the requirement for a custom app.ini.

First, can we specify a configuration file via the command line?

The configuration file can already be defined at least via an environment variable. But I would like to provide flags for the available options and also bind them to environment variables so that we get the full flexibility out of that.

This will be a bitch, we'd have to build a new module on top of the current ini-loader 😒

At least I will give it a try and then we will see how complex that gets.

There are Go packages (I think envflags is an example) that automatically load configs, in order of increasing priority, from a config file, env variables and command line flags. Config variables only have to be defined once and they can be set in all three ways.

We already use codegangsta cli which provides most scenarios

And additionally to that I can think of dotenv.

Any update?

I think setting it to 1.2.0 is fine so far unless somebody provides a PR in time.

Referring to @bkcsoft's comment:

This will be a bitch, we'd have to build a new module on top of the current ini-loader :unamused:

The common approach to avoid code changes (as this is just a matter of deployment) is to use tools that generate the config file from the environment variables and then load the actual application.

Within the scope of Docker, an application called dockerize has become quite popular for these tasks. Its workflow first loads a configuration file template (which could even be generated from the default .ini file easily with some shell or other scripting magic). Then, it looks for environment variables, and inserts them into the template. If there's a default value given in the template, and an environment variable is not present, this value is used.

This way, there's no code changes required, just add dockerize and the template to the image, and change the entrypoint to dockerize (it should be okay if the config file is rendered even if the CMD is modified on startup).
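To illustrate the template-with-defaults workflow described above, here is a minimal sketch in Go using the standard text/template package. It is not dockerize's code or its template syntax: the `env` helper, the `renderTemplate` and `environMap` functions, and the DB_TYPE / DB_HOST variable names are all assumptions made up for this example.

```go
package main

import (
	"fmt"
	"os"
	"strings"
	"text/template"
)

// renderTemplate fills tmplText using values from env. The hypothetical
// template helper `env NAME FALLBACK` returns env[NAME] when the variable
// is present and FALLBACK otherwise.
func renderTemplate(tmplText string, env map[string]string) (string, error) {
	funcs := template.FuncMap{
		"env": func(name, fallback string) string {
			if v, ok := env[name]; ok {
				return v
			}
			return fallback
		},
	}
	t, err := template.New("app.ini").Funcs(funcs).Parse(tmplText)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := t.Execute(&out, nil); err != nil {
		return "", err
	}
	return out.String(), nil
}

// environMap converts os.Environ()-style "KEY=value" entries into a map.
func environMap(environ []string) map[string]string {
	m := make(map[string]string, len(environ))
	for _, kv := range environ {
		if k, v, ok := strings.Cut(kv, "="); ok {
			m[k] = v
		}
	}
	return m
}

func main() {
	// Illustrative template; the variable names are assumptions.
	const tmpl = `[database]
DB_TYPE = {{ env "DB_TYPE" "sqlite3" }}
HOST = {{ env "DB_HOST" "localhost:3306" }}
`
	out, err := renderTemplate(tmpl, environMap(os.Environ()))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Print(out)
}
```

In an image, something like this would run in the entrypoint before the real process starts, writing the rendered text to the config file path.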

I saw Gitea is using the alpine base image. This makes installation as easy as:

apk add --update-cache --repository http://dl-3.alpinelinux.org/alpine/edge/testing/ dockerize

(This needs to be updated as soon as the dockerize package is moved to the main repository, but that's the only issue).

I can spend some time on integrating dockerize into gitea. @lunny @tboerger will you accept a PR on this?

@ivlis Please follow CONTRIBUTING and send PR.

@lunny broken link.

https://github.com/go-gitea/gitea/blob/master/CONTRIBUTING.md

@lunny @TheAssassin alright, thanks, will do.

Any news on this? We can't seem to find how to enable Let's Encrypt when deploying via Docker to GKE, as there's no ENV var for it.

@skddc letsencrypt support hasn't been released in a stable version of gitea. It is in 1.6.0-rc1, but no earlier versions.

We're fine with using a release candidate. But does that mean it's a missing feature? Is there an issue for making it configurable?

So far most config variables are only configurable via the configuration file.

Hence my original question whether there is any news on this particular issue/feature. So the fact that LE support is not in a stable version yet has nothing to do with some variables not being configurable in Docker via ENV vars right now, correct?

When you are deploying to Kubernetes you can't configure all by env variables, I think the best would be a configmap for the app.ini. Just disable the install lock. The first user that gets registered will be automatically an admin user.

Cool, that was exactly my idea. Will do that then. Thank you!

You also have the option of an init container which starts gitea initially to get the database migrated, and you could execute the gitea cli once to create an admin user. Never tried that, but it should work with some basic bash scripting.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs during the next 2 weeks. Thank you for your contributions.

Seriously? Why are people using these bots...?

I'm not sure that confrontational ranting is usually the beginning of a constructive conversation. If you dislike something about this repo or project, especially when you have a strong opinion about it, then why don't you open an issue and calmly explain your point? It might lead to a positive change...

@skddc my reaction was not meant to be constructive, see it as an expression of my opinion in a sort of rude form. Maybe not the best way.

It's just highly annoying that some projects start to use these kinds of third-party services on GitHub, which spam you with mails and notifications. Closing old issues doesn't solve anything in terms of overview in my opinion. Phabricator always says "you can't ignore problems forever" or so.

It's not only third-party projects who've used these bots, also some which I am involved in, so I've got some experience. I know having a thousand open issues can be a challenge to manage for smaller projects. GitHub's interface isn't making this much easier. I can really understand this. But is it really the solution to close old issues? I don't think so.

So, to be a bit more constructive: please turn off this service, @go-gitea.

This bot has helped close issues where users report bugs and don't follow up. Either the bot does this maintenance task of pinging inactive issues, or a maintainer does. As maintainers would prefer to spend their time building Gitea instead of pinging many old issues to see if they are still active, this was the solution. A two-week warning period is given before an issue is closed. It has also helped remind some people that PRs are still open and awaiting feedback.

That being said, this isn't the correct place to discuss this decision. Please feel free to continue this discussion in the forum.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs during the next 2 weeks. Thank you for your contributions.

Ping.

Any update on this? Using a config map did not work out for me since the file will be read only and gitea can not start when trying to overwrite the provided file.

You could use my docker image at webhippie/gitea which provides env variables for every config option :)

Wow, nice work @tboerger. I've just started trying to build a Kubernetes install for Gitea and found this doesn't work out like I expected, so you saved me a lot of effort.

Is there a pull request for this? I'd really like to see this in the mainline tree eventually.

@tboerger that's a crazy amount of work. Does Docker really require us to provide this level of configurability through environment variables alone? Is there really not a better way?!

If we really have to provide this - I don't think we can do this through gitea web simply because we will want to use the same settings code for web and all the other commands. (although now gitea hook and gitea serv are almost shims we could consider reducing their required configuration considerably.)

So, how much logic do you put into these environment variable settings? If it's simply a 1-1 mapping from environment variable to ini value, can we use something completely generic? E.g. a program that reads an ini file, then maps $PREFIX_SECTION_KEY to

[Section]
key=${PREFIX_SECTION_KEY}

then writes the app.ini back out

If we need some more logic than that we could catch the bits that need to do something more.

This could be a new command gitea docker-environment-config -i base-app.ini -o app.ini?
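A rough sketch of what such a generic 1-1 mapper could look like, using only the Go standard library. Everything here is an assumption for illustration: the function names `envName` and `overrideINI`, the GITEA prefix, and the choice to override only keys that already exist in the base file (which sidesteps the question of how to split SECTION from KEY when either contains underscores). It is not the proposed subcommand itself.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// envName builds the variable name PREFIX_SECTION_KEY, upper-cased, with
// every character outside [A-Z0-9] replaced by an underscore.
func envName(prefix, section, key string) string {
	raw := strings.ToUpper(prefix + "_" + section + "_" + key)
	return strings.Map(func(r rune) rune {
		if (r >= 'A' && r <= 'Z') || (r >= '0' && r <= '9') {
			return r
		}
		return '_'
	}, raw)
}

// overrideINI copies base line by line and replaces the value of each
// "key = value" line whose matching variable is reported as set by lookup.
// Keys that are absent from base are never added.
func overrideINI(base, prefix string, lookup func(string) (string, bool)) string {
	var out strings.Builder
	section := ""
	sc := bufio.NewScanner(strings.NewReader(base))
	for sc.Scan() {
		line := sc.Text()
		trimmed := strings.TrimSpace(line)
		switch {
		case strings.HasPrefix(trimmed, "[") && strings.HasSuffix(trimmed, "]"):
			// Section header: remember it for the keys that follow.
			section = strings.TrimSpace(trimmed[1 : len(trimmed)-1])
		case trimmed == "" || strings.HasPrefix(trimmed, ";") || strings.HasPrefix(trimmed, "#"):
			// Blank line or comment: copied through unchanged.
		default:
			if key, _, ok := strings.Cut(trimmed, "="); ok {
				key = strings.TrimSpace(key)
				if v, set := lookup(envName(prefix, section, key)); set {
					line = key + " = " + v
				}
			}
		}
		out.WriteString(line + "\n")
	}
	return out.String()
}

func main() {
	// Illustrative base file; a real tool would read it from disk.
	base := "[database]\nDB_TYPE = sqlite3\nHOST = localhost:3306\n"
	fmt.Print(overrideINI(base, "GITEA", os.LookupEnv))
}
```

With this shape, any special non-1-1 mapping could be handled by wrapping the `lookup` function rather than by changing the generic loop.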

From what I understand, config files are great when you want to make a static config. But docker is designed to work from environment variables. Then you can take the same image, but give it different environment vars and it'll run with a different config depending on your environment. Take for example the idea to run a container in two regions, with separate databases. You'd just run the same image with different database connections in the env and you're good.

I think the best approach, which covers the existing codebase and allows for easy extension into containers might be one of those libraries which abstracts away config and the tools use this to acquire config instead of accessing it directly.

So if config files are available, read them. But if a variable exists in the env, that gets to override them. But if there is a runtime command line option, that overrides everything.

Then you can support all use-cases.

For example, using this might help, because you can set config sources and fall back each time you don't find what you want:

https://micro.mu/docs/go-config.html

So first read command line, no? ok env, no? ok config, no? shit! defaults
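The fallback chain described here (command line, then env, then config file, then defaults) can be sketched in a few lines of Go. This is only an illustration of the precedence idea, not the go-config API and not Gitea's settings code; the `source` type, `resolve`, `fromMap`, the `-http-port` flag and the HTTP_PORT key are all hypothetical names.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// source reports a value for key and whether that source defines it.
type source func(key string) (string, bool)

// resolve returns the value from the first source that defines key, so
// sources must be passed from highest to lowest priority.
func resolve(key, fallback string, sources ...source) string {
	for _, s := range sources {
		if v, ok := s(key); ok {
			return v
		}
	}
	return fallback
}

// fromMap adapts a plain map (for example parsed flags or a parsed
// config file) to a source.
func fromMap(m map[string]string) source {
	return func(key string) (string, bool) {
		v, ok := m[key]
		return v, ok
	}
}

func main() {
	portFlag := flag.String("http-port", "", "HTTP port (hypothetical flag)")
	flag.Parse()

	// Only a flag the user actually passed counts as set.
	flags := map[string]string{}
	if *portFlag != "" {
		flags["HTTP_PORT"] = *portFlag
	}
	// Stands in for values parsed from a config file.
	file := map[string]string{"HTTP_PORT": "3000"}

	port := resolve("HTTP_PORT", "8080", fromMap(flags), os.LookupEnv, fromMap(file))
	fmt.Println("HTTP_PORT =", port)
}
```

Because every command would call the same `resolve`, none of them would need to know where a value came from.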

Then leave the creation of the env vars to me, I can specify any of them I don't want to be default, or I write out a config manually, or I can just run from the command line. Seems reasonable.

But yeah, docker runs by using env vars, you can write an entrypoint which outputs a config if you want, but it's recommended that you use env vars

I'm actually looking at the s6 files for starting gitea and I'm wondering why you only let me set some variables and not just let me set every variable in that file? Some values don't make sense maybe? But it's quite a short list of possible settings and the number of settings I can't set is quite long.

Was it just that nobody wanted to copy and paste all the variables here and apply them as defaults? Or is there some reason why certain variables can't be changed in the container?

I think we still need to write out these env variables to a config file at entry point.

My reason is that the SSH hooks and serv need to be able to find the Gitea server to speak to. Therefore, if we don't write these out, we have to push the entire environment across to SSH processes and into the .ssh/authorized_keys. Further, we risk accidental environment overrides in other hooks and possibly malicious overrides by commands. However, as I say, it might be possible to massively reduce the config needs for serv and hook as they're now thin shims on webserver calls.

The other problem is that afaiu the docker environment you run in isn't passed to the docker environment you exec. So say you want to use a Gitea subcommand in exec you won't be able to use it without copying the entire environment.

Regarding making the docker app.ini file completely settable - I can understand why we haven't done it. It looks very much not fun to do and awkward and fragile to keep updated. I'd far rather do the generic gitea docker-environment-config thing mentioned above. That at least would mean that I never need to change it unless I wanted to add a special non-1-1 mapping.

The other problem is that afaiu the docker environment you run in isn't passed to the docker environment you exec. So say you want to use a Gitea subcommand in exec you won't be able to use it without copying the entire environment.

Just to address this: if I run a Docker container with a set of environment variables and then exec into that container to run another command, the same environment variables I set when I ran the container are available to that command.

Or did I misunderstand you?

Regarding:

into the .ssh/authorized_keys.

I don't think environment variables are a good way to inject this information into the container. You can mount a docker volume containing the keys, or in kubernetes write a configmap and mount that as a file inside the container if you want to do this.

So in fact, pushing this through environment variables, in my opinion, is the wrong thing to do. It's a file, and I think it should be treated like one.

You're misunderstanding the problem in .ssh/authorized_keys.

We have to run gitea serv for each connection.

Therefore we need to be able to configure gitea for each connection.

If you do not save the configuration to a file you have to set that configuration somehow.

Then within each SSH session git will call Gitea hook. That has to be configured too.

No no, I'm saying that you should save it to a file. It's just that how it gets into that file shouldn't be through an environment variable, but a normal file mount, or a kubernetes mechanism of various type.

Or are we talking about different things here?

Yes, you're missing the problem. If you allow arbitrary override of configuration by environment variable without saving it to a config file, how do you configure gitea serv and gitea hook? Even with the changes in #6993 there is still a requirement to configure these calls, but we also don't want their configuration to necessarily be overridden by environment calls.

@zeripath why can't those commands just read their configuration from the environment as well? Should be possible to have one central config component that all commands share and use to e.g., read config files or alternatively read config values from the environment. The environment is available for every process in the container, after all.

If using environment variables is too awkward and you prefer to use config files, a config file generator that runs before the main processes is the way to go, this concept is also widely spread. See https://github.com/go-gitea/gitea/issues/350#issuecomment-326326570.

An alternative would be to do the heavy lifting in the main gitea process, and have the hook stuff use RPC instead of doing anything on its own. Then, some simple auto-discovery of the main process (e.g., using a Unix socket in a well-known location) is all you need in these hooks, no configuration needs to be forwarded.
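As a sketch of the Unix-socket idea, assuming the main process exposes an internal HTTP endpoint on a socket in a well-known location: the hook side only needs the socket path, no other configuration. The function names, the /hook/pre-receive route and the placeholder host name are invented for this example and do not reflect how Gitea's hooks actually talk to the server.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"os"
	"path/filepath"
)

// serveInternal starts an HTTP server listening on a Unix socket at path.
// This stands in for the main process doing the heavy lifting.
func serveInternal(path string, handler http.Handler) (*http.Server, error) {
	ln, err := net.Listen("unix", path)
	if err != nil {
		return nil, err
	}
	srv := &http.Server{Handler: handler}
	go srv.Serve(ln)
	return srv, nil
}

// callInternal is what a thin hook shim could do: dial the well-known
// socket and issue a request. The host in the URL is a placeholder that
// the custom dialer ignores.
func callInternal(socketPath, endpoint string) (string, error) {
	client := &http.Client{Transport: &http.Transport{
		DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
			var d net.Dialer
			return d.DialContext(ctx, "unix", socketPath)
		},
	}}
	resp, err := client.Get("http://gitea.internal" + endpoint)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

// demo runs server and shim in one process for a round trip.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "gitea-sketch")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	sock := filepath.Join(dir, "internal.sock")

	mux := http.NewServeMux()
	mux.HandleFunc("/hook/pre-receive", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	})
	srv, err := serveInternal(sock, mux)
	if err != nil {
		return "", err
	}
	defer srv.Close()
	return callInternal(sock, "/hook/pre-receive")
}

func main() {
	out, err := demo()
	fmt.Println(out, err)
}
```

File permissions on the socket would be the natural place to restrict which users can reach these endpoints.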

They're not being overridden arbitrarily. I'm setting environment data when I run the container. So I want my configuration used, not what is written in any config file. That's why you load config data in the precedence that you do: command line overrides anything in the env, and the env overrides anything in the file.

I want those commands using the config I set. It's literally the point of setting the data in the first place.

I think if you're executing files inside a container they all inherit the same runtime environment as their parents, no? Or at least they should.

If each command used the same config loader, configured to read from command line > env > config file, then every command would automatically be configured with the correct configuration data without ever having to change any of those commands.

The problem is that gitea right now only loads from the config file. Why not replace the config file loader with a service like the one I mentioned in an above comment? Then you'd have no more problems: you'd naturally obtain the right configuration every time, regardless of what subprocess you're using.

I think if you're executing files inside a container they all inherit the same runtime environment as their parents, no? Or at least they should.

Exactly. Unless a process doesn't forward/override those when spawning new subprocesses. I am not sure whether SSH servers will forward the environment they're called with (normally, they just spawn a clean login shell per session, right?), so there you might need a config file.

Well ok, but even in the case of the SSH server. That should use data mounted into the filesystem in terms of the files inside the .ssh directory. So that shouldn't even matter. Right? Given your last sentence. Seems we agree.

@christhomas not really. The processes spawned by the SSH server are gitea tasks (git pre-/post-push hooks for instance; that's what @zeripath means with gitea serv etc.), and these also need the configuration values. If the SSH server doesn't forward that information, they can't do their job.

Workflow is:

git push/pull/...
connects to Gitea's SSH server
remote git runs hooks in the repository (gitea serv etc.)

Now, obviously, these gitea ... commands need to have the same information (read: configuration data) as the main process, or there will be inconsistencies. An environment variable based system relies on an information flow through the SSH server process that OpenSSH spawns for that connection, i.e., OpenSSH must pass through system environment variables (the ones the main server process sees) to the shell in the per-connection process. That's unlikely to happen with OpenSSH, though, for security reasons. SSH is, after all, primarily designed for secure, remote shells, not for usage as a transport protocol for e.g., Git, ...

TL;DR: the Git hooks will most likely not see any of the env vars set for the Docker container. But they need the same config as the main server process.

I already illustrated a few options how to work around that limitation, e.g., using RPC and performing the tasks in the main processes (i.e., you only need some auto-discovery to find the main Gitea process, and need to set up permissions correctly so only SSH induced stuff can access these endpoints), or writing a config file from the environment variables and allow those processes to read that as well.

Ahhh, that's what he meant. I misunderstood and didn't think this was an issue for the SSH daemon.
