nat-n / poethepoet

A task runner that works well with poetry.
https://poethepoet.natn.io/
MIT License
1.4k stars 58 forks source link

Doesn't respect environment over .env file #129

Open nitedani opened 1 year ago

nitedani commented 1 year ago

For example:

[tool.poe]
envfile = ".env"
[tool.poe.tasks]
migrate = "alembic -x dbname=$POSTGRES_URL upgrade head"

Issue: If I run poetry run poe migrate then poe will use the POSTGRES_URL set in the .env file, not POSTGRES_URL set in the environment.

Expectation: poetry run poe migrate first tried to find POSTGRES_URL in the environment, and falls back to looking it up in the .env file only if POSTGRES_URL is not set in the environment

nat-n commented 1 year ago

Hi @nitedani, thanks for the feedback.

It works that way because that's what I thought made sense at the time. However I would be interested if you could point to examples of other prominent tools that take the approach you suggest?

I guess it's not quite what you want but defining an argument for the task like so might be a workaround:

[tool.poe.tasks]
migrate = "alembic -x dbname=$POSTGRES_URL upgrade head"
args = [{ name = "POSTGRES_URL", options = ["--url"], default = "${POSTGRES_URL}" }]
poe migrate --url="postgresql://..."
nitedani commented 1 year ago

https://www.npmjs.com/package/dotenv Dotenv works the way I described. "What happens to environment variables that were already set? By default, we will never modify any environment variables that have already been set. In particular, if there is a variable in your .env file which collides with one that already exists in your environment, then that variable will be skipped."

nitedani commented 1 year ago

@nat-n https://docs.docker.com/compose/environment-variables/envvars-precedence/ Docker and Docker compose also works the way I described.

nat-n commented 1 year ago

@nitedani, I think your proposal makes sense. But it does complicate things a bit about how env vars are resolved for a task.

Today there's a clear (if not obvious) resolution order for env vars:

  1. os.environ <- Lowest precedence
  2. tool.poe.envfile
  3. tool.poe.env
  4. tool.poe.tasks.\.envfile
  5. tool.poe.tasks.\.env
  6. tool.poe.tasks.\.uses
  7. tool.poe.tasks.\.args <- Highest precedence

The guiding principle is that the closer the config definition is to the task definition/run the higher the precedence.

Perhaps a workaround would be if the envfile supported some bash variable substitution syntax like MY_VAR=${MY_VAR:=newValue}. This would allow one to specify whether the host env or the envfile should have precedence on a variable by variable basis.

On the other hand I think moving towards the behaviour you expect would imply this alternative resolution order:

  1. tool.poe.envfile
  2. tool.poe.tasks.\.envfile
  3. os.environ
  4. tool.poe.env
  5. tool.poe.tasks.\.env
  6. tool.poe.tasks.\.uses
  7. tool.poe.tasks.\.args

In this structure, anything defined within the pyproject.toml still has precedence of os.environ, but envfiles do not.

Gonna need to think on this. Different perspectives are welcome.

nitedani commented 1 year ago

I think the source of this issue is that the .env file needs to be explicitly set in pyproject.toml. So in this case one can expect that the variables in the explicitly set .env file are higher priority than the shell variables. The other tools I mentioned, the .env file is loaded implicitly, so there is no expectation that the implicitly loaded .env file is higher priority than the explicitly set shell variables. But this was not clear to me, because of how other tools work. I expected the same behavior. For example, docker prioritizes the shell variables over the explicitly set env_file in docker-compose.yml, so it works like your second example.

nat-n commented 1 year ago

@nitedani Thanks for helping to clarify different ways of thinking about the envfile.

The way I think about the role of the envfile is that it's the place for config that doesn't belong in the pyproject.toml because it:

Subsequently if you wish to override values from the envfile either per host then you can always use a higher precedence envfile, or you can define named argument for the task to override a variable per task invocation.

The one scenario I can think of off the top of my head where this doesn't work well is if you want to pass any extra cli arguments to the underlying command, and so don't want to define named arguments on the task... addressing #69 would help with that.

So to conclude, I think the docs need clarifying, adding variable templating support to the envfile parsing and addressing #69 should improve the situation, but I'm not yet convinced that it makes sense to switch to the alternate resolution order I described above.

I'll leave this issue open for now in case someone will articulate another perspective on this.