Open gitgithan opened 3 days ago
I think this is an overcomplication of things.
This is a non production setup, so the easiest would be to just keep the
If we want to highlight that example.env
file there..env
files should not be exposed in github etc, we should keep only the .env
file in the docker-compose.yaml
.
The Makefile
will need to be updated too, to adapt to this change.
@gerrykou Thanks for raising the github exposure point. That clarifies that there are 2 separate questions.
Question 1
Both env files do not need to be there, in fact even the env_file
key can be deleted too (I tested this) because .env
is automatically searched for. (my suggestion in last paragraph of 1st message could be improved)
A container's environment can also be set using .env files along with the env_file attribute.
This language implies that env_file
is only needed if extra env files not called .env
are used, otherwise, docker can automatically search .env
placed at the root of the project directory next to the docker-compose.yml.
One argument for env_file
section being there is for it to be self-documenting, like how EXPOSE doesn't actually bind ports (-p 8080:80
does the job) but documents what is exposed. However, I don't see any examples on docker docs listing .env
under env_file
section.
Thus, I would prefer the entire env_file
section to be removed, and let docker automatically search .env
. (PR https://github.com/DataExpert-io/data-engineer-handbook/pull/150 still has env_file section).
This teaches students that "automatic discovery" is a thing, which helps them debug issues with pytest automatically discovering test folders by observing locations of __init__.py
in future.
Question 2
In real work .env
would not be on github, and example.env
would be on github but with values removed. It was good the README required students to create .env
from example.env
to express the concept that example.env
is a template, and I now realize it's just easier for bootcamp scaling/self-service purposes to prepopulate the values of example.env
already.
Further improvements possible but not applicable to bootcamp
This bootcamp has only 1 env file (excluding template). If there are ever 2 env files (excluding template), then further specifications on optionality can be done: https://docs.docker.com/compose/how-tos/environment-variables/set-environment-variables/#additional-information-1
Would love to learn about how teachers can set things up correctly (demonstrating not just the end state, but real workflows) when teaching infrastructure topics, and similar problems faced.
Since this bootcamp aims to train engineers who can get jobs, and interviews would have rounds of code review, a student may fail in the operational details when asked to explain code, even if he gets all the concepts in the bootcamp, thus the emphasis on having a reason for every line of code.
.env
file in the docker compose file for self-documenting.example.env
file should be committed to git for documentation too, having dummy-placeholder or development values only.So my suggestion on PR #150 is how i have done it and i would do it in my workflow.
This issue is both a suggestion and contains questions.
I was wondering why there were 2 items under
env_file
. I was expectingexample.env
to be a template to be copied and tweaked, and not actually used by a running docker. Another issue is havingexample.env
makes users think docker-compose can interpolatedocker-compose.yml
, but my experiments show otherwise.I investigated how precedence works, and what happens if multiple
env_file
contain the same variables. I found https://github.com/docker/compose/blob/v1/docs/Compose%20file%20reference%20(legacy)/version-3.md (this is version-3 readme, not sure what version ourdocker-compose.yml
is) sayingThis description does not match the behavior in our setup. (see 2nd photo)
I checked that
env | sort
locally does not containHOST_PORT
to make the experiment accurate.In 1st photo, I defined
HOST_PORT
only inexample.env
. I expected docker-compose to pick it up, but you can see the terminal output of 1st run sayingHOST_PORT
is not set and a random port 4041 was assigned locally.I had to run a 2nd command with additional
--env-file example.env
for it to read theHOST_POST
fromexample.env
, so this goes against the docs above.Also i'm confused why
docker-compose config
shows that without--env-file example.env
it actually recognizes 5434 inexample.env
. I thoughtenv_file
section in docker-compose.yml and--env-file
on cli are the same.If i uncomment
HOST_PORT
in.env
to make variables from multiple env files clash:Notice config output stays at 5434 from
example.env
but the real port started is on 5433 following.env
. Also this command was started without--env-file example.env
.Summary of what's confusing.
docker-compose config
anddocker ps
showing different ports (2nd photo).docker-compose config
matches docs saying later file in list ofenv_file
is read.HOST_PORT
variable not set warning when it's already set inexample.env
, but can run properly when adding--env-file example.env
on CLI (photo 1).Related issue may be whether variables defined in env files can be interpolated at docker-compose.yml parsing time, or only later during container runtime. I can't find explicit mention whether files under
env_file
will be interpolated into docker-compose.yml. According to https://docs.docker.com/compose/how-tos/environment-variables/variable-interpolation/#ways-to-set-variables-with-interpolation, it works only if the variable is in.env
, andenv_file
in yml is completely ignored. https://www.reddit.com/r/docker/comments/17ekhqy/comment/k658k18/ PaintDrinkingPete here also says the same.In this case we want variables for both
docker-compose.yml
interpolation, and for container runtime variables (those underenvironment
key).example.env
is not helping withdocker-compose.yml
interpolation, thus suggest removing it fromenv_file
section