Open pamelafox opened 1 month ago
^^ Added related issues/PRs to the description.
@vhvb1989 might be a good use-case for Named/Specialized hooks feature?
IIRC, at the top of the wish-list is to use the python app as the starting point.
For example, we currently support running hooks where azd automatically sets all the .env values as env vars for the script defined in the hook, but this requires folks to use azd as the starting point, either by running the azd command that triggers the hook or by running azd hooks run.
We also considered having something like azd launch python-app.py, but again, this means folks depending on azd to run their apps/scripts.
In a world where we want people to just run python foo.py and have it easy to pick up azd environment values, I would probably aim for a python lib (or sdk) for azd. Folks would just add that lib to their list of requirements, and adding it to the top of their app would handle pulling the azd env's values. Sounds like a good open source project to me :) LMK what you think @pamelafox
Yeah I tend to agree, for Python we would like to be able to say "python bla.py" or "py bla.py" or whatever works on an OS. Or even use other Python-specific runners like the new uv packager, which would mean "uv run bla.py".
I feel a little silly making a PyPI package out of my 10 line script, but I could do it! Or are you saying you'd do it? I wouldn't have it auto-import, as we sometimes only want to pull the values in when we're running locally. So for azure-search-openai-demo, I first check an env var like "RUNNING_IN_PRODUCTION" and if it's not set, load from azd.
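A minimal sketch of the opt-in loading described above, assuming the values are read by shelling out to `azd env get-values` and that its output is plain `KEY="value"` lines. The helper names (`parse_env_lines`, `load_azd_env`, `main`) are made up for illustration, and this is not the actual script from azure-search-openai-demo.

```python
import os
import subprocess


def parse_env_lines(text: str) -> dict[str, str]:
    """Parse KEY="value" lines (the format assumed for `azd env get-values`)."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        # Strip one pair of surrounding double quotes, if present.
        # Escape sequences inside the value are NOT decoded in this sketch.
        if len(raw) >= 2 and raw[0] == raw[-1] == '"':
            raw = raw[1:-1]
        values[key.strip()] = raw
    return values


def load_azd_env() -> dict[str, str]:
    """Copy the currently selected azd environment into this process only."""
    result = subprocess.run(
        ["azd", "env", "get-values"], capture_output=True, text=True, check=True
    )
    values = parse_env_lines(result.stdout)
    os.environ.update(values)
    return values


def main() -> None:
    # Only pull values from azd for local runs; in production the host sets them.
    if not os.getenv("RUNNING_IN_PRODUCTION"):
        load_azd_env()
```

Because `os.environ.update` only touches the running Python process, nothing is written to a .env file and nothing reaches the parent shell.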
I'm curious what Yohan would want for JS environments though, I've asked him to comment.
In a world where we want people to just run python foo.py and have it easy to pick azd environment values, I would probably aim for a python lib (or sdk) for azd. Folks would just add that lib to the list of requirements and adding it to the top of their app would handle pulling the azd env's values.
I think this is problematic because now we're saying that 'azd' abstractions are leaking into the app, and we have an extra package that we have to maintain for every language. With .env files the benefit is that you don't know where they came from, they just exist.
Creating an sdk lib for azd would bring more things. We would probably start with getting the .env values, but I can think of more scenarios, like listing environments, finding azd projects, or even calling commands from your app.
I can definitely help/collaborate on a project like this, but it would require more than one happy developer XD.
I think an sdk lib could be added under the Azure SDK umbrella. Instead of an Azure service, the target would be a local azd service or just the cli.
@richardpark-msft I like .env files, but then we have the issues I discussed above, where the variables leak into the global environment and get picked up by azd on the next run. Plus, when you switch environments, you need to remember to update the .env. So we'd need to address those issues if we wanted to keep using .env files as our standard approach.
In the .env libraries I've used they usually have an option of specifying the actual .env file you load (with the default being .env!). Would this be solvable if we could just have built-in filtering in the env get-values command?
Looking at it now, it has --environment string. Allowing me to specify a list of variables, or even a wildcard/regex, would be enough for me to easily compose a .env file without too much trouble.
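Until something like that exists, the same effect can be approximated outside azd. The sketch below filters the text produced by `azd env get-values` by a regex on the key; `filter_env_lines` is a hypothetical helper, not an azd feature, and it assumes one `KEY=value` pair per line.

```python
import re


def filter_env_lines(text: str, pattern: str) -> str:
    """Keep only KEY=value lines whose key matches the regex `pattern`."""
    key_re = re.compile(pattern)
    kept = []
    for line in text.splitlines():
        key, sep, _ = line.partition("=")
        # Lines without "=" (blank lines, stray text) are dropped.
        if sep and key_re.search(key.strip()):
            kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")
```

For example, passing the pattern `^(AZURE_OPENAI_ENDPOINT|VITE_)` keeps that one named variable plus everything starting with `VITE_`, and leaves out AZURE_ENV_NAME, AZURE_LOCATION and the rest.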
In almost all contexts I've been working with (and not only JS/Node.js contexts), .env files are the standard way to set up local dev environments, and developers are used to working with them.
The fact that .env files leak into the global env is really not great and would probably need a separate issue sent to the Python extension, but the fact that azd env get-values produces unwanted extra env vars is not good either:
- It makes the .env more complicated than needed (making it harder to understand how a project/sample is set up)
@richardpark-msft I'm not sure what you mean by having built-in filtering in env get-values, but having something like this would definitely help the issue:
- azd env get-values > .env would ONLY output the values set as outputs in the infra, or set manually with the azd env set command
- azd env get-values --all > .env would output everything (if needed), as in the current behavior
Introducing the extra flag would surely be a breaking change, but it makes unwanted scenarios less likely.
The filter could help, as AZURE_ENV_NAME is definitely the most problem-causing of the env variables. However, many people do currently have a flow where an outputted env variable is also an input env variable for main.parameters.json, so we would need to discourage that flow. Otherwise you'll have weird things leaking, like some customization for one environment leaking into an environment where you don't want it.
I'll file an issue with the Python extension about the global leaking. I don't know if other language extensions also do that.
For the purpose of this specific issue, I wonder if, along the lines of what @richardpark-msft is proposing, azd could provide a filter mechanism with get-values. Either:
- azd env get-values --filter APP_* - Glob expression filter
- azd env get-values --filter APP_ - Regular expression filter
I'd lean towards regex here -- most times ^APP_ works just as fine as APP_, and regex would be most flexible.
I also wonder if we may want to think about:
azd env get-values --filter APP_ --set-or-append .env
- where --set-or-append would only append or set keys that are present in azd env.
In general, the get-values gesture needs to cater towards "easy app settings referencing needs". In the future, this should expand to more output formats, like appsettings.json for .NET developers.
What isn't covered here by this simplistic proposal is that there are cases where users want environment values to "flow" into their client-side builds without the necessary exporting of .env -- this was summarized by @sinedied previously in #3456.
people do currently have a flow where an outputted env variable is also an input env variable for main.parameters.json, so we would need to discourage that flow.
I created #4387 since this is also a topic of interest that I think about a lot. Would love to hear your feedback here.
Should we be prefixing certain env variables with APP_ then? We don't have a convention currently, so I wouldn't have a regex that would work with my current templates, but I could move towards a convention.
@weikanglim Adding a prefix to get the filtering we want would not work here: many frameworks or SDKs need specific env var names, and most Azure tools use AZURE_* for env vars, which is also used by AZD.
@sinedied Happy to learn more from your example.
In my mind, with the simplistic model, you could simply rerun azd env get-values targeting all the variables you care about, for example:
azd env get-values --filter AZURE_CLIENT_ID --set-or-append web/.env
azd env get-values --filter ^VITE_ --set-or-append web/.env
We can also build towards something where "app referencing" is more of a first-party concept that can be expressed via some azure.yaml configuration -- happy to learn from any observations you have.
Should we be prefixing certain env variables with APP_ then? We don't have a convention currently, so I wouldn't have a regex that would work with my current templates, but I could move towards a convention.
If we're talking full regexes we can also do something like this:
(var1|var2|var3)
Using alternation, and that would also be valid. So just having that support, on its own, would be enough to specify all the variables we want to grab.
I likely would not use a complex regex and just build the file using individual azd env get-value calls as I'm doing now, but there's still the issue that azd environment variables can be tainted by the global env variables.
Stepping back a bit, what if we could be more specific about the env variable references in main.parameters.json? Right now, something like $AZURE_OPENAI_KEY can come from either .azure/CURRENT-ENV/.env or the global environment. That caused so many issues for developers with one of my templates, because I had mistakenly given my azd variable the same name as a commonly set environment variable without realizing it, and the value didn't have the same meaning.
What if we were instead explicit in main.parameters.json, like:
$azdenv:AZURE_OPENAI_KEY
And that value could only come from the current azd environment?
Then we'd have less accidental variable name collisions, less effects from global env variable tainting, etc.
And we could still have ones that are allowed to come from a global env, like $GITHUB_ACTIONS
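To make the proposed semantics concrete, here is a rough sketch of how such references could be resolved. The `$azdenv:` prefix is the hypothetical syntax from this comment (azd does not support it), and `resolve` is an illustrative helper, not azd's actual substitution logic.

```python
import re
from typing import Mapping

# Hypothetical syntax from the proposal: "$azdenv:NAME" vs. plain "$NAME".
_REF = re.compile(r"\$(azdenv:)?([A-Za-z_][A-Za-z0-9_]*)")


def resolve(template: str, azd_env: Mapping[str, str], global_env: Mapping[str, str]) -> str:
    """Substitute references, keeping the two sources strictly separate."""

    def replace(match: re.Match) -> str:
        scoped, name = match.group(1), match.group(2)
        if scoped:
            # Namespaced references never fall back to the global environment;
            # a missing name raises KeyError instead of silently colliding.
            return azd_env[name]
        # Plain references come from the global environment only.
        return global_env.get(name, "")

    return _REF.sub(replace, template)
```

With this split, a stray AZURE_OPENAI_KEY exported in the shell can no longer shadow the value stored in the current azd environment, while $GITHUB_ACTIONS still resolves from the shell.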
I likely would not use a complex regex and just build the file using individual azd env get-value calls as I'm doing now
Just wondering, how are you currently supporting users that already have an existing .env file?
For example, say I have RUNNING_IN_PRODUCTION already stored in .env, and the scenario is that I want to run azd provision and expect that, after provisioning, azd would update the AZURE_KEYVAULT_ENDPOINT variable but keep RUNNING_IN_PRODUCTION and the other variables intact.
Then we'd have less accidental variable name collisions
One thing that could help here: if azd encourages/supports a mapping of AZURE_VAR_xxx instead of AZURE_xxx. I think this makes it very intentional on the environment variable being present. Happy to discuss this further on #4404.
For situations where users start off with a .env file, I usually just tell them how to update that file after running up, so I say "Copy the value from azd env get-value AZURE_KEYVAULT_ENDPOINT into the .env file". That's slightly error-prone if they paste the wrong value, but it does mean I can support the "local development first" scenario.
Another option would be to write a shell script that auto-updates a .env according to azd env get-value.
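Such an auto-update script could look roughly like this in Python. It is only a sketch of the "set or append" behavior discussed earlier in the thread: `set_or_append` is a hypothetical helper that works on the text of an existing .env, and fetching the new values (for example from `azd env get-value`) is left to the caller.

```python
def set_or_append(existing: str, updates: dict[str, str]) -> str:
    """Return .env text with `updates` applied; unrelated lines are left as-is."""
    remaining = dict(updates)
    out = []
    for line in existing.splitlines():
        key = line.partition("=")[0].strip()
        if "=" in line and key in remaining:
            # Overwrite the first occurrence of a key that has a new value.
            out.append(f'{key}="{remaining.pop(key)}"')
        else:
            out.append(line)
    # Keys not already in the file are appended at the end.
    out.extend(f'{key}="{value}"' for key, value in remaining.items())
    return "\n".join(out) + "\n"
```

A file that already contains RUNNING_IN_PRODUCTION keeps that line untouched, while AZURE_KEYVAULT_ENDPOINT is replaced in place or added if it was missing.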
What if we were instead explicit in main.parameters.json, like:
$azdenv:AZURE_OPENAI_KEY
And that value could only come from the current azd environment?
Could the output variables also be given a standard prefix? If the variables also get formatted with a specific name, they could easily be matched by a regex in other spots as well.
@sinedied Happy to learn more from your example.
In my mind, with the simplistic model, you could simply rerun azd env get-values targeting all the variables you care about, for example:
azd env get-values --filter AZURE_CLIENT_ID --set-or-append web/.env
azd env get-values --filter ^VITE_ --set-or-append web/.env
We can also build towards something where "app referencing" is more of a first-party concept that can be expressed via some azure.yaml configuration -- happy to learn from any observations you have.
I really don't think using regex or filters this way when you need to extract multiple values is user-friendly. It might work for our samples, which have a small number of env vars, but in real-world scenarios you have dozens of vars, and you don't always control their naming as it comes from framework and library requirements.
What I hear as feedback from customers is that they're looking for more control over what the tooling does (automatically), not more complexity, and I think this falls in this use case.
What @pamelafox is proposing for avoiding conflict seems more in the right direction, and I would even go further to explicitly "namespace" all input vars, similar to how you do it in GitHub Actions for example:
- "value": "$azd.AZURE_LOCATION" for values generated/coming from AZD
- "value": "$env.OPENAI_API_KEY" for env values
@pamelafox drew my attention to this thread after I mentioned to her I created a small Python azd env loading library. While it doesn't cover the full scope of this discussion, it addresses some of the use cases for Python applications, so I figured it might be useful to share here: https://pypi.org/project/dotenv-azd/
I personally use that lib to switch environments using azd env select and just run my Python scripts knowing they're azd aware and will just pick up whatever env vars are in the currently selected environment.
IIRC, at the top of the wish-list is to use the python app as the starting point.
For example, we currently support running hooks where azd automatically set all the .env values as env vars for the script defined in the hook, but this requires folks to use azd as the starting point. Either by running the azd command that triggers the hook, or by running azd hooks run.
We also considered having something like azd launch python-app.py, but again, this means folks depending on azd to run their apps/scripts.
In a world where we want people to just run python foo.py and have it easy to pick azd environment values, I would probably aim for a python lib (or sdk) for azd. Folks would just add that lib to the list of requirements and adding it to the top of their app would handle pulling the azd env's values. Sounds like a good open source project to me :) LMK what you think @pamelafox
I like the azd launch <command> idea: it covers a lot of use cases and allows keeping the code azd-agnostic. I don't believe it makes people depend on azd. They can use whatever env loading technique they want, but this gives them the option of an easy path.
May I suggest run instead of launch? And maybe put it under azd env run for consistency with other env-related commands.
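For illustration only, the behavior such a command might have can be sketched in a few lines of Python. Neither azd launch nor azd env run is an existing command here; `run_with_env` is a hypothetical helper, and how `extra_env` gets populated from the azd environment is out of scope.

```python
import os
import subprocess


def run_with_env(command: list[str], extra_env: dict[str, str]) -> int:
    """Run `command` with `extra_env` layered over the current environment."""
    child_env = {**os.environ, **extra_env}
    # Only the child process sees the merged values; the parent's os.environ
    # is never modified, so nothing lingers after the command exits.
    return subprocess.run(command, env=child_env).returncode
```

The design point is that the values live only for the duration of the child process, which sidesteps the global leaking described in the original post.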
We currently have many templates that need access to azd environment variables to be able to run either hooks, scripts, or local dev server.
There are two ways that templates often do that:
Write the full azd env into a .env file, and then load it with a language package like python-dotenv:
azd env get-values > .env
Use shell commands to write the env variables into the environment, and call programs from the shell script:
Why those are bad
Both of these approaches are problematic as they can leak the azd environment variables into the global environment.
For example, the " > .env" approach leaks into the global environment when you're using the Python extension, as that extension (as a default behavior) automatically copies .env variables into the global environment. It took me months to figure out why my global env was getting tainted constantly.
The Powershell code above can also leak into the Windows shell, depending on how the rest of the script issues commands.
It is very bad when the full azd env variables leak into a global environment, since they include AZURE_ENV_NAME, AZURE_LOCATION, AZURE_SUBSCRIPTION_ID. If you then try to switch environments, you will find azd constantly trying to deploy with the values of the old environment. It's very confusing and caused me days of work over the last year trying to figure out what was happening.
Better approaches
I am now taking one of two approaches:
1) Using a script to auto-write only the necessary variables, and making sure those variables aren't also inputs in main.parameters.json: https://github.com/Azure-Samples/azure-openai-keyless-python/pull/7/files#diff-129e0db6b0e28f105813de4b3029d708f8012191253104aaadc5086e69a51aa3
That's not super robust, since it has the constraint that you can't also have those variables as inputs, but it can work for some simple samples.
2) Using a Python script to dynamically load the current azd environment, using python-dotenv, so that it is only ever used inside that Python program.
https://github.com/Azure-Samples/azure-search-openai-demo/pull/1986/files#diff-6099ee740b8b4a7f97ac1e1dfff11776df721ed84c635e510c9aba8f922ca612
That is my current preferred approach, though it has the drawback of feeling a little overly complex for samples that are designed as teaching samples.
3) We provide vscode tasks that use the azd-provided dotenv as well, but that only works if you're running from VS Code, and we need to provide non-VS Code scripts as well.
EVEN better approaches??
These are related issues and PRs around this issue:
https://github.com/Azure/azure-dev/pull/4078
https://github.com/Azure/azure-dev/pull/4131
https://github.com/Azure/azure-dev/issues/1163
https://github.com/Azure/azure-dev/issues/4067