Closed nartal1 closed 1 month ago
Since this updates the region in our env_vars and not the actual CLI configuration, CLI cmds such as aws emr describe-cluster --cluster-id {cluster_id} might fail because they will try to get the region from the CLI config. We might have to add the region explicitly to these CLI cmds.
Yes @parthosa! You are correct. The region becomes mandatory when --cluster is provided as an argument. We now get a clearer error asking to set it explicitly:
Error invoking CMD <aws emr list-clusters --query 'Clusters[?Name==`emr_perfio_on_filecache_on_us_east_1a`]'>:
|
| You must specify a region. You can also configure your region by running "aws configure".
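One way to pass the region explicitly, as suggested above, is to append the AWS CLI's global --region option to the command arguments before invoking them. This is only a minimal sketch: the helper name with_region and the list-of-arguments representation are assumptions for illustration, not the code in this PR.

```python
from typing import List, Optional


def with_region(cmd_args: List[str], region: Optional[str]) -> List[str]:
    """Return a copy of cmd_args with '--region <region>' appended.

    Hypothetical helper: leaves the command unchanged when no region is
    known or when the caller already passed --region.
    """
    if region and "--region" not in cmd_args:
        return [*cmd_args, "--region", region]
    return list(cmd_args)


# Example: build the describe-cluster command with an explicit region.
describe_cmd = with_region(
    ["aws", "emr", "describe-cluster", "--cluster-id", "j-EXAMPLE"],
    "us-east-1",
)
```

With this approach the command no longer depends on the region stored by "aws configure".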
This fixes https://github.com/NVIDIA/spark-rapids-tools/issues/1018.
This PR falls back to the default region/zone (where applicable) for CLI commands when the region/zone is not set by the user. Earlier it would throw an error that did not show the exact reason. With this PR, it continues with the default region, with a warning that the region was not set and that the default values from the environment variable are being used.
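The fallback described above could look roughly like the following. It is a hedged sketch rather than the actual change in this PR: the function name resolve_region, the env_key and fallback parameters, and the "us-west-2" value are hypothetical, while AWS_DEFAULT_REGION is the environment variable the AWS CLI itself reads.

```python
import logging
import os
from typing import Mapping, Optional

logger = logging.getLogger(__name__)


def resolve_region(user_region: Optional[str],
                   env: Optional[Mapping[str, str]] = None,
                   env_key: str = "AWS_DEFAULT_REGION",
                   fallback: str = "us-west-2") -> str:
    """Pick the region to use for CLI commands.

    Hypothetical helper: prefers the user-supplied region, then the
    environment variable, then a hard-coded fallback (an assumed value).
    """
    env = os.environ if env is None else env
    if user_region:
        return user_region
    default_region = env.get(env_key) or fallback
    # Warn instead of failing so the user knows a default was applied.
    logger.warning("Region was not set; using default region %s", default_region)
    return default_region
```

The point of the design is that a missing region produces a visible warning and a usable default instead of an opaque CLI failure.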
In addition, this PR updates the way the remaining environment variables are set in sp_types.py. The earlier condition would miss some of the environment variables.
Tested on the following platforms:
databricks-azure already has a default defined.
Dataproc failure
Dataproc completion with this PR
EMR failure
EMR completion with this PR