microsoft / AzureTRE

An accelerator to help organizations build Trusted Research Environments on Azure.
https://microsoft.github.io/AzureTRE
MIT License

Troubleshooting Azure CycleCloud Deployment Steps #3933

Closed (BiologyGeek closed this 1 day ago)

BiologyGeek commented 4 months ago

Hello team,

I wanted to deploy Azure CycleCloud and have referred to this document. However, I encountered some confusion while attempting to deploy this service. After running the relevant command to approve the license terms, I tried the following commands using a dev container but without success:

$ make cyclecloud
$ make shared_service_bundle BUNDLE=cyclecloud
$ make workspace_service_bundle BUNDLE=cyclecloud

$ cd AzureTRE/templates/shared_services/cyclecloud
➜ .../templates/shared_services/cyclecloud/terraform $ bash deploy.sh

The document states:

The CycleCloud shared service template needs registering with the TRE as per <../../tre-admins/registering-templates/> The templates can be found at templates/shared_services/cyclecloud.

I am wondering which method is correct for registering this service. What went wrong? Is this method correct, or did I miss something?

Also, by opening this URL (https://cyclecloud-{My-Own-TRE_ID}.{My-Selected-LOCATION}.cloudapp.azure.com/) in the web browser of a virtual machine, the browser displays the 'DNS_PROBE_FINISHED_NXDOMAIN' error.

Danny-Cooke-CK commented 4 months ago

Hi. For any bundle, I would always default to these instructions: https://microsoft.github.io/AzureTRE/latest/tre-admins/registering-templates/ What version of AzureTRE are you using?

BiologyGeek commented 4 months ago

> Hi. For any bundle, I would always default to these instructions: https://microsoft.github.io/AzureTRE/latest/tre-admins/registering-templates/ What version of AzureTRE are you using?

Thank you @Danny-Cooke-CK!

After trying to run $ make bundle-build DIR=templates/shared_services/cyclecloud, this error occurred:

@MyCodespaceUsername ➜ /workspaces/AzureTRE-Deployment (main) $ make bundle-build DIR=templates/shared_services/cyclecloud

»»» 🧩  Building templates/shared_services/cyclecloud bundle with Porter...

╔══════════════════════════════════════╗
║          Azure TRE Makefile          ║
╚══════════════════════════════════════╝

»»» ✅ Checking pre-reqs...

Checking for Azure CLI...

Loading local environment variables...

Checking for Docker...

Checking for porter...
»»» 🔨 Azure details from logged on user 
»»»   • Subscription: My Azure Subscription
»»»   • Tenant:       My Tenant ID

docker ps failed, setting docker.sock permissions
/bin/bash: line 4: cd: templates/shared_services/cyclecloud: No such file or directory
make: *** [/home/vscode/AzureTRE/Makefile:189: bundle-build] Error 1

Also after trying to run $ make workspace_service_bundle BUNDLE=cyclecloud, this error occurred:

@MyCodespaceUsername ➜ /workspaces/AzureTRE-Deployment (main) $ make workspace_service_bundle BUNDLE=cyclecloud
make bundle-build bundle-publish bundle-register \
DIR="/home/vscode/AzureTRE//templates/workspace_services/cyclecloud" BUNDLE_TYPE=workspace_service
make[1]: Entering directory '/workspaces/AzureTRE-Deployment'

»»» 🧩  Building /home/vscode/AzureTRE//templates/workspace_services/cyclecloud bundle with Porter...

╔══════════════════════════════════════╗
║          Azure TRE Makefile          ║
╚══════════════════════════════════════╝

»»» ✅ Checking pre-reqs...

Checking for Azure CLI...

Loading local environment variables...

Checking for Docker...

Checking for porter...
»»» 🔨 Azure details from logged on user 
»»»   • Subscription: My Azure Subscription
»»»   • Tenant:       My Tenant ID

/bin/bash: line 4: cd: /home/vscode/AzureTRE//templates/workspace_services/cyclecloud: No such file or directory
make[1]: *** [/home/vscode/AzureTRE/Makefile:189: bundle-build] Error 1
make[1]: Leaving directory '/workspaces/AzureTRE-Deployment'
make: *** [/home/vscode/AzureTRE/Makefile:283: workspace_service_bundle] Error 2

The same issue occurred with $ make bundle-build DIR=templates/workspace_services/azureml.

@MyCodespaceUsername ➜ /workspaces/AzureTRE-Deployment (main) $ make bundle-build DIR=templates/workspace_services/azureml

»»» 🧩  Building templates/workspace_services/azureml bundle with Porter...

╔══════════════════════════════════════╗
║          Azure TRE Makefile          ║
╚══════════════════════════════════════╝

»»» ✅ Checking pre-reqs...

Checking for Azure CLI...

Loading local environment variables...

Checking for Docker...

Checking for porter...
»»» 🔨 Azure details from logged on user 
»»»   • Subscription: My Azure Subscription
»»»   • Tenant:       My Tenant ID

docker ps failed, setting docker.sock permissions
/bin/bash: line 4: cd: templates/workspace_services/azureml: No such file or directory
make: *** [/home/vscode/AzureTRE/Makefile:189: bundle-build] Error 1

But $ make workspace_service_bundle BUNDLE=azureml worked.
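The failures above are all `cd` errors on a path that does not exist from where `make` runs: the relative `DIR=templates/...` values are not resolved against the AzureTRE checkout, and `workspace_service_bundle BUNDLE=cyclecloud` looks under `templates/workspace_services/`, whereas the documentation places the CycleCloud template under `templates/shared_services/cyclecloud`. A minimal sketch of a lookup that returns an absolute bundle directory is shown here; `find_bundle_dir` is a hypothetical helper and not part of the AzureTRE Makefile, and the set of template folders it searches is an assumption based on the paths that appear in this thread.

```shell
# Hypothetical helper (not part of AzureTRE): print the absolute directory of a
# bundle by looking for its porter.yaml under the known template folders.
# $1 = root of the AzureTRE checkout, $2 = bundle name (e.g. cyclecloud)
find_bundle_dir() {
  local tre_root="$1" bundle="$2" kind candidate
  for kind in shared_services workspace_services workspaces; do
    candidate="${tre_root}/templates/${kind}/${bundle}"
    # A Porter bundle directory is expected to contain a porter.yaml.
    if [ -f "${candidate}/porter.yaml" ]; then
      printf '%s\n' "${candidate}"
      return 0
    fi
  done
  echo "no porter.yaml found for bundle '${bundle}' under ${tre_root}/templates" >&2
  return 1
}
```

It could then be used as `make bundle-build DIR="$(find_bundle_dir /workspaces/AzureTRE-Deployment/AzureTRE cyclecloud)"`, which matches the absolute `DIR` form used later in this thread.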


Update: No success with these commands:

$ make workspace_bundle DIR=templates/shared_services/cyclecloud
$ make shared_service_bundle DIR=templates/shared_services/cyclecloud
$ make user_resource_bundle DIR=templates/shared_services/cyclecloud

@MyCodespaceUsername ➜ /workspaces/AzureTRE-Deployment (main) $ make workspace_bundle DIR=templates/shared_services/cyclecloud
make bundle-build bundle-publish bundle-register \
DIR="/home/vscode/AzureTRE//templates/workspaces/" BUNDLE_TYPE=workspace
make[1]: Entering directory '/workspaces/AzureTRE-Deployment'

»»» 🧩  Building /home/vscode/AzureTRE//templates/workspaces/ bundle with Porter...

╔══════════════════════════════════════╗
║          Azure TRE Makefile          ║
╚══════════════════════════════════════╝

»»» ✅ Checking pre-reqs...

Checking for Azure CLI...

Loading local environment variables...

Checking for Docker...

Checking for porter...
»»» 🔨 Azure details from logged on user 
»»»   • Subscription: My Azure Subscription
»»»   • Tenant:       My Tenant ID

Error: open porter.yaml: no such file or directory
Error: open porter.yaml: no such file or directory
make[1]: *** [/home/vscode/AzureTRE/Makefile:189: bundle-build] Error 1
make[1]: Leaving directory '/workspaces/AzureTRE-Deployment'
make: *** [/home/vscode/AzureTRE/Makefile:279: workspace_bundle] Error 2

And no success with these commands:

$ make workspace_bundle cyclecloud
$ make shared_service_bundle cyclecloud
$ make user_resource_bundle cyclecloud

@MyCodespaceUsername ➜ /workspaces/AzureTRE-Deployment (main) $ make workspace_bundle cyclecloud
make bundle-build bundle-publish bundle-register \
DIR="/home/vscode/AzureTRE//templates/workspaces/" BUNDLE_TYPE=workspace
make[1]: Entering directory '/workspaces/AzureTRE-Deployment'

»»» 🧩  Building /home/vscode/AzureTRE//templates/workspaces/ bundle with Porter...

╔══════════════════════════════════════╗
║          Azure TRE Makefile          ║
╚══════════════════════════════════════╝

»»» ✅ Checking pre-reqs...

Checking for Azure CLI...

Loading local environment variables...

Checking for Docker...

Checking for porter...
»»» 🔨 Azure details from logged on user 
»»»   • Subscription: My Azure Subscription
»»»   • Tenant:       My Tenant ID

Error: open porter.yaml: no such file or directory
Error: open porter.yaml: no such file or directory
make[1]: *** [/home/vscode/AzureTRE/Makefile:189: bundle-build] Error 1
make[1]: Leaving directory '/workspaces/AzureTRE-Deployment'
make: *** [/home/vscode/AzureTRE/Makefile:279: workspace_bundle] Error 2

BiologyGeek commented 4 months ago

Surprisingly, after re-running this command, it seems to have worked: $ make shared_service_bundle BUNDLE=cyclecloud. The terminal output was something like this:

Initializing modules ...
Downloading git :: https://github.com/microsoft/terraform-azurerm-environment-configuration.git?ref=0.2.0 for terraform_azurerm_environment_configuration ...
- terraform_azurerm_environment_configuration in .terraform/modules/terraform_azurerm_environment_configuration
Initializing provider plugins ...
Reusing previous version of hashicorp/random from the dependency lock file
Reusing previous version of hashicorp/azurerm from the dependency lock file
Installing hashicorp/random v3.4.2 ..
Installed hashicorp/random v3.4.2 (signed by HashiCorp)
Installing hashicorp/azurerm v3.5.0 ...
Installed hashicorp/azurerm v3.5.0 (signed by HashiCorp)

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
Success! The configuration is valid.

Login Succeeded
CLI already signed in
Registering template ...
id

tre-shared-service-cyclecloud  Azure CycleCloud Azure CycleCloud is an enterprise-friendly tool for orchestrating and managing High Performance Computing (HPC) environments on Azure.
make[1]: Leaving directory '/workspaces/AzureTRE-Deployment'

Now the 'Azure CycleCloud' option is visible in the 'Shared Services' section of the TRE GUI. After clicking on the 'Create' button for 'Azure CycleCloud' and submitting, this error occurred: deployment failed - There was an issue with the latest deployment or update for this resource. Please see the Operations panel within the resource for details.

Here is the content of the Operations panel:

Resource Id: ################################
Resource Path: /shared-services/################################
Resource Version: 0
Status: deployment_failed
Action: install
Message: ################################: Error message: Unable to find image 'mytreacr.azurecr.io/tre-shared-service-cyclecloud@sha256:################################' locally
╷
│ Error: Value for undeclared variable
│
│ A variable named "arm_client_id" was assigned on the command line, but the
│ root module does not declare a variable of that name. To use this value,
│ add a "variable" block to the configuration.
╵
╷
│ Error: Value for undeclared variable
│
│ A variable named "arm_client_secret" was assigned on the command line, but
│ the root module does not declare a variable of that name. To use this
│ value, add a "variable" block to the configuration.
╵
╷
│ Error: Value for undeclared variable
│
│ A variable named "arm_tenant_id" was assigned on the command line, but the
│ root module does not declare a variable of that name. To use this value,
│ add a "variable" block to the configuration.
╵
╷
│ Error: Value for undeclared variable
│
│ A variable named "arm_use_msi" was assigned on the command line, but the
│ root module does not declare a variable of that name. To use this value,
│ add a "variable" block to the configuration.
╵
error running command /cnab/app/terraform /usr/bin/terraform apply -auto-approve -input=false -var arm_client_id=******* -var arm_client_secret= -var arm_environment=public -var arm_tenant_id=******* -var arm_use_msi=true -var tre_id=mytre -var tre_resource_id=################################: exit status 1
Error: error running command /cnab/app/terraform /usr/bin/terraform apply -auto-approve -input=false -var arm_client_id=******* -var arm_client_secret= -var arm_environment=public -var arm_tenant_id=******* -var arm_use_msi=true -var tre_id=mytre -var tre_resource_id=################################: exit status 1
1 error occurred:
* mixin execution failed: package command failed /cnab/app/cnab/app/mixins/terraform/runtimes/terraform-runtime install
╷
│ Error: Value for undeclared variable
│
│ A variable named "arm_client_id" was assigned on the command line, but the
│ root module does not declare a variable of that name. To use this value,
│ add a "variable" block to the configuration.
╵
╷
│ Error: Value for undeclared variable
│
│ A variable named "arm_client_secret" was assigned on the command line, but
│ the root module does not declare a variable of that name. To use this
│ value, add a "variable" block to the configuration.
╵
╷
│ Error: Value for undeclared variable
│
│ A variable named "arm_tenant_id" was assigned on the command line, but the
│ root module does not declare a variable of that name. To use this value,
│ add a "variable" block to the configuration.
╵
╷
│ Error: Value for undeclared variable
│
│ A variable named "arm_use_msi" was assigned on the command line, but the
│ root module does not declare a variable of that name. To use this value,
│ add a "variable" block to the configuration.
╵
error running command /cnab/app/terraform /usr/bin/terraform apply -auto-approve -input=false -var arm_client_id=################################ -var arm_client_secret= -var arm_environment=public -var arm_tenant_id=################################ -var arm_use_msi=true -var tre_id=mytre -var tre_resource_id=################################: exit status 1
Error: error running command /cnab/app/terraform /usr/bin/terraform apply -auto-approve -input=false -var arm_client_id=################################ -var arm_client_secret= -var arm_environment=public -var arm_tenant_id=################################ -var arm_use_msi=true -var tre_id=mytre -var tre_resource_id=################################: exit status 1
1 error occurred:
* mixin execution failed: package command failed /cnab/app/cnab/app/mixins/terraform/runtimes/terraform-runtime install
╷
│ Error: Value for undeclared variable
│
│ A variable named "arm_client_id" was assigned on the command line, but the
│ root module does not declare a variable of that name. To use this value,
│ add a "variable" block to the configuration.
╵
╷
│ Error: Value for undeclared variable
│
│ A variable named "arm_client_secret" was assigned on the command line, but
│ the root module does not declare a variable of that name. To use this
│ value, add a "variable" block to the configuration.
╵
╷
│ Error: Value for undeclared variable
│
│ A variable named "arm_tenant_id" was assigned on the command line, but the
│ root module does not declare a variable of that name. To use this value,
│ add a "variable" block to the configuration.
╵
╷
│ Error: Value for undeclared variable
│
│ A variable named "arm_use_msi" was assigned on the command line, but the
│ root module does not declare a variable of that name. To use this value,
│ add a "variable" block to the configuration.
╵
error running command /cnab/app/terraform /usr/bin/terraform apply -auto-approve -input=false -var arm_client_id=################################ -var arm_client_secret= -var arm_environment=public -var arm_tenant_id=################################ -var arm_use_msi=true -var tre_id=mytre -var tre_resource_id=################################: exit status 1
Error: error running command /cnab/app/terraform /usr/bin/terraform apply -auto-approve -input=false -var arm_client_id=################################ -var arm_client_secret= -var arm_environment=public -var arm_tenant_id=################################ -var arm_use_msi=true -var tre_id=mytre -var tre_resource_id=################################: exit status 1
2 errors occurred:
* container exit code: 1, message: <nil>
* required output connection_uri is missing and has no default
; Command executed: porter install "################################" --reference mytreacr.azurecr.io/tre-shared-service-cyclecloud:v0.5.5 --param arm_environment="public" --param arm_use_msi="true" --param azure_environment="AzureCloud" --param id="################################" --param tfstate_container_name="tfstate" --param tfstate_resource_group_name="MyTRE" --param tfstate_storage_account_name="mytrestorage" --param tre_id="mytre" --force --credential-set arm_auth --credential-set aad_auth

Created: Sun May 19 2024 05:21:32 GMT+0100 (a day ago)
Updated: Sun May 19 2024 05:23:07 GMT+0100 (a day ago)
User: My User
Steps
1) Main step for ################################
Same error message as in the Message field above.

Additional information:
UI Version: 0.5.21
API Version: 0.18.5
Cosmos DB: OK
Service Bus: OK
Resource Processor: OK


Also, I tried to delete the Azure CycleCloud service using the GUI by clicking on the 'Delete' button, but this error occurred: Deleting failed - There was an issue with the latest deployment or update for this resource. Please see the Operations panel within the resource for details.


What could be the root cause of this deployment failure? @marrobi, is this issue (https://github.com/microsoft/AzureTRE/issues/2406) still open? For a successful deployment, do some additional ports need to be opened, or did I miss a step?

tim-allen-ck commented 4 months ago

Hi @BiologyGeek, I'm looking into the issue. I believe the problem is these four lines in the porter.yaml file in the cyclecloud folder:

arm_client_id: ${ bundle.credentials.azure_client_id }
arm_client_secret: ${ bundle.credentials.azure_client_secret }
arm_tenant_id: ${ bundle.credentials.azure_tenant_id }
arm_use_msi: ${ bundle.parameters.arm_use_msi }

They need to be removed; then try to build and publish the service.
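The Terraform errors quoted above ("Value for undeclared variable") mean that a `-var` was passed for a name the bundle's Terraform root module never declares. A minimal sketch of a check for that mismatch is shown here, assuming the variable declarations live in `*.tf` files directly inside the bundle's `terraform/` folder; `undeclared_vars` is a hypothetical helper, and the simple `grep` pattern is an approximation rather than a full HCL parser.

```shell
# Hypothetical helper: print every variable name that is NOT declared as
#   variable "<name>" { ... }
# in the *.tf files of the given Terraform directory.
# $1 = Terraform directory, remaining arguments = variable names to check
undeclared_vars() {
  local tf_dir="$1" name
  shift
  for name in "$@"; do
    # -s silences "no such file" noise if the directory has no *.tf files.
    if ! grep -qsE "^[[:space:]]*variable[[:space:]]+\"${name}\"" "${tf_dir}"/*.tf; then
      printf '%s\n' "${name}"
    fi
  done
}
```

Run against the names that appear as `-var` flags in the failed `terraform apply` (for example `undeclared_vars templates/shared_services/cyclecloud/terraform arm_client_id arm_client_secret arm_environment arm_tenant_id arm_use_msi tre_id tre_resource_id`), any name it prints is one that either needs a `variable` block or should not be passed from porter.yaml.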

BiologyGeek commented 4 months ago

> Hi @BiologyGeek, I'm looking into the issue. I believe the problem is these four lines in the porter.yaml file in the cyclecloud folder:
>
> arm_client_id: ${ bundle.credentials.azure_client_id }
> arm_client_secret: ${ bundle.credentials.azure_client_secret }
> arm_tenant_id: ${ bundle.credentials.azure_tenant_id }
> arm_use_msi: ${ bundle.parameters.arm_use_msi }
>
> They need to be removed; then try to build and publish the service.

Thank you @tim-allen-ck!

After removing the mentioned lines from the porter.yaml file in the CycleCloud folder and running the commands, new results appeared as mentioned below.

Question: By running $ make bundle-publish and $ make bundle-register, will it override the impact of the $ make shared_service_bundle BUNDLE=cyclecloud commands that were run previously, or does something need to be manually unregistered in some way?

Result of $ make bundle-build DIR=/workspaces/AzureTRE-Deployment/AzureTRE/templates/shared_services/cyclecloud command:

```bash
@MyCodespaceUsername ➜ /workspaces/AzureTRE-Deployment (main) $ make bundle-build DIR=/workspaces/AzureTRE-Deployment/AzureTRE/templates/shared_services/cyclecloud

»»» 🧩  Building /workspaces/AzureTRE-Deployment/AzureTRE/templates/shared_services/cyclecloud bundle with Porter...

╔══════════════════════════════════════╗
║          Azure TRE Makefile          ║
╚══════════════════════════════════════╝

»»» ✅ Checking pre-reqs...
Checking for Azure CLI...
Loading local environment variables...
Checking for Docker...
Checking for porter...
»»» 🔨 Azure details from logged on user
»»»   • Subscription: My Azure Subscription
»»»   • Tenant:       My Tenant ID

Initializing modules...
Initializing provider plugins...
- Reusing previous version of hashicorp/azurerm from the dependency lock file
- Reusing previous version of hashicorp/random from the dependency lock file
- Using previously-installed hashicorp/azurerm v3.5.0
- Using previously-installed hashicorp/random v3.4.2

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
Success! The configuration is valid.

Runtime image build section isn't specified. Exiting...
Copying porter runtime ===>
Copying mixins ===>
Copying mixin exec ===>
Copying mixin terraform ===>
Copying mixin az ===>
Building invocation image
[+] Building 3.0s (24/24) FINISHED
 => [internal] load build definition from Dockerfile  0.1s
 => => transferring dockerfile: 2.39kB  0.0s
 => [internal] load .dockerignore  0.1s
 => => transferring context: 87B  0.0s
 => resolve image config for docker.io/docker/dockerfile-upstream:1.4.0  0.4s
 => CACHED docker-image://docker.io/docker/dockerfile-upstream:1.4.0@################################  0.0s
 => [internal] load metadata for docker.io/library/debian:bullseye-slim  0.4s
 => [stage-0 1/17] FROM docker.io/library/debian:bullseye-slim@sha256:################################  0.0s
 => [internal] load build context  0.9s
 => => transferring context: 178.61MB  0.9s
 => CACHED [stage-0 2/17] RUN useradd nonroot -m -u 65532 -g 0 -o  0.0s
 => CACHED [stage-0 3/17] RUN rm -f /etc/apt/apt.conf.d/docker-clean; echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache  0.0s
 => CACHED [stage-0 4/17] RUN --mount=type=cache,target=/var/cache/apt --mount=type=cache,target=/var/lib/apt apt-get update && apt-get install -y git --no-install-recomm  0.0s
 => CACHED [stage-0 5/17] RUN --mount=type=cache,target=/var/cache/apt --mount=type=cache,target=/var/lib/apt apt-get update && apt-get install -y wget unzip && wget https:  0.0s
 => CACHED [stage-0 6/17] COPY terraform/ /cnab/app/terraform/  0.0s
 => CACHED [stage-0 7/17] RUN cd /cnab/app/terraform && terraform init -backend=false && rm -fr .terraform/providers && terraform providers mirror /usr/local/share/terrafo  0.0s
 => CACHED [stage-0 8/17] RUN --mount=type=cache,target=/var/cache/apt --mount=type=cache,target=/var/lib/apt apt-get update && apt-get install -y apt-transport-https lsb-re  0.0s
 => CACHED [stage-0 9/17] RUN curl -sL https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor > /etc/apt/trusted.gpg.d/microsoft.asc.gpg  0.0s
 => CACHED [stage-0 10/17] RUN echo "deb [arch=amd64] https://packages.microsoft.com/repos/azure-cli/ $(lsb_release -cs) main" > /etc/apt/sources.list.d/azure-cli.list  0.0s
 => CACHED [stage-0 11/17] RUN --mount=type=cache,target=/var/cache/apt --mount=type=cache,target=/var/lib/apt apt-get update && apt-get install -y --no-install-recommends a  0.0s
 => CACHED [stage-0 12/17] COPY --link . /cnab/app/  0.0s
 => CACHED [stage-0 13/17] RUN rm /cnab/app/porter.yaml  0.0s
 => CACHED [stage-0 14/17] RUN rm -fr /cnab/app/.cnab  0.0s
 => CACHED [stage-0 15/17] COPY --link .cnab /cnab  0.0s
 => CACHED [stage-0 16/17] RUN chgrp -R 0 /cnab && chmod -R g=u /cnab  0.0s
 => CACHED [stage-0 17/17] WORKDIR /cnab/app  0.0s
 => exporting to image  0.0s
 => => exporting layers  0.0s
 => => writing image sha256:################################  0.0s
 => => naming to docker.io/azuretre/tre-shared-service-cyclecloud:porter-################################  0.0s
@MyCodespaceUsername ➜ /workspaces/AzureTRE-Deployment (main) $
```

Result of $ make bundle-publish DIR=/workspaces/AzureTRE-Deployment/AzureTRE/templates/shared_services/cyclecloud command:

```bash
@MyCodespaceUsername ➜ /workspaces/AzureTRE-Deployment (main) $ make bundle-publish DIR=/workspaces/AzureTRE-Deployment/AzureTRE/templates/shared_services/cyclecloud

»»» 🧩  Publishing /workspaces/AzureTRE-Deployment/AzureTRE/templates/shared_services/cyclecloud bundle with Porter...

╔══════════════════════════════════════╗
║          Azure TRE Makefile          ║
╚══════════════════════════════════════╝

»»» ✅ Checking pre-reqs...
Checking for Azure CLI...
Loading local environment variables...
Checking for Docker...
Checking for porter...
»»» 🔨 Azure details from logged on user
»»»   • Subscription: My Azure Subscription
»»»   • Tenant:       My Tenant ID

Login Succeeded
Runtime image build section isn't specified. Exiting...
Pushing bundle image...
The push refers to repository [mytreacr.azurecr.io/tre-shared-service-cyclecloud]
5f70bf18a086: Preparing
d992a0b8684a: Preparing
0e6d440dc846: Preparing
e22890b3ec69: Preparing
b70c91c35aa7: Preparing
5f53b59f1bb1: Preparing
ff8e8c6613ec: Preparing
b25f53bb3c44: Preparing
91020b678af9: Preparing
d9cbfc0cfd7c: Preparing
02108fbe706d: Preparing
e94acffa0726: Preparing
36ca7c5fba56: Preparing
a95fa2f53314: Preparing
d99b6712e471: Preparing
d7d858f4adae: Preparing
123eef91533f: Preparing
d9cbfc0cfd7c: Waiting
02108fbe706d: Waiting
e94acffa0726: Waiting
36ca7c5fba56: Waiting
a95fa2f53314: Waiting
d99b6712e471: Waiting
d7d858f4adae: Waiting
123eef91533f: Waiting
5f53b59f1bb1: Waiting
ff8e8c6613ec: Waiting
b25f53bb3c44: Waiting
91020b678af9: Waiting
b70c91c35aa7: Layer already exists
e22890b3ec69: Layer already exists
d992a0b8684a: Layer already exists
0e6d440dc846: Layer already exists
5f70bf18a086: Layer already exists
5f53b59f1bb1: Layer already exists
ff8e8c6613ec: Layer already exists
b25f53bb3c44: Layer already exists
d9cbfc0cfd7c: Layer already exists
91020b678af9: Layer already exists
02108fbe706d: Layer already exists
e94acffa0726: Layer already exists
36ca7c5fba56: Layer already exists
a95fa2f53314: Layer already exists
d7d858f4adae: Layer already exists
123eef91533f: Layer already exists
d99b6712e471: Layer already exists
porter-################################: digest: sha256:################################ size: 3885
Rewriting CNAB bundle.json...
Starting to copy image mytreacr.azurecr.io/tre-shared-service-cyclecloud@sha256:################################...
Completed image mytreacr.azurecr.io/tre-shared-service-cyclecloud@sha256:################################ copy
Bundle rmytreacr.azurecr.io/tre-shared-service-cyclecloud:v0.5.5 pushed successfully, with digest "sha256:################################"
@MyCodespaceUsername ➜ /workspaces/AzureTRE-Deployment (main) $
```

Result of $ make bundle-register DIR=/workspaces/AzureTRE-Deployment/AzureTRE/templates/shared_services/cyclecloud BUNDLE_TYPE=shared_service command:

```bash
@MyCodespaceUsername ➜ /workspaces/AzureTRE-Deployment (main) $ make bundle-register DIR=/workspaces/AzureTRE-Deployment/AzureTRE/templates/shared_services/cyclecloud BUNDLE_TYPE=shared_service

»»» 🧩  Registering /workspaces/AzureTRE-Deployment/AzureTRE/templates/shared_services/cyclecloud bundle...

╔══════════════════════════════════════╗
║          Azure TRE Makefile          ║
╚══════════════════════════════════════╝

»»» ✅ Checking pre-reqs...
Checking for Azure CLI...
Loading local environment variables...
Checking for Docker...
Checking for porter...
»»» 🔨 Azure details from logged on user
»»»   • Subscription: My Azure Subscription
»»»   • Tenant:       My Tenant ID

Login Succeeded
CLI already signed in
Template with this version already exists
@MyCodespaceUsername ➜ /workspaces/AzureTRE-Deployment (main) $
```

Now, from the Azure portal, by going to the TRE-associated resource group and searching 'cycle', I can see these 5 resources: image

Then, I was able to uninstall the previous Azure CycleCloud shared services using the TRE GUI and then reinstall them.

Based on my understanding from the documentation, I went to: Azure Portal --> My TRE resource group --> The virtual machine associated with CycleCloud (cyclecloud-####) --> From the left blade, click on 'Bastion' --> Then select 'Authentication Type: VM Password' and fill in the Username and VM Password from the values in 'Key Vault', then click 'Connect'. image

This page appeared inside the Bastion window: image

But based on the documentation, I assumed I would see a page like this: image

Question: Could you please guide me on whether some deployment steps were missed or if I went to the wrong path?

tim-allen-ck commented 4 months ago

Hi @BiologyGeek. According to the docs, you'll need to head to the public URL of the CycleCloud instance: https://cyclecloud-{TRE_ID}.{LOCATION}.cloudapp.azure.com/ Docs: https://microsoft.github.io/AzureTRE/v0.16.0/tre-templates/shared-services/cyclecloud/

tim-allen-ck commented 4 months ago

Regarding

Question: By running $ make bundle-publish and $ make bundle-register, will it override the impact of the $ make shared_service_bundle BUNDLE=cyclecloud commands that were run previously, or does something need to be manually unregistered in some way?

It should only override if you've edited the version of the bundle in the porter.yaml
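For reference, a minimal sketch of checking the bundle version before publishing and registering again. It assumes the manifest keeps its version on a single unindented `version:` line; the helper name `bundle_version` is hypothetical and not part of the AzureTRE Makefile.

```bash
#!/usr/bin/env bash
# Print the top-level `version:` value from a Porter manifest.
# Assumption: the version sits on one unindented line, e.g. `version: 0.5.5`.
bundle_version() {
  manifest="$1"
  # Strip the key, keep the first match, and drop any surrounding quotes.
  sed -n 's/^version:[[:space:]]*//p' "$manifest" | head -n 1 | tr -d "\"'"
}

# Example call from the repository root (path taken from the thread):
#   bundle_version AzureTRE/templates/shared_services/cyclecloud/porter.yaml
```

If the printed version matches one that is already registered (the "Template with this version already exists" message above), bumping the version in porter.yaml and repeating bundle-build, bundle-publish and bundle-register should register the new template version.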

BiologyGeek commented 3 months ago

Hi @BiologyGeek According to the docs, you'll need to head to the public url of the cyclecloud instance. https://cyclecloud-{TRE_ID}.{LOCATION}.cloudapp.azure.com/ https://microsoft.github.io/AzureTRE/v0.16.0/tre-templates/shared-services/cyclecloud/

Hi @tim-allen-ck, thanks for your reply!

My source of confusion is about where I should open the URL of the CycleCloud instance because it is not accessible from the public internet. So, I assumed two potential ways:

  1. First, I tried this method: Azure Portal --> My TRE resource group --> The virtual machine associated with CycleCloud (cyclecloud-####) --> From the left blade, click on 'Bastion' --> Then select 'Authentication Type: VM Password' and fill in the Username and VM Password from the values in 'Key Vault', then click 'Connect'.

The result of this method was a black page that shows 'OpenLogic CentOS7.9' in a terminal-like screen, as I shared above. Should I run any specific commands within this terminal window?

  2. Second, I tried this method: My TRE Web Interface --> Base Workspace --> User Resource --> Resources --> Data Science VM (Guacamole) --> Microsoft Edge Browser --> Enter https://cyclecloud-mytre.eastus.cloudapp.azure.com/ in the browser.

The result was:

Hmmm ... can't reach this page

cyclecloud-mytre.eastus.cloudapp.azure.com's DNS address could not be found ...
diagnosing the problem now.

Try running Windows Network Diagnostics.

DNS_PROBE_STARTED

Refresh

And after a few seconds:

Hmmm ... can't reach this page

Check if there is a typo in cyclecloud-mytre.eastus.cloudapp.azure.com.

If spelling is correct, try running Windows Network Diagnostics.

DNS_PROBE_FINISHED_NXDOMAIN

Refresh

image

My question is, which method should work, the first or the second?


Update: I'm still curious about which of the two methods is correct. However, I tried another method, and it helped me view the Azure CycleCloud page in the web browser within the Data Science VM (Guacamole).

Here's what I did: Azure Portal --> My TRE resource group --> The 'Private DNS zone' associated with CycleCloud --> Copied the value of the A record, which is a private IP address --> Entered the copied private IP address into the Microsoft Edge browser running on the Data Science VM (Guacamole) within TRE.

Now I can see the Azure CycleCloud page: image
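To tell a name-resolution problem apart from a connectivity problem, a small check like the following may help. It is only a sketch for a Linux VM inside the TRE network (the Data Science VM in this thread is Windows, where `nslookup <hostname>` gives the equivalent answer), and the function name `check_dns` is made up for illustration.

```bash
#!/usr/bin/env bash
# Report whether a hostname resolves from the machine running the script.
# getent uses the system resolver configuration, like most Linux programs.
check_dns() {
  host="$1"
  if addr=$(getent hosts "$host" | awk '{print $1; exit}') && [ -n "$addr" ]; then
    echo "$host resolves to $addr"
  else
    echo "$host does not resolve (NXDOMAIN or no resolver reachable)"
    return 1
  fi
}

# Example call with the hostname from the thread:
#   check_dns cyclecloud-mytre.eastus.cloudapp.azure.com
```

If the name does not resolve but the private IP address from the A record is reachable, the problem lies in DNS (for example, which zone the record lives in or which resolver the VM uses) and not in routing or firewall rules.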

The current challenge is that when I click the "Validate Credentials" button within the Add Subscription window, an error message appears:

Connection Error: GET https://management.azure.com/subscriptions/###################/providers/Microsoft.Storage/storageAccounts?api-version=2019-04-01 - Remote host terminated the handshake

image

The questions are:

  1. Why can't the web browser on TRE VMs open the Azure CycleCloud page using the URL? How can this be diagnosed and fixed?
  2. Could the error message and handshake termination be because I used a private IP address to reach the Azure CycleCloud panel? Since the IP address has no matching SSL certificate, the web browser shows an SSL warning as well.
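One way to narrow this down is to probe the endpoint from the machine that actually makes the call. The error above is reported for a request to management.azure.com, so the probe would presumably need to run on the CycleCloud VM itself (for example in the Bastion session), not in the browser VM. The sketch below maps a few documented curl exit codes to the stage that failed; the function names are hypothetical.

```bash
#!/usr/bin/env bash
# Translate a few well-known curl exit codes into the stage that failed.
explain_curl_exit() {
  case "$1" in
    0)  echo "ok: DNS, TCP and TLS all succeeded" ;;
    6)  echo "dns: could not resolve host" ;;
    7)  echo "tcp: failed to connect to host" ;;
    28) echo "timeout: operation timed out" ;;
    35) echo "tls: handshake failed" ;;
    60) echo "cert: server certificate could not be verified" ;;
    *)  echo "other: curl exit code $1" ;;
  esac
}

# Probe a URL and report which stage failed. The HTTP status is ignored:
# without --fail, curl exits 0 even on 4xx/5xx responses.
probe_url() {
  url="$1"
  rc=0
  curl --silent --output /dev/null --max-time 15 "$url" || rc=$?
  explain_curl_exit "$rc"
}

# Example call:
#   probe_url https://management.azure.com/
```

A `tls` or `timeout` result for management.azure.com would point at something between the VM and Azure (such as a firewall) cutting the connection. A `cert` result when probing CycleCloud by its private IP address would be the separate, browser-side certificate warning.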

tim-allen-ck commented 3 months ago

Hi @BiologyGeek, I'll have a look at this next week and get back to you.

BiologyGeek commented 3 months ago

Hi @BiologyGeek, I'll have a look at this next week and get back to you.

Thank you @tim-allen-ck!

These two items might be helpful for diagnostics:

This is a screenshot of the 'Network settings' window for the virtual machine associated with Azure CycleCloud within TRE: image

This is a screenshot of the 'Private DNS zone' resource associated with Azure CycleCloud within TRE: image


Update: Allowing ports 22, 111, 2049, 80, and 443 (see https://github.com/microsoft/AzureTRE/issues/2406) for both inbound and outbound traffic, for any protocol, on the Network Security Group named 'nsg-default-rules' didn't resolve those two issues: image

BiologyGeek commented 3 months ago

Team, are there any other tests or scenarios that I can run to make troubleshooting easier? If so, please kindly provide some insights so I can perform the tests and share the results.

tim-allen-ck commented 3 months ago

Hi @BiologyGeek, sorry it's taken a while. I think I've fixed the management API issue: it looks like the firewall needs an extra rule allowing access on ports 80 and 443 to the management.azure.com FQDN.

image

Also, as per the docs, I've been testing this in the admin VM, which I connected to via Bastion, and it seems to work using the private DNS address.

image

BiologyGeek commented 3 months ago

hi @BiologyGeek, sorry it's taken a while. I think I've fixed the management API issue, looks like it needs an extra rule in the firewall to allow 80 and 443 access to the management.azure.com fqdn image

Thank you @tim-allen-ck! How can I replicate this configuration on my end? I went to the following paths but couldn't find the appropriate section to whitelist those two ports for management.azure.com:

  1. Azure Portal --> My TRE resource group --> The 'Firewall Policy' resource associated with TRE (fw-policy-mytre)
  2. Azure Portal --> My TRE resource group --> The 'Firewall' resource associated with TRE (fw-mytre)

Should I edit something in the source code and then run the $ make all command?

tim-allen-ck commented 3 months ago

I added an Application rule to the arc-shared-subnet rule collection in the firewall policy manually. There's some work to edit the template to add these changes and the other fw changes into it. Plus the initial porter.yaml file changes.
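For anyone who prefers the command line to the portal, the manual change could be sketched with the Azure CLI as below. This is an assumption-heavy illustration, not the change that was actually made: the rule name, the rule collection group name and the source address range are placeholders, the command may need the `azure-firewall` CLI extension, and the flags should be checked against `az network firewall policy rule-collection-group collection rule add --help` before use. Only the collection name (arc-shared-subnet), the ports and the target FQDN come from this thread.

```bash
#!/usr/bin/env bash
# Build (and optionally run) an az CLI call that adds an application rule
# allowing HTTP/HTTPS to management.azure.com.
# Placeholders (assumptions): resource group, policy name, rule collection
# group name, source range, and the rule name "allow-azure-management".
add_management_rule() {
  resource_group="$1"
  policy_name="$2"
  rcg_name="$3"
  source_range="$4"
  set -- az network firewall policy rule-collection-group collection rule add \
    --resource-group "$resource_group" \
    --policy-name "$policy_name" \
    --rule-collection-group-name "$rcg_name" \
    --collection-name arc-shared-subnet \
    --name allow-azure-management \
    --rule-type ApplicationRule \
    --protocols Http=80 Https=443 \
    --source-addresses "$source_range" \
    --target-fqdns management.azure.com
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Default: only print the command so it can be reviewed first.
    echo "$*"
  else
    "$@"
  fi
}

# Example (prints the command; set DRY_RUN=0 to execute it):
#   add_management_rule <resource-group> fw-policy-mytre <rule-collection-group> <shared-subnet-cidr>
```

The function defaults to a dry run because a manual firewall change like this may be overwritten the next time the TRE templates are deployed, which is why the template itself still needed updating.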

BiologyGeek commented 2 months ago

I added an Application rule to the arc-shared-subnet rule collection in the firewall policy manually. There's some work to edit the template to add these changes and the other fw changes into it. Plus the initial porter.yaml file changes.

Thank you @tim-allen-ck! It worked! Now Azure CycleCloud can identify resources within the subscription. However, another challenge arose during the Slurm configuration, which is described here: https://github.com/microsoft/AzureTRE/issues/4021 Could you please take a look and provide your insight?

marrobi commented 1 day ago

Fixed in #4050