alan-turing-institute / data-safe-haven

https://data-safe-haven.readthedocs.io
BSD 3-Clause "New" or "Revised" License

error with dsh shm deploy, azureclicrdential #2184

Open mattwestby opened 1 week ago

mattwestby commented 1 week ago

When I run dsh shm deploy to start the initial deployment, I receive the following errors after authenticating with az login:

Are these details correct? [y/n] (y): y
Please authenticate with Azure: run 'az login' using infrastructure administrator credentials.
Error getting account information from Azure CLI.
Could not load list of groups.

I have Owner permissions on the subscription I'm attempting to deploy into, and when I check my az token after az login it looks OK. I'm not sure what the problem is.
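One way to check a token beyond "it looks OK" is to look at which resource it was issued for. The sketch below is not part of Data Safe Haven. It assumes the access token is a JWT whose payload carries an aud (audience) claim, decodes that payload without verifying the signature, and reports whether the audience is Microsoft Graph. The helper names are hypothetical, and this is for local debugging only.

```python
import base64
import json

# Audience values that identify Microsoft Graph: its URL and its
# well-known application ID.
GRAPH_AUDIENCES = {
    "https://graph.microsoft.com",
    "00000003-0000-0000-c000-000000000046",
}


def decode_jwt_claims(token: str) -> dict:
    """Return the claims in a JWT payload WITHOUT verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    # base64url segments omit padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def is_graph_token(token: str) -> bool:
    """True if the token's audience claim points at Microsoft Graph."""
    audience = str(decode_jwt_claims(token).get("aud", ""))
    return audience.rstrip("/") in GRAPH_AUDIENCES
```

A token issued for Azure Resource Manager can look perfectly valid and still be unusable for Graph calls, which is consistent with the symptoms described in this thread.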

craddm commented 1 week ago

Hi Matt,

"Could not load list of groups" is generated during interactions with the Graph API, which suggests to me that the issue may not be with your Azure credentials per se (that is, not the infrastructure admin account you log in to using az login) but with your Entra ID administrator credentials. We have one account for infrastructure administration and a separate global administrator account on the Entra ID tenant that will be associated with the SRE/SHM.

It's possible that the code path producing this error is not quite correct and is not giving you enough information to identify the source of the problem. I'll look into that.

Cheers, Matt

mattwestby commented 1 week ago

Hi Matt,

Ah OK, we do have a separate Entra ID to manage the SRE users, but I've not authenticated with that during this stage. Did I need to authenticate with the global admin in that Entra ID as part of this stage? Thanks Matt

craddm commented 1 week ago

I haven't been able to replicate your problem as yet, but it does seem to happen before you need to be authenticated with the Graph API, contrary to what I suggested above. So I think we can forget that.

Where this error seems to come up is when retrieving the admin group ID from the admin group name you provide in the context. I would check that the Azure account you are logged in with is a member of an appropriate security group and has sufficient permissions (at least Contributor) (sorry, I reread your original report and can see you have these!), and that the name of the security group is correct.
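To illustrate the kind of lookup being described, here is a minimal sketch of resolving a group ID from a display name with the Microsoft Graph "list groups" endpoint. This is not Data Safe Haven's actual code: the function names are hypothetical, and the sketch only builds the request URL and interprets a response that has already been fetched with a Graph-scoped token.

```python
from urllib.parse import quote

GRAPH_GROUPS_URL = "https://graph.microsoft.com/v1.0/groups"


def group_lookup_url(group_name: str) -> str:
    """Build a Graph URL that filters groups by exact display name."""
    # OData string literals escape a single quote by doubling it.
    escaped = group_name.replace("'", "''")
    return GRAPH_GROUPS_URL + "?$filter=" + quote(f"displayName eq '{escaped}'")


def extract_group_id(response: dict, group_name: str) -> str:
    """Pick the ID of the single group whose displayName matches exactly."""
    matches = [
        group for group in response.get("value", [])
        if group.get("displayName") == group_name
    ]
    if not matches:
        raise LookupError(f"no group named {group_name!r} in this tenant")
    if len(matches) > 1:
        raise LookupError(f"more than one group named {group_name!r}")
    return matches[0]["id"]
```

The lookup only sees groups in the tenant that issued the token, so a group that exists in a different Entra ID tenant will come back as "not found" even when the name is spelled correctly.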

mattwestby commented 1 week ago

Hi Matt,

So the admin group I provided in the context is an admin group that is in the “SRE” Entra ID and not the same Entra ID as the infra deployment. I have Owner on the infra deployment subscription. Is that --admin-group-name meant to be in the same Entra ID as the infra deployment tenant?

Thanks

Matt

From: Matt Craddock @.> Sent: 09 September 2024 15:28 To: alan-turing-institute/data-safe-haven @.> Cc: Matthew Westby (staff) @.>; Manual @.> Subject: Re: [alan-turing-institute/data-safe-haven] error with dsh shm deploy, azureclicrdential (Issue #2184)


This message and any attachment are intended solely for the addressee and may contain confidential information. If you have received this message in error, please contact the sender and delete the email and attachment. Any views or opinions expressed by the author of this email do not necessarily reflect the views of the University of Nottingham. Email communications with the University of Nottingham may be monitored where permitted by law.

craddm commented 1 week ago

> So the admin group I provided in the context is an admin group that is in the “SRE” Entra ID and not the same Entra ID as the infra deployment. I have Owner on the infra deployment subscription. Is that --admin-group-name meant to be in the same Entra ID as the infra deployment tenant?

Hi Matt,

Yes, the admin group should be in the Entra ID of the infra deployment tenant. We assign roles to groups and add users to those groups, rather than assigning roles directly to users.

I think the documentation could definitely be clearer on this.

Cheers, Matt

mattwestby commented 1 week ago

Hi Matt,

Thanks for this. What permissions are then required in the infra Entra ID? I still get the error even though I’ve specified a group name in that Entra ID.

And the --entra-tenant-id switch on the dsh shm deploy command: is this the tenant ID for the “SRE” users Entra ID?

Thanks

Matt


craddm commented 1 week ago

> What permissions are then required in the infra Entra ID? I still get the error even though I’ve specified a group name in that Entra ID.

Our admin group is set up with the Owner and Billing Reader roles by default; Contributor should be enough. And then it's just our admins that are members of that group.

> And the --entra-tenant-id switch on the dsh shm deploy command: is this the tenant ID for the “SRE” users Entra ID?

Yes, that's the SRE users Entra ID.

Cheers, Matt

mattwestby commented 1 week ago

Hi Matt,

I’m still not quite clear on what the Entra group requires. Are you saying Owner and Billing Reader, or Contributor, on the subscription that the SHM will be deployed into?

Thanks

Matt


craddm commented 1 week ago

Apologies for the lack of clarity. It should be sufficient to give the admin group (and note that this is a security group) the Contributor role. If that fails, try giving it Owner. We know that the admin group having Owner is enough on our end, so if that still fails, the problem is something else. I only mentioned Billing Reader because it's another role that our group has; it shouldn't be necessary, but if you want to fully replicate the roles our group has then it's handy to know it has that too. And yes, these roles are for the subscription that the SRE will be deployed into.
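The role check described above can be sketched as follows. It assumes the JSON printed by az role assignment list --assignee <group-object-id> --scope /subscriptions/<subscription-id> has been captured as a string, and that each entry carries principalId, roleDefinitionName and scope fields. The function name is hypothetical and this is not how Data Safe Haven itself validates roles.

```python
import json

# Roles that this thread reports as sufficient for the admin group.
SUFFICIENT_ROLES = {"Contributor", "Owner"}


def sufficient_roles(assignments_json: str, group_id: str, subscription_id: str) -> set:
    """Return the sufficient roles the group holds at subscription scope."""
    wanted_scope = f"/subscriptions/{subscription_id}".lower()
    held = set()
    for assignment in json.loads(assignments_json):
        same_principal = assignment.get("principalId") == group_id
        # Only direct subscription-level assignments are considered here;
        # roles inherited from a management group are ignored by this sketch.
        same_scope = assignment.get("scope", "").lower() == wanted_scope
        role = assignment.get("roleDefinitionName")
        if same_principal and same_scope and role in SUFFICIENT_ROLES:
            held.add(role)
    return held
```

An empty result means the group holds neither Contributor nor Owner directly on that subscription. A non-empty result rules out role assignment as the cause, which matches the outcome later in this thread.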

mattwestby commented 1 week ago

Hi Matt

I assigned the Owner role to that group at the subscription level but I still get this error:

@.***

So it's something with the auth?

Thanks

Matt


craddm commented 1 week ago

Hi Matt,

The error seems to be missing from your last message. Is it the same error as before, about the groups?

mattwestby commented 1 week ago

Hi Matt,

Yea,

@.***

Thanks Matt


mattwestby commented 1 week ago

Hi Matt

I’ve resolved it! I had to log in to the Azure CLI with

az login --scope https://graph.microsoft.com/.default

This allowed me to get the token correctly.
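For anyone scripting this check, the sketch below builds the login command reported above and a matching az account get-access-token call, then parses the accessToken field from the CLI's JSON output. The function names are hypothetical. It also assumes az is on the PATH and can be started without a shell, which may not hold on Windows, where the CLI is installed as az.cmd.

```python
import json
import subprocess

GRAPH_SCOPE = "https://graph.microsoft.com/.default"


def login_command(scope: str = GRAPH_SCOPE) -> list:
    """Arguments for an interactive login that requests the given scope."""
    return ["az", "login", "--scope", scope]


def token_command(scope: str = GRAPH_SCOPE) -> list:
    """Arguments for requesting an access token for the given scope."""
    return ["az", "account", "get-access-token", "--scope", scope, "--output", "json"]


def parse_token_output(stdout: str) -> str:
    """Extract the access token from the CLI's JSON output."""
    token = json.loads(stdout).get("accessToken")
    if not token:
        raise ValueError("Azure CLI output did not contain an accessToken")
    return token


def fetch_graph_token() -> str:
    """Run the CLI; raises CalledProcessError if no token can be acquired."""
    result = subprocess.run(token_command(), capture_output=True, text=True, check=True)
    return parse_token_output(result.stdout)
```

If fetch_graph_token() raises, the standard error from the CLI is available on the exception and, as in this thread, may include the suggestion to rerun az login with the Graph scope.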

Thanks

Matt


craddm commented 6 days ago

Some additional information from Matt by email. Matt was not being prompted for MFA when using az login, and was unable to acquire an access token.

This was manifesting as an error during SHM deployment at the step where it attempts to match the admin group name provided by the user to its respective group id on Azure.

I've been unable to replicate these problems locally. I tried giving incorrect admin group names and/or incorrect tenant IDs in the configs, but could not trigger the error. I also tried running the deployment from Windows and observed no problems.

For Matt, attempting to get an access token through the Azure CLI yielded a prompt to log in using az login --scope https://graph.microsoft.com/.default, and that allowed him to successfully get an access token with sufficient scope over the Graph API.

I'm wondering if there is some kind of default scope applied to the login that can be set in the portal through a policy of some sort, and whether that differs between our tenant and Matt's. I can't check this out as I do not have sufficient rights on our dev subscription to investigate how its Entra ID tenant is set up.

mattwestby commented 6 days ago

Thanks Matt,

I just wanted to check another point about the deployment of the SHM. I've run dsh shm deploy and it's passed the point of checking the domain verification:

@.***

And it's deployed the following:

@.***

Is that the SHM deployed? It doesn’t produce any further output after verifying the domain.

Thanks Matt


craddm commented 6 days ago

Hi Matt,

Yes, the last two lines of the output from deploying the SHM should be something like:

Ensured that DNS TXT record @ exists in zone green.develop.turingsafehaven.ac.uk.                                                          
Verified that domain green.develop.turingsafehaven.ac.uk is delegated to Azure. 

After that, the SHM is deployed and you can move on to creating an SRE configuration and then deploying the SRE.

By the way, I think that when you send responses and output by email, much of the content is being censored by your institute, because all I'm seeing is @.***, not any of the output.

If you intend to paste some output from the console into your replies, I'd suggest doing it directly on GitHub instead if at all possible.

mattwestby commented 6 days ago

Hi Matt,

Ah OK, thanks. I wasn’t sure whether there was more for the SHM.

Could you send me an example SRE config? I’m having problems with the validation of what I’m entering.

Thanks

Matt


mattwestby commented 6 days ago

I've got to the point now where I'm starting to deploy the SRE, but I now get this error:

sre-deploy-error