Closed pazdedav closed 1 year ago
Hey @pazdedav,
Thanks for raising.
Couple of questions if you don't mind
c0602d84-4705-405c-a897-0c4718407f9b
is the user trying to create the VNET and would be the app/workload owner for example?Microsoft.Network/ddosProtectionPlans/read
- so they can see it at least and create support tickets against it if required?Microsoft.Network/ddosProtectionPlans/join/action
- so they can perform the join action for their VNETs?Let me know your thoughts
Cheers
Jack
Hi @jtracey93:
objectID
from the screenshot/error message was the user owning the LZ with standard Contributor role in that subscription (but no permissions on the Connectivity sub).Thanks @pazdedav
On the DINE side as it's a post deployment operation rather than part of the initial create operations (which happens as the user making the request) then the managed identity associated to the DINE policy would be the only identity that needs permissions.
You could also do the same for modify as a remediation task as they work the same as DINE. But the issue here is that modify is trying to modify the PUT request as the user makes it which means it's seen as the user making the PUT rather than a managed identity (as would be used if it was a remediation task).
Hence my suggestion around DINE as would always happen post VNET creation as the managed identity which could be granted permissions.
On the custom RBAC though I think this could be easily done with groups to assign all LZ owners too. One would likely already exist for potential communications around the companies azure platform for example.
Thoughts with these pieces of additional information?
Thanks for additional input.
I misunderstood you first, but I got it now. You would like the PG to create a new built-in DINE policy that would enable a centralized DDoS plan, so we could remove the assignment of current Modify policy that is blocking users from even creating VNets. I wonder if we are dependent on the PG here or if this could be done by the ALZ team in the same way that all other ALZ policies were created ;)
Our subscription vending process is creating new AAD groups and assigns the requestor as owner (to manage group membership), so we would either need to do an extra role assignment of that custom role for the plan as part of the workflow or we would need to add this group to some top-level group that would have that role assignment done in advance. We (the platform team) don't know our customers/users in advance, unless we want to create some dynamic AAD group for essentially all (Azure) users in the company.
I thought for a while that alternatively, we could have an event-driven automation that would add each new VNet to the plan asynchronously (not by using a policy but with Event Grid + Functions), but this wouldn't work, since every change of VNet properties done by the user would still require the join
permission on that plan. Just thinking about it, wouldn't this requirement mean that the DINE policy alone won't work? Even if it adds the DDoS plan property after the initial request passes, any subsequent VNet modification by a user would fail due to the lack of that join
permission?
The DINE policy would run all its activities as its own identity so as long as that identity alone is granted the join
action on the DDoS plan and has permissions on the VNETs we are all good. Nothing would run as the user with DINE.
And you are right, we could investigate doing it ourselves in ALZ, but we try to avoid where possible as built-in makes life easier for all involved (us, customers, etc.) - (I hinted towards this in my first reply π)
@pazdedav let me know your thoughts and if we should look into the DINE route in your opinion?
Thank you @jtracey93. I am however still struggling to see, how the "DINE route" alone would work without assigning a custom role with join
permissions to users.
This is how I envision this workflow would work (or not):
Microsoft.Network/ddosProtectionPlans/join/action
action permission. as documented here. The user trying to make that change does not have that permission and the DINE policy would not be applied (since the DDoS property already exist on the VNet resource).Perhaps I misunderstood that requirement from the DDoS documentation, but I guess we would know for sure once we have that DINE policy in a test environment :)
Another interesting question is, what would that DINE policy deploy? Both the DDoS plan and the VNet would already exist.
Ah that is a very important point i glazed over @pazdedav. You are indeed correct about the note of:
Therefore, DINE would work initially but subsequent updates to the VNET would be an issue I suspect from that note (something to test though)
It'll all depend on how the VNET RP handles a partial put of just the DDoS plan property.
As you say one to test. Do you fancy testing this out and reporting back? (fine if not, but thought you may be interested?)
I could try to test a scenario (without having a DINE policy in place), where I would have two users:
What do you think about my last comment, @jtracey93 ?
Another interesting question is, what would that DINE policy deploy? Both the DDoS plan and the VNet would already exist.
Even if we decide to create a new DINE policy, it is not clear to me what resource it would deploy, since both components would be in place already.
@pazdedav If you can test my suggestion 2 (that I shared above here: https://github.com/Azure/Enterprise-Scale/issues/1053#issuecomment-1254131060) around RBAC that would be great.
At least thats a workaround validated π
As for the DINE stuff it will be the VNET that we would do a PUT on to add the DDoS plan to the VNET. But this would also need testing. Hence my comment around the VNET RP and wondering how it would behave. We can test that next π
@pazdedav How did the testing go?
Hi @jtracey93
I was able to test this policy today and here are my findings:
So, we can:
For customers, that have a central team managing the VNETs adding DDOS Plan might not be an issue, however for customer going with subscription democratisation, they would not want to give this access to application users.
Thanks @aparnabhat-gh, for the work and detail here, much appreciated.
We discussed as a team on our weekly policy scrum and have some things we'd like to investigate before committing to a direction forward here and/or reach out to PGs for DDoS. The action plan is listed below:
We will update this issue with our findings and next steps
Mainly due to this known issue/behaviour on the vNet resource:
I personally suspect a DINE policy may not be possible here due to it also amending some additional properties, unless we create a complex policy to union a lot of the properties to ensure they aren't overridden. Hence the need for testing heavily.
cc: @paulgrimley & @Springstone for as we discussed for creation of a user story and assignment for testing etc.
Story created for policy refresh planning AB#30269
We have confirmed that using DINE is destructive and potentially dangerous to use to apply changes to a virtual network, (e.g. enabling only a DDOS plan on a virtual network, will reset all other parameters incl. address space, subnets, etc). We have escalated the issue to PG, but have no updates as yet.
We've tried multiple paths to ask engineering to improve this experience and provide additional features/mechanisms to enable the scenario you've highlighted, but unfortunately our feedback is not being prioritized. We'll keep up the pressure, but in the short term we are dependent on engineering enhancements. You can help by raising support tickets for the same issue, having customer evidence helps move things along.
As ALZ team cannot improve the experience directly, will be closing the issue. Feel free to reopen or open a new one should you have further input.
Describe the bug When a workload owner of an Online LZ subscription wants to create a new Virtual Network, they get an error similar to this:
I assume this error is caused by the fact that ALZ reference implementation (accelerators) is using a centralized DDoS protection plan deployed to the Connectivity subscription, where a regular workload owner doesn't have any access. The requirement for such permission is documented here.
How can this issue be addressed? The obvious option is to grant
Microsoft.Network/ddosProtectionPlans/join/action
permission (e.g., by creating a custom role) for the central DDoS Plan to all workload owners / users but this might not scale well, since people will come and go.Screenshots