Open MatthewWilkes opened 2 weeks ago
Hello @chinadragon0515, thanks for your message. I've updated the diagram with some extra information:
To answer your questions:
debug-mw-1-vnet virtual network, which is completely isolated from the application environments. 10.0.0.4 is the IP for the debug-mw-1 virtual machine in that vnet. Those are the items listed in the tcpdump capture.
snet-infrastructure subnet of the application vnet. We can access this directly from a virtual machine in the snet-management subnet of the application vnet. Confusingly, this happens to also have the IP 10.0.0.4, but all the packet dumps I shared above reference debug-mw-1, not this VM.
This issue is a: (mark with an x)
Issue description
On two of our six ACA environments that are offered through a private link service, we had simultaneous elevated TCP connection failures.
We explored this issue with Azure support. We initially thought it was a FrontDoor failure, but we have since learnt that it was a failure related to the combination of private link and ACA apps.
Azure support was unable to resolve this problem. We have mitigated it by redeploying these environments in their entirety, but we do not know what the original cause was.
Steps to reproduce
We are not aware of any way to reproduce this. However, we have two environments that exhibit this behaviour and have not yet been removed. We will remove these towards the end of next week. If you would like a chance to examine them before that, please contact me.
Expected behavior
Connections work through private link reliably. An example connection session is:
Actual behavior
SYN packets were not answered with SYN/ACK; intermittently they were answered with RST/ACK instead.
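For anyone who wants to observe the same symptom from a client VM without a packet capture, here is a minimal sketch in Python. It attempts TCP handshakes repeatedly and tallies whether each one connected, was reset, or went unanswered. The function names `probe_once` and `probe_many` are hypothetical helpers written for this illustration, and the host and port you pass are placeholders for your own private endpoint. The classification is inferred from client-side socket errors, so it only approximates what tcpdump would show.

```python
import socket
from collections import Counter


def probe_once(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt one TCP handshake and classify the outcome (hypothetical helper)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"       # the SYN was answered with SYN/ACK
    except ConnectionRefusedError:
        return "reset"               # the SYN was answered with RST/ACK
    except socket.timeout:
        return "timeout"             # no answer to the SYN within the timeout
    except OSError as exc:
        return f"error:{exc.errno}"  # anything else, e.g. no route to host


def probe_many(host: str, port: int, attempts: int = 20,
               timeout: float = 3.0) -> Counter:
    """Repeat the probe and tally the outcomes (hypothetical helper)."""
    return Counter(probe_once(host, port, timeout) for _ in range(attempts))
```

Calling `probe_many(host, port)` against a healthy endpoint should report only `connected`. A mix of `timeout` and `reset` entries would be consistent with the behaviour described above.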
Additional context
Please see this diagram of the connections that work and those that do not. The green lines indicate good connectivity and the red lines poor connectivity. The VM at the bottom of the diagram is within the same vnet as the ACA app, so it accesses the app directly. The VM at the top is in a different vnet.
Replacing the private link service and the private endpoint was not sufficient to restore service, which implies that this is not a private link issue, but an issue with the combination of the ACA environment and the private link service.
Only upon replacing the two ACA environments did the problem end.