Open heaths opened 1 year ago
/cc @benbp @joshlove-msft @christothes @jsquire
@heaths, to "fix" this, would you then do something special if you detected you were using the China cloud (i.e., tune retry timeouts)?
I honestly don't know if that would help. Maybe mitigate some, but it's an arms race at that point. If we can run agents closer to their cloud, that would be best.
@heaths, is that the approach you were outlining here? (it mentions an offline discussion)
Discussing this offline, we could expose some variable in cloud-specific files
(also, agreed that moving our tests to run inside the cloud or nearer is the right option)
@benbp is the mastermind here. I might've been alluding to some way to say "use a longer timeout", but Ben was going to see if we could run agents closer to their cloud to mitigate the high latency that is likely causing this.
@heaths We could spin up a southeastasia agent pool. I would need to do a bit of YAML plumbing first to get our cloud configs to target agent VM regions. CC @mikeharder
Any way we could test that it would make a meaningful difference before doing all that work? Could you or I (happy to help) make some changes in a PR that would force it and just test those against China's cloud (remove the others, for example)?
@heaths the easiest way to test would be:
Because our test pipeline agents run from somewhere in the US and due, perhaps, to some other issues, many of our tests across libraries and languages are timing out against the China cloud, e.g., Azure/azure-sdk-for-net#34641 (manual test run, but indicative of tests-weekly runs).
Discussing this offline, we could expose some variable in cloud-specific files, e.g., https://github.com/Azure/azure-sdk-tools/blob/main/eng/common/TestResources/clouds/AzureChinaCloud.json, that gets plumbed through to clients.
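To make the "variable in cloud-specific files" idea concrete, here is a minimal sketch of how a per-cloud setting might be read and turned into a client timeout. Everything named here is an assumption for illustration only: the `TestTimeoutMultiplier` key, the `Cloud` key, the 100-second baseline, and the `resolve_timeout` helper are all hypothetical and do not reflect the actual contents of AzureChinaCloud.json or any existing pipeline plumbing.

```python
import json

# Hypothetical baseline timeout; a real test framework would supply its own.
DEFAULT_TIMEOUT_SECONDS = 100.0


def resolve_timeout(cloud_config_text, base_seconds=DEFAULT_TIMEOUT_SECONDS):
    """Return the timeout for a cloud, scaled by an optional per-cloud multiplier.

    "TestTimeoutMultiplier" is an invented key used only for this sketch.
    Clouds that do not define it keep the baseline timeout.
    """
    config = json.loads(cloud_config_text)
    multiplier = float(config.get("TestTimeoutMultiplier", 1.0))
    if multiplier <= 0:
        raise ValueError("TestTimeoutMultiplier must be positive")
    return base_seconds * multiplier


# Illustrative config fragments, not copies of the real cloud files.
china_config = '{"Cloud": "AzureChinaCloud", "TestTimeoutMultiplier": 3}'
public_config = '{"Cloud": "AzureCloud"}'

china_timeout = resolve_timeout(china_config)
public_timeout = resolve_timeout(public_config)
print(china_timeout, public_timeout)
```

Keeping the multiplier optional means only the high-latency cloud file would need to change; as noted in the thread, this would at best mitigate the timeouts, and running agents closer to the target cloud remains the preferred fix.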