Open onionhammer opened 3 months ago
Hi @onionhammer
Thanks for reporting this issue. I am investigating.
May I ask which tool you are using in this picture?
Thanks!
@Greedygre that's an application insights availability test hitting an ASP.NET healthcheck
Hi @onionhammer, we tried but were not able to repro this issue with the same setup as yours. However, we did find the ACA environment you used for testing, and the network between web and one of the silo pods appears to be unhealthy. The connection between web and silo should in theory be a long-lived connection, but we see it being interrupted many times. This looks like a special case rather than a common issue in the Consumption workload profile. However, we will need more information (probably a network capture) to root-cause this.
Can you try to repro this issue again? If you can, please keep the environment and app, then send an email to "acasupport at microsoft dot com" and we will investigate ASAP.
Hi @howang-ms
I've reproduced the issue and emailed acasupport - as you can see, the issue doesn't start happening immediately, but can take several hours to show up.
Hi @howang-ms any update?
This issue is a: (mark with an x)
Issue description
I have observed that when deploying Orleans to a cluster of 2 or more silos, client apps have a roughly 50/50 shot of being able to communicate with the target node.
Workload profiles:
Non-workload profiles
This seems to be related to https://github.com/microsoft/azure-container-apps/issues/721
Steps to reproduce
Expected behavior
Orleans should work by default.
Actual behavior
Orleans only works 50% of the time. The rest of the time, grain/actor invocations time out and clients are unable to communicate with silos.
Screenshots
In the above screenshot:
Additional context
After running the repro for nearly a full day
Workload profiles: ![image](https://github.com/microsoft/azure-container-apps/assets/969938/39ca69d0-da85-4086-b005-8ad5d5d96789)
No workload profiles: ![image](https://github.com/microsoft/azure-container-apps/assets/969938/0aeb63fa-2eb3-4068-af5c-fc760bf5690e)