dotnet / orleans

Cloud Native application framework for .NET
https://docs.microsoft.com/dotnet/orleans
MIT License

Timeout in simple grain call #6529

Closed lindblom closed 4 years ago

lindblom commented 4 years ago

I get this message and I do not understand why. The method I'm calling deserializes some JSON, updates its state, and sends an event off to Event Hubs. This works, and the system handles about 80 of these calls per second. But then it just stops.

[Image: screenshot of the timeout message, not preserved]

This is part of an import process where the import comes in as messages over a stream. These events are picked off the stream and saved to grain state, where a timer works through them. This grain then routes the messages to the correct grains in batches of 10.
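
To make the flow above concrete, here is a minimal Python sketch of the buffering and batching step. It is not Orleans code and not the application's actual implementation: `ImportBuffer`, `on_stream_event`, `on_timer_tick`, and the `grain-N` keys are hypothetical names, and reading "batches of 10" as "at most 10 buffered events per timer tick" is an assumption.

```python
from collections import defaultdict, deque

BATCH_SIZE = 10  # batch size mentioned in the post


class ImportBuffer:
    """Hypothetical stand-in for the grain that buffers stream events."""

    def __init__(self):
        self.pending = deque()  # events saved from the stream, oldest first

    def on_stream_event(self, target_key, payload):
        # Called for each message picked off the stream.
        self.pending.append((target_key, payload))

    def on_timer_tick(self):
        # Take at most BATCH_SIZE events and group them by destination key.
        batch = defaultdict(list)
        for _ in range(min(BATCH_SIZE, len(self.pending))):
            target_key, payload = self.pending.popleft()
            batch[target_key].append(payload)
        return dict(batch)


buffer = ImportBuffer()
for i in range(25):
    buffer.on_stream_event(f"grain-{i % 3}", {"seq": i})

first = buffer.on_timer_tick()  # one tick drains 10 of the 25 events
```

Each tick would then send every list in `first` to the grain identified by its key, leaving the remaining events buffered for later ticks.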

sergeybykov commented 4 years ago

Sorry about the delayed response. Have you figured it out already? Is there, or was there, maybe a call cycle that creates a deadlock?
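
For illustration, a call cycle of this kind can be modelled in a few lines of Python. This is a toy model, not Orleans code: it assumes each actor handles one call at a time and does not allow reentrancy, and the `Actor` class, the `demo` function, and the 0.2 second timeout are all hypothetical.

```python
import asyncio


class Actor:
    """Toy non-reentrant actor: handles one call at a time."""

    def __init__(self, name):
        self.name = name
        self.peer = None
        self._turn = asyncio.Lock()  # asyncio.Lock is not reentrant

    async def call(self, hops):
        async with self._turn:
            if hops > 0:
                # Call the peer while still holding our own turn.
                await self.peer.call(hops - 1)
            return self.name


async def demo(hops, timeout=0.2):
    a, b = Actor("A"), Actor("B")
    a.peer, b.peer = b, a
    try:
        return await asyncio.wait_for(a.call(hops), timeout)
    except asyncio.TimeoutError:
        return "timeout"


one_way = asyncio.run(demo(hops=1))  # A -> B
cycle = asyncio.run(demo(hops=2))    # A -> B -> A
```

With `hops=1` the call is one-way (A calls B) and completes. With `hops=2`, B calls back into A while A is still waiting on B, so neither can make progress and the caller only ever sees a timeout.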

lindblom commented 4 years ago

It could have been. I gave up on using Orleans after spending way too many days and nights trying to get it stable and not lose events or data. Since we had launched, we had no more time to spend on this, and it was hard because I couldn't see anything wrong with the application code. My colleague took it over from me and rewrote the application in a way that he thought had fixed it, but when we put it into the prod environment it also locked up and lost data. Besides the lost data and hangs, we also burned through a hosting bill of close to 500 USD, of which 300 was Azure Storage requests for all the grain data we generated while running the import into the system.

In the end I converted most of the application to an ASP.NET Core + Hangfire and MongoDB solution, and I will convert the rest of the features as they are required. This new solution is much cheaper to run, and it's also much faster than what we had before in Orleans. But I do have to think about data concurrency in a different way, which was nice not to have to think about while using Orleans.

All in all, I'm a bit sad that it didn't work out. I do like what you are doing a lot, and I like the idea of having this auto-healing and load-balanced virtual actor framework.

sergeybykov commented 4 years ago

Sorry to hear that. Setting up streaming event processing correctly can be tricky and more complicated than we would like it to be. Let us know if you get back to it in the future. We might be able to help.