microsoft / service-fabric

Service Fabric is a distributed systems platform for packaging, deploying, and managing stateless and stateful distributed applications and containers at large scale.
https://docs.microsoft.com/en-us/azure/service-fabric/
MIT License

Suggestion: Let manual scale-in operation consider replica state #771

Open esbenbach opened 6 years ago

esbenbach commented 6 years ago

In a few instances I have had a stateless service where some of the replicas were in a "bad" state (either with a warning or stuck InBuild or some such) and other replicas were in a "Ready - OK" state, and I had to do a manual scale-in. In almost all of those cases the "Ready - OK" replicas were removed, instead of the ones in the warning/InBuild state.

I am aware that I can remove the bad replicas specifically, and I am also aware that what I am doing is a bit strange (I don't really expect to do this in a production cluster, but you never know). However, it would be nice if the scale-in could look at the replica state and start by removing the replicas that are in bad shape.
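
For reference, the "remove the bad replicas specifically" workaround can be scripted against the cluster's HTTP gateway. The following is a minimal sketch in Python, assuming the public Service Fabric REST API paths for listing partitions/replicas and removing a replica; the cluster address, service id, api-version, and the use of an unsecured endpoint are placeholders to adapt to your cluster.

```python
# Minimal sketch: remove stateless instances that are not "Ready"/"Ok" via the
# Service Fabric HTTP gateway. Paths/field names are assumed from the public REST
# API; cluster address, service id, and api-version are placeholders.
import requests

CLUSTER = "http://localhost:19080"          # HTTP gateway (assumption: unsecured cluster)
SERVICE_ID = "MyApp~MyStatelessService"     # hierarchical service name, '~' instead of '/'
API = {"api-version": "6.0"}

def get_json(path):
    resp = requests.get(f"{CLUSTER}{path}", params=API, timeout=30)
    resp.raise_for_status()
    return resp.json()

# 1. Enumerate the partitions of the service, then the instances of each partition.
for partition in get_json(f"/Services/{SERVICE_ID}/$/GetPartitions")["Items"]:
    partition_id = partition["PartitionInformation"]["Id"]
    for replica in get_json(f"/Partitions/{partition_id}/$/GetReplicas")["Items"]:
        # 2. Skip the healthy instances; keep only the ones stuck InBuild / in Warning.
        if replica.get("ReplicaStatus") == "Ready" and replica.get("HealthState") == "Ok":
            continue
        instance_id = replica.get("InstanceId") or replica.get("ReplicaId")
        node_name = replica["NodeName"]

        # 3. Remove that specific instance so a later scale-in does not have to pick
        #    a healthy one. ForceRemove skips the graceful close path, which matters
        #    when the instance is not responding to the close call.
        requests.post(
            f"{CLUSTER}/Nodes/{node_name}/$/GetPartitions/{partition_id}"
            f"/$/GetReplicas/{instance_id}/$/Delete",
            params={**API, "ForceRemove": "true"},
            timeout=30,
        ).raise_for_status()
        print(f"Removed instance {instance_id} ({replica.get('ReplicaStatus')}) on {node_name}")
```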

masnider commented 6 years ago

What's probably happening is that the close/shutdown is getting delivered to each of the service instances, but the ones that are unhealthy aren't responding, so the good ones are the ones that end up getting shut down (because they're working). This seems like a reasonable enhancement; however, if the services aren't responding to APIs or are stuck in some other way, then it's unlikely to work. Do you know why they were unhealthy in this case?

esbenbach commented 6 years ago

It was failing after service registration due to a DB check causing an exception, so it kept crashing during startup.

Sounds plausible that it was not responding and therefore not "killed".

masnider commented 6 years ago

So this seems reasonable; however, it will be quite tricky to implement since there's a chicken-and-egg problem: the services you want to shut down are often already not responding to API calls, are stuck, or are already crashing, making it hard to shut them down cleanly. We'll think about whether there is some way to do this correctly in the future, but I have no design or timeframe for when or how this would work at this time.

abhishekram commented 4 years ago

@esbenbach, can you please clarify what happens to the scale-in in this scenario? If I'm understanding the issue correctly, I would expect it to get blocked. If so, we have made some improvements in Service Fabric 6.5 CU3 to address this: after a certain timeout period, Service Fabric will forcibly take down a stateless service replica that is blocking its close API, which allows the scale-in to make progress. Assuming this is the problem you had been seeing previously, could you please check whether Service Fabric 6.5 CU3 (or a later release) addresses the problem as you would expect? Thanks!
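
To illustrate the behaviour described above (a conceptual sketch in plain Python asyncio, not the actual platform implementation): a scale-in that waits indefinitely for a graceful close hangs on a stuck replica, whereas a bounded wait followed by a forced removal lets it make progress. The replica objects and timeout value here are made up for the example.

```python
# Conceptual illustration only: graceful close with a timeout, falling back to a
# forced removal when the replica never answers the close call.
import asyncio

class FakeReplica:
    def __init__(self, name, responsive):
        self.name = name
        self.responsive = responsive  # a stuck/crashing replica never answers close

    async def close(self):
        if not self.responsive:
            await asyncio.Event().wait()  # simulate a close call that never returns
        await asyncio.sleep(0.1)          # simulate normal graceful shutdown work

    def force_remove(self):
        print(f"{self.name}: forcibly removed")

async def scale_in(replica, close_timeout=2.0):
    try:
        await asyncio.wait_for(replica.close(), timeout=close_timeout)
        print(f"{replica.name}: closed gracefully")
    except asyncio.TimeoutError:
        # Without this fallback, the scale-in would hang on the unresponsive replica.
        replica.force_remove()

async def main():
    await asyncio.gather(
        scale_in(FakeReplica("healthy-instance", responsive=True)),
        scale_in(FakeReplica("stuck-instance", responsive=False)),
    )

asyncio.run(main())
```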

esbenbach commented 3 years ago

@abhishekram I completely missed this because the response was so late. The exact behaviour back then was that the replicas that were "OK" were being scaled in (removed) while the ones in the "bad" state were kept.

We have since reduced the amount of work we do during startup so this type of issue rarely occurs for us any more.

I'm guessing it could be easily reproduced by "faking" the perpetual "InBuild" state in every other instance or some such, and then doing a scale-in, of course.

Might make more sense to just close this issue and we can re-open it again in case someone starts complaining :)