microsoft / service-fabric

Service Fabric is a distributed systems platform for packaging, deploying, and managing stateless and stateful distributed applications and containers at large scale.
https://docs.microsoft.com/en-us/azure/service-fabric/
MIT License

Load balancing/shifting of services between nodes #783

Open usamaazhar opened 6 years ago

usamaazhar commented 6 years ago

We are working on a Service Fabric application architecture in which multiple instances of a guest executable (Node.js) service are running. By default, Service Fabric moves services from one node to another to balance load when it is high. The problem we are facing is that each guest executable has a large file associated with it, which it downloads as part of its initialization. When the service moves to another node, the file does not move with it, which is a serious problem for us because the file is critical. Is there a way to move the file when the service moves? Or is there any recommendation for addressing this problem?

raunakpandya commented 6 years ago

There are a couple of options, depending on whether the file is read-only or read-write. If it is read-only, you can pre-download the file on all nodes using a VM extension or a Service Fabric setup entry point; that way, when the services move, the file is already present. If it is read-write, you need to make sure the file is available at a shared location accessible to all the nodes the service can move to. One option is to use an Azure file share. Going a step further, you can containerize your application and use the new Azure Files Docker volume driver https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-containers-volume-logging-drivers. In the future we are also going to come up with an SF Block Store volume driver, where the volume exposed to your container is replicated along with your service. :)
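
For the read-only case, the "file is already present" check can be sketched roughly as below. This is only an illustration, not a Service Fabric API: `ensure_local_copy`, the cache directory, and the `fetch` callback are hypothetical names, and the sketch assumes the script runs on each node (for example from a setup entry point or at service startup) and that a node-local directory is writable by the service.

```python
import os
import tempfile
from typing import Callable


def ensure_local_copy(cache_dir: str, file_name: str,
                      fetch: Callable[[str], None]) -> str:
    """Return the path of file_name inside cache_dir, fetching it only if absent.

    fetch is a hypothetical callback that writes the file to the given path
    (for example, a download from wherever the file is hosted).
    """
    os.makedirs(cache_dir, exist_ok=True)
    target = os.path.join(cache_dir, file_name)
    if os.path.exists(target):
        return target  # already present on this node, nothing to download

    # Write to a temporary file in the same directory, then rename it into
    # place, so other processes never see a partially written file.
    fd, tmp_path = tempfile.mkstemp(dir=cache_dir,
                                    prefix=file_name + ".", suffix=".part")
    os.close(fd)
    try:
        fetch(tmp_path)
        os.replace(tmp_path, target)
    finally:
        # Only true if fetch failed before the rename happened.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
    return target
```

If several instances on a node need the same file, pointing them all at one cache directory means the file is fetched once per node rather than once per instance. Whether that applies depends on whether the instances actually share a file.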

usamaazhar commented 6 years ago

@raunakpandya We are trying to incorporate an Azure file share, but that would introduce unnecessary delays into our workflow. Pre-downloading on all nodes would not suit our case either: our application runs at large scale and could potentially run thousands of instances, so we would use up a lot of disk space if there are (totalNodes)*(instances) files.
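
To make the space concern concrete, here is a rough back-of-the-envelope sketch. The node count, instance count, and file size are made-up illustrative values, not figures from our deployment, and it assumes each instance has its own file of the same size.

```python
def predownload_footprint_gib(total_nodes: int, instances: int,
                              file_size_gib: float) -> float:
    """Total disk used if every instance's file is pre-downloaded on every node."""
    return total_nodes * instances * file_size_gib


def shared_footprint_gib(instances: int, file_size_gib: float) -> float:
    """Total storage used if each instance's file is kept once in a shared location."""
    return instances * file_size_gib


# Hypothetical numbers: 20 nodes, 1000 instances, 2 GiB per file.
everywhere = predownload_footprint_gib(20, 1000, 2.0)
shared = shared_footprint_gib(1000, 2.0)
print(f"pre-download on all nodes: {everywhere:.0f} GiB, shared: {shared:.0f} GiB")
```

Under these assumptions, pre-downloading multiplies the storage requirement by the number of nodes.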