qinxgit closed this 1 year ago
Hi @qinxgit
We have decided to no longer maintain this project. We have moved on from Service Fabric to Dapr in Azure App Service. If your organization would like to do the same, let us know, as we may be able to help. Please fork the project if you would like to make changes to it.
I have a similar performance issue when operating this service in production. Sometimes the GetCachedItemAsync operation takes 10 times its 90th-percentile latency. I have 5 partitions, each with 3 replicas. Sometimes it stays that way for a couple of minutes, causing timeouts in my other services and dropping my service's availability. My service has HasPersistentState = false.
I have 2 suspects:
1. distributedCacheLocator.GetCacheStoreProxy uses a dictionary to cache the proxy, and the entries never expire. Over time, the Service Fabric runtime might move the primary replica of the cache store to a different node. That might cause an issue, or it might not; I am not sure.
2. I suspect that if all Get and Set operations run on a single replica (the primary), it becomes a bottleneck. If other services on the node hosting the primary replica are consuming CPU, that would slow the operations down.
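To make the first suspect concrete, here is a minimal sketch of a proxy cache whose entries do expire. It is not the library's code: `ExpiringProxyCache`, its `createProxy` factory, the `long` partition key, and the injected clock are all hypothetical names chosen for illustration, and whether a stale proxy is actually the cause of the slowdown is still unverified.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper, not part of the library: caches one proxy per partition
// and rebuilds it once the cached entry is older than a time-to-live.
public sealed class ExpiringProxyCache<TProxy>
{
    private readonly ConcurrentDictionary<long, (TProxy Proxy, DateTimeOffset CreatedAt)> _entries = new();
    private readonly Func<long, TProxy> _createProxy;   // assumed factory, e.g. wraps proxy creation
    private readonly TimeSpan _timeToLive;
    private readonly Func<DateTimeOffset> _now;         // injected clock so expiry can be tested

    public ExpiringProxyCache(Func<long, TProxy> createProxy, TimeSpan timeToLive, Func<DateTimeOffset> now)
    {
        _createProxy = createProxy;
        _timeToLive = timeToLive;
        _now = now;
    }

    public TProxy Get(long partitionKey)
    {
        var now = _now();

        // Reuse the cached proxy only while it is younger than the time-to-live.
        if (_entries.TryGetValue(partitionKey, out var entry) && now - entry.CreatedAt < _timeToLive)
        {
            return entry.Proxy;
        }

        // Missing or expired: build a fresh proxy and overwrite the old entry.
        var fresh = (Proxy: _createProxy(partitionKey), CreatedAt: now);
        _entries[partitionKey] = fresh;
        return fresh.Proxy;
    }

    // Lets a caller drop an entry early, for example after a timeout.
    public void Invalidate(long partitionKey) => _entries.TryRemove(partitionKey, out _);
}
```

Under concurrent access two callers may both rebuild an expired entry; the last write wins, which is harmless here as long as creating a proxy is cheap.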
My scenario is this: we have a lot of read operations, but write operations are infrequent. So I figured that making the secondary replicas available for reads would ease the problem. I tried but failed because GetCachedItemAsync, to my surprise, includes a SetCachedItemAsync call, which is a write and cannot be performed on a secondary. GetCachedItemAsync is also not virtual, so I cannot change this behavior. Is there any way to implement this? Could you add an option, or make GetCachedItemAsync virtual?
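The option being asked for could look roughly like the sketch below. It assumes the write inside the read path exists to refresh a sliding expiration, which this thread does not confirm. `SketchCacheStore`, `TryGet`, `refreshOnRead`, and `WriteCount` are hypothetical names, and the methods are synchronous and in-memory for brevity rather than the library's async, replicated ones.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical in-memory stand-in for the cache store, not the library's class.
public class SketchCacheStore
{
    private sealed record Entry(byte[] Value, TimeSpan Sliding, DateTimeOffset ExpiresAt);

    private readonly ConcurrentDictionary<string, Entry> _items = new();

    // Counts writes so the effect of refreshOnRead is observable
    // (illustration only; the increment is not thread-safe).
    public int WriteCount { get; private set; }

    public void Set(string key, byte[] value, TimeSpan sliding, DateTimeOffset now)
    {
        _items[key] = new Entry(value, sliding, now + sliding);
        WriteCount++;
    }

    // Virtual so a subclass can change the read behavior.
    // With refreshOnRead = false the read performs no write, which is the
    // property a read served from a secondary replica would need.
    public virtual bool TryGet(string key, DateTimeOffset now, bool refreshOnRead, out byte[] value)
    {
        if (!_items.TryGetValue(key, out var entry) || entry.ExpiresAt <= now)
        {
            value = Array.Empty<byte>();
            return false;
        }

        if (refreshOnRead)
        {
            // Assumed purpose of the write-on-read: push the sliding expiration forward.
            Set(key, entry.Value, entry.Sliding, now);
        }

        value = entry.Value;
        return true;
    }
}
```

The trade-off is that reads with `refreshOnRead = false` no longer extend an item's lifetime, so a frequently read item can still expire unless something on the primary refreshes it.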