When scale_down is triggered, the OpenSearch service is stopped and the cluster status is checked. If the scaling_manager's FetchMetrics is trying to reach OpenSearch to push data at that moment, it can panic and crash the application.
Suggested solutions:
Add a retry mechanism for the OpenSearch connection.
Do not panic in FetchMetrics; log the error and continue.