opensearch-project / security-analytics

Security Analytics enables users to detect security threats in their security event log data. It also allows them to modify or tailor the pre-packaged solution.
Apache License 2.0

[BUG] Cluster dead because detector shard stuck initializing #1235

Open mvanderlee opened 4 months ago

mvanderlee commented 4 months ago

What is the bug? I upgraded a cluster from 2.11.1 to 2.15.0, and the cluster is now in red status because the primary shard of .opensearch-sap-network-detectors-queries-000007 is stuck initializing. The allocation explain output for that shard:

{
    'current_node': {
        'attributes': {'shard_indexing_pressure_enabled': 'true'},
        'id': 'zLNHMkpeReSxfbVn99y1Sw',
        'name': 'opensearch-node1',
        'transport_address': '172.19.0.3:9300',
    },
    'current_state': 'initializing',
    'explanation': 'the shard is in the process of initializing on node [opensearch-node1], wait until initialization has completed',
    'index': '.opensearch-sap-network-detectors-queries-000007',
    'primary': True,
    'shard': 0,
    'unassigned_info': {
        'at': '2024-07-24T12:00:27.551Z',
        'last_allocation_status': 'throttled',
        'reason': 'CLUSTER_RECOVERED',
    },
}
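For anyone trying to gather the same diagnostics, here is a minimal sketch of how output like the above can be requested from the cluster allocation explain API and reduced to one line. It uses only the Python standard library. The base URL `http://localhost:9200`, the absence of authentication and TLS, and the helper names `build_explain_request`, `fetch_explain`, and `summarize_explain` are assumptions for illustration, not part of the original report.

```python
import json
import urllib.request


def build_explain_request(base_url, index, shard=0, primary=True):
    """Build a request for the cluster allocation explain API.

    base_url is an assumption (e.g. "http://localhost:9200"); a secured
    cluster would additionally need credentials and TLS settings.
    """
    body = json.dumps({"index": index, "shard": shard, "primary": primary})
    return urllib.request.Request(
        f"{base_url}/_cluster/allocation/explain",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_explain(base_url, index, shard=0, primary=True):
    """Send the request and return the parsed JSON response as a dict."""
    request = build_explain_request(base_url, index, shard, primary)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def summarize_explain(explain):
    """Reduce an allocation explain response to a one-line summary."""
    info = explain.get("unassigned_info", {})
    node = explain.get("current_node", {}).get("name", "unassigned")
    role = "primary" if explain["primary"] else "replica"
    return (
        f"{explain['index']}[{explain['shard']}] {role} is "
        f"{explain['current_state']} on {node} "
        f"(reason={info.get('reason')}, "
        f"last_status={info.get('last_allocation_status')})"
    )
```

Calling `summarize_explain(fetch_explain("http://localhost:9200", ".opensearch-sap-network-detectors-queries-000007"))` against a reachable cluster would condense the response into a single line that is easier to paste into a report.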

How can one reproduce the bug? Steps to reproduce the behavior: No idea, exactly. Create a cluster on 2.11, add detection rules, upgrade to 2.15, and observe the error.

What is the expected behavior? My cluster should not die because a stupid feature can't start. If anomaly detection is broken, then only let that feature be broken, not my entire cluster! FFS, separate user indices from system indices. The fact that this isn't done, and that they are treated identically, is a super stupid decision.

What is your host/environment?

Do you have any screenshots? If applicable, add screenshots to help explain your problem.

Do you have any additional context? Add any other context about the problem.

kaituo commented 3 months ago

@mvanderlee For detector, are you referring to https://opensearch.org/docs/latest/security-analytics/sec-analytics-config/detectors-config/ ?

mvanderlee commented 3 months ago

@kaituo That's right.

I can't tell you the exact config because we've stopped using Detectors and created our own alerting system.

kaituo commented 3 months ago

@opensearch-project/admin -- Can we please move this to https://github.com/opensearch-project/security-analytics ?

dblock commented 3 months ago

[Catch All Triage - 1, 2, 3]