Open greycel opened 7 months ago
Is that 225,000 events per day? That's not overly large, but I think what is happening is that your events aren't being signatured correctly, which results in a large number of signatures over your search window. Can you run the following via Dev Tools? If the total number of unique signatures is above 10,000, your events are not being signatured and deduplicated correctly. Incorrect signaturing also lets duplicates of the same event come in, which can result in event explosion. I would also recommend upgrading to the latest release.
GET reflex-events/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "status.name: New"
          }
        },
        {
          "range": {
            "created_at": {
              "gte": "now-7d"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "unique_sigs": {
      "cardinality": {
        "field": "signature"
      }
    }
  },
  "size": 0
}
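For anyone who prefers to script this check rather than paste it into Dev Tools, here is a minimal Python sketch of the same query. The helper names (`build_unique_signature_query`, `unique_signature_count`, `signaturing_looks_off`) are made up for illustration and are not part of Reflex. The 10,000 threshold is the rule of thumb from the comment above. Note that the `cardinality` aggregation returns an approximate count.

```python
import json


def build_unique_signature_query(status="New", window="now-7d"):
    """Build the body of the cardinality query from this thread.

    Hypothetical helper: it only assembles the request body.
    """
    return {
        "query": {
            "bool": {
                "must": [
                    {"query_string": {"query": f"status.name: {status}"}},
                    {"range": {"created_at": {"gte": window}}},
                ]
            }
        },
        # Approximate count of distinct values in the "signature" field.
        "aggs": {"unique_sigs": {"cardinality": {"field": "signature"}}},
        # No hits are needed, only the aggregation result.
        "size": 0,
    }


def unique_signature_count(response):
    """Read the aggregation value out of a search response dict."""
    return response["aggregations"]["unique_sigs"]["value"]


def signaturing_looks_off(response, threshold=10_000):
    """Apply the rule of thumb from this thread (threshold is an assumption)."""
    return unique_signature_count(response) > threshold


if __name__ == "__main__":
    print(json.dumps(build_unique_signature_query(), indent=2))
```

With the `opensearch-py` client, the body could be sent with something like `client.search(index="reflex-events", body=build_unique_signature_query())`, assuming `client` is an already configured `OpenSearch` instance.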
Output for the given query: the unique signature count is 144521.
Signatures are used for deduplicating events, and events load properly if the number of signatures in a search window is lower than 10,000. Is my understanding correct?
Signature fields configured for one of the organization's inputs (Windows): "host_name, user_domain, user_name, process_name, process_executable, event_id, event_channel, event_category"
Since the Default Org's admin account can view and load events from multiple organizations, is there any chance that the number of signatures goes above 10,000 and events fail to load when querying from the Default Org's admin login (even when events are correctly signatured)?
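To illustrate why the choice of signature fields drives the unique signature count, here is a rough Python sketch. It assumes a signature is a hash over the values of the configured fields. That is an assumption for illustration only; Reflex's actual signature computation is not shown in this thread and may differ. The `compute_signature` helper and the sample events are hypothetical.

```python
import hashlib

# Field list taken from the Windows input described above.
SIGNATURE_FIELDS = [
    "host_name", "user_domain", "user_name", "process_name",
    "process_executable", "event_id", "event_channel", "event_category",
]


def compute_signature(event, fields):
    """Hypothetical signature: SHA-256 over the selected field values.

    Not Reflex's real implementation, just a stand-in to show the effect.
    """
    parts = [str(event.get(name, "")) for name in fields]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


# Two made-up events that differ only in their timestamp.
base = {
    "host_name": "ws-01", "user_domain": "CORP", "user_name": "alice",
    "process_name": "cmd.exe", "process_executable": "C:\\Windows\\System32\\cmd.exe",
    "event_id": 4688, "event_channel": "Security", "event_category": "process",
}
first = {**base, "timestamp": "2023-01-01T10:00:00Z"}
second = {**base, "timestamp": "2023-01-01T10:05:00Z"}

# Same signature: the two events would collapse into one group.
same = compute_signature(first, SIGNATURE_FIELDS) == compute_signature(second, SIGNATURE_FIELDS)

# Adding a high-cardinality field makes every event unique.
noisy_fields = SIGNATURE_FIELDS + ["timestamp"]
different = compute_signature(first, noisy_fields) != compute_signature(second, noisy_fields)
```

Under this assumption, any field that varies per event (timestamps, process IDs, full command lines) multiplies the number of distinct signatures, which would be one way to end up with a count like the one reported above.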
There is no way to change this in Reflex; it is the default OpenSearch/Elasticsearch max bucket count setting, but you could change it manually in the index settings. Depending on the version you are on, we reworked some of the code related to this, which might solve the issue. I recommend upgrading and seeing whether the issue persists. As a note, we have yet to experience this issue internally, and we have 40+ sub-tenants that the default org can see just fine.
I'm going to second @n3tsurge's comment. You are likely encountering a bug that is fixed in a newer version. For example, we have 301,328 unique signatures in the last 7 days and we can view events across all clients from the Default Organization page.
Hi Team,
1) Event Queue: issue retrieving a large dataset. We've been receiving roughly 225,000 alerts per day from different log sources, and at last check there were 745,000 alerts for the last 7 days. For the last 10-15 days the event queue page has been slow to load and sometimes does not load at all, eventually throwing
opensearchpy.exceptions.TransportError: TransportError(503, 'search_phase_execution_exception')
Below is the error log from the API service.
2) After restarting all the Reflex Docker services, the Memcached container state is displayed as unhealthy, and I've been seeing Memcached problems in the "reflex-api" service logs. The service is reachable and accepts connections when checked with "nc localhost 11211". I'm not sure which Reflex components will be affected by this and need help.
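A plain `nc` connection only proves that the port is open. As a slightly stronger check, the sketch below sends the Memcached text-protocol `version` command and reads the reply. The function name `memcached_version` and its defaults are illustrative. How the container's own health check decides "unhealthy" is not shown in this thread, so this does not necessarily reproduce it.

```python
import socket


def memcached_version(host="localhost", port=11211, timeout=2.0):
    """Send `version` to a Memcached server and return its reply line.

    Returns None if the connection fails, times out, or the reply does not
    look like a Memcached version line.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"version\r\n")
            reply = sock.recv(1024)
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return None
    text = reply.decode("ascii", errors="replace").strip()
    return text if text.startswith("VERSION") else None


if __name__ == "__main__":
    print(memcached_version() or "no valid reply from Memcached")
```

If this returns a version line from the host but the API container still logs Memcached errors, checking the same thing from inside the "reflex-api" container would be a reasonable next step, since `localhost` there refers to the container itself.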