Graylog2 / graylog2-server

Free and open log management
https://www.graylog.org

Graylog federation / multi home #3969

Open jalogisch opened 7 years ago

jalogisch commented 7 years ago

Expected Behavior

With the move to the Elasticsearch REST interface in Graylog 2.3, it should be possible to configure multiple Elasticsearch clusters to look up data from, even if they are in different locations.

Even better would be to talk to the Graylog API to get only the results back from remote.

Current Behavior

If you want some kind of multi-homed Graylog setup, you need a Graylog cluster running at every location and have to forward all (wanted) logs to a central Graylog to work with that data.

Context

This is similar to https://github.com/Graylog2/graylog2-server/issues/1004 and its main idea of building a federation. But since we might be able to talk to different Elasticsearch clusters, the same might be possible for the Graylog API too.

That way, messages would not need to be duplicated and transported from the island into the main Graylog.

billmurrin commented 7 years ago

Tribe node support? I have a use case where I want a GL server in different data centers, but because of latency, these servers cannot be in the same ES cluster. The current GL architecture means I have 3 different servers deployed. As your idea suggests, we can have a central server that can query the remote servers and then roll-up the results and present a singular view to the user.
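The "query the remote servers and roll up the results" idea can be sketched roughly as below. This is only an illustration, not anything Graylog provides: the function name `roll_up`, the `site` field, and the shape of the message dicts are all assumptions, and it presumes each site already returns its hits sorted newest-first with comparable ISO-8601 `timestamp` strings.

```python
import heapq
from typing import Dict, Iterator, List


def roll_up(results_per_site: Dict[str, List[dict]]) -> Iterator[dict]:
    """Merge per-site result lists into one newest-first stream.

    Hypothetical helper: each list must already be sorted newest-first
    by its "timestamp" field.
    """
    def tagged(site: str, messages: List[dict]) -> Iterator[dict]:
        # Record which site a message came from so the combined view
        # can still show its origin.
        for message in messages:
            yield {**message, "site": site}

    streams = [tagged(site, msgs) for site, msgs in results_per_site.items()]
    # heapq.merge lazily merges already-sorted inputs; reverse=True
    # matches the newest-first ordering of the inputs.
    return heapq.merge(*streams, key=lambda m: m["timestamp"], reverse=True)
```

The merge is lazy, so the central server would only need to hold one pending message per site in memory while paging through the combined view.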

joschi commented 7 years ago

@billmurrin The Elasticsearch tribe node is on its way out (see Tribe Nodes & Cross-Cluster Search: The Future of Federated Search in Elasticsearch), so we won't use that for implementing any sort of federation or Multi-DC support. 😉

billmurrin commented 7 years ago

BTW @joschi that was a great article. Thanks for sharing.

fredericve commented 6 years ago

This would be a very useful feature for us as well. We have multiple DC locations, each with its own ES and GL instance. We also have one 'parent' DC in which it would be useful to display the information from all the other DCs. Right now our ops team needs to connect to different web interfaces to find the correct information.

ES tribe node is indeed on its way out, but it's replaced by Cross-Cluster Search. This seems like the feature needed in ES to be able to implement it.

provonet commented 6 years ago

Are there any plans to support cross-cluster search?

heclemen commented 5 years ago

+1 for cross-cluster search support

ethhack commented 5 years ago

This would be something extremely useful and would put Graylog in a place to contend with Splunk's Search Peers - Distributed Search. Can someone please provide an update on if / when this functionality might be implemented, as it is a key factor in my decision to migrate to Graylog or remain on Splunk, as it minimizes the 'normal' data congestion across the wire for our large scale, very wide-spread PCI environment.

zez3 commented 4 years ago

If I understand this correctly, then there is nothing to do on the Graylog side: https://www.elastic.co/blog/tribe-nodes-and-cross-cluster-search-the-future-of-federated-search-in-elasticsearch

"From a search execution perspective, there is no difference between local indices and indices that belong to remote clusters as long as the coordinating node can reach some nodes belonging to the remote clusters. Finally, the hits returned as part of the search response which belong to remote clusters have their index name prefixed with their cluster alias."

Your ES cluster dictates where to search and returns the data. This should be pretty easy to test with an extra ES cluster.
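To make the "index name prefixed with their cluster alias" part concrete, here is a minimal sketch of how a client could tell local hits from remote ones. The helper name `split_remote_index` and the sample names `dc2` and `graylog_42` are made up for illustration; the only thing taken from the quote is that remote hits carry an `alias:index` style name.

```python
from typing import Optional, Tuple


def split_remote_index(index_name: str) -> Tuple[Optional[str], str]:
    """Split a hit's index name into (cluster_alias, index).

    Hypothetical helper: assumes remote hits look like "alias:index"
    and local hits carry the bare index name.
    """
    alias, separator, index = index_name.partition(":")
    if separator:
        return alias, index
    # No ":" in the name, so the hit came from the local cluster.
    return None, index_name
```

Anything on the Graylog side that looks up metadata by index name would presumably have to strip or understand this prefix first.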

ethhack commented 4 years ago

zez3-

Not convinced it’s that simple with Graylog. Pretty sure I’d tried before, and the inability is because of the way Graylog does its indexing under the covers. If you test and find I’m incorrect, I’m all ears. ;-)

You CAN do cross-index searches with Kibana against multiple Elasticsearch clusters, even those with Graylog data. So there are options.

I was just unable to do it natively, from within Graylog, where I’d prefer to use some plugins and correlations I already have defined.

(Again, all of this assumes Graylog free, not Enterprise)

ethhack commented 4 years ago

I’ll do some testing again tomorrow, but I’d also spoken directly to Graylog’s team and they’d said it doesn’t work, currently, although it’s somewhere down the line on their roadmap.

jamallmahmoudi commented 3 years ago

Exactly, we can have a central server that can query the remote servers.

juanjo-vlc commented 3 years ago

I'm sure it all depends on Elasticsearch's ability to list the indices from remote clusters, because Graylog maintains a relationship between time ranges and indices, and performs the search on the indices matching the requested timespan. I'll try to test it next weekend because we'll be in the same position in the near future.
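The relationship between time ranges and indices described above can be pictured with a small sketch. This is not Graylog's actual code: the `IndexRange` class, its field names, and `indices_for_timespan` are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class IndexRange:
    """Hypothetical record: the time span of the messages in one index."""
    index: str
    begin: datetime
    end: datetime


def indices_for_timespan(
    ranges: List[IndexRange], start: datetime, end: datetime
) -> List[str]:
    """Return the indices whose range overlaps the requested timespan."""
    # Two closed intervals overlap when each begins before the other ends.
    return [r.index for r in ranges if r.begin <= end and r.end >= start]
```

If the ranges of remote indices are unknown, a selection like this has nothing to work with, which is why listing remote indices matters.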

juanjo-vlc commented 3 years ago

> I'll try to test it next weekend

I've been performing some tests, and federated searches would require a lot of work. The Indices API does not return indices on remote clusters, so Graylog would have to be in charge of querying the remote clusters and also maintaining the ranges of the indices on them, and then it would have to include the remote cluster alias in the query. To retrieve the indices' ranges, Graylog needs access to the remote cluster's HTTP port.
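As a rough sketch of the last step, including the remote cluster alias in the query, the snippet below builds the comma-separated target list for a search once the matching indices per cluster are known. The function name `search_targets`, the use of `None` as the key for the local cluster, and the sample names are assumptions; only the `alias:index` form comes from Elasticsearch cross-cluster search.

```python
from typing import Dict, List, Optional


def search_targets(indices_by_cluster: Dict[Optional[str], List[str]]) -> str:
    """Build a comma-separated index list for one search request.

    Hypothetical helper: the key None stands for the local cluster,
    every other key is a remote cluster alias.
    """
    targets = []
    for alias, indices in indices_by_cluster.items():
        for index in indices:
            # Remote indices are addressed as "alias:index".
            targets.append(index if alias is None else f"{alias}:{index}")
    return ",".join(targets)
```

For example, `search_targets({None: ["graylog_7"], "dc2": ["graylog_3"]})` yields a string that could be used as the index part of a single search request sent to the local cluster.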