Velocidex / velociraptor

Digging Deeper....
https://docs.velociraptor.app/

Support elastic 8 #1714

Open scudette opened 2 years ago

scudette commented 2 years ago

Elastic 8 breaks support for the _type field.

https://www.elastic.co/guide/en/elasticsearch/reference/7.17/removal-of-types.html

I think we just need to convert our use of _type to a plain type field, but I need to verify that against an Elastic 8 install.
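To make the proposed change concrete, here is a minimal sketch of building a bulk request body without _type in the action metadata. It assumes the old mapping type is simply copied into an ordinary document field named "type", as suggested above. The function name BuildBulkBody and the index and artifact names in the usage note are hypothetical and are not taken from the Velociraptor code base.

```go
package main

import (
	"bytes"
	"encoding/json"
)

// bulkAction is the action metadata line of a bulk "index" operation.
// It deliberately has no _type key, since typed requests are no longer
// supported in Elasticsearch 8.
type bulkAction struct {
	Index struct {
		Index string `json:"_index"`
	} `json:"index"`
}

// BuildBulkBody renders rows as newline-delimited JSON for the _bulk
// endpoint. The former mapping type is stored as a regular "type" field
// inside each document (an assumption based on the comment above).
func BuildBulkBody(index, docType string, rows []map[string]interface{}) ([]byte, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf) // Encode writes a trailing newline per value.

	for _, row := range rows {
		action := bulkAction{}
		action.Index.Index = index
		if err := enc.Encode(action); err != nil {
			return nil, err
		}

		// Copy the row so the caller's map is not modified.
		doc := make(map[string]interface{}, len(row)+1)
		for k, v := range row {
			doc[k] = v
		}
		doc["type"] = docType
		if err := enc.Encode(doc); err != nil {
			return nil, err
		}
	}
	return buf.Bytes(), nil
}
```

Calling BuildBulkBody("velociraptor", "Generic.Client.Info", rows) yields one metadata line and one document line per row. Whether an existing index mapping accepts the extra "type" field would still need checking against a real Elastic 8 install.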

scudette commented 2 years ago

Should be fixed by 0.6.4

scudette commented 2 years ago

This is actually more problematic, because I recently found out that the Elastic client deliberately breaks compatibility with OpenSearch (https://aws.amazon.com/blogs/opensource/keeping-clients-of-opensearch-and-elasticsearch-compatible-with-open-source/).

This does not affect us yet because we use an older version of the Elastic Go client (from before the check was added), so it still kind of works. But when we update our client library, Elastic 8 support will likely be completely gone. To keep both we would need to include two clients and compile them both in. This is not ideal, since each client library is huge and increases our binary size significantly.

I am inclined to drop support for Elastic's own version of Elasticsearch and just go with OpenSearch going forward. This seems more compatible with open source projects, and we have to pick one.

I am opening this issue for further discussions.

mpilking commented 2 years ago

Based on the AWS article you linked to, it looks like they are now producing clients that are intended to be backward and forward compatible. Would using their golang client be an option?

Per the article:

To give these users a clear path forward, the OpenSearch project will add a set of new open source clients that make it easy to connect applications to any OpenSearch or Elasticsearch cluster. These clients will be derived from the last compatible versions of corresponding Elastic-maintained clients before product checks were added. In the spirit of openness and interoperability, we will make reasonable efforts to maintain compatibility with all Elasticsearch distributions, even those produced by Elastic. These clients will let developers continue running their current version of OpenSearch or Elasticsearch with minimal changes to their application code. The new clients will offer the same APIs and functionality they use today.

predictiple commented 2 years ago

An option that might be unpopular and/or unpleasant to think about is to offload the hassle of supporting various data backends to Fluent Bit (or similar log shipper). It's a dedicated and super-fast data shipper, and they maintain support for Elastic, Opensearch and many more: https://docs.fluentbit.io/manual/pipeline/outputs

In that case we just need to decide on the best way to send to Fluent Bit, and then maybe some other clients could be axed from Veloci too. The downside would be loss of bragging rights, because for marketing purposes it's nice to be able to say "we support A-Z!", but in reality it's a lot of maintenance work. In the long term, should Veloci really be carrying its own set of libraries for many data backends, or should it rather support a few dedicated data shippers?

weslambert commented 2 years ago

I'd be hesitant to drop support for Elastic completely, but that's probably because I use it a bit, especially with Security Onion. 😄 Similar to Fluent Bit, additional options might be integration with Kafka or Redis (maybe even an option to do so locally) to allow either to pull from them. Although, you might be able to just get away with the http client plugin.

predictiple commented 2 years ago

Elastic is at the popular end of the scale, and the bulky end of the scale. Google Pub/Sub probably not so much in either aspect. But the argument I'm making is that maybe it shouldn't be Veloci's burden to carry any of them? Permanent installations can afford to have a separate app for data shipping. Transient instances are unlikely to link up to external backends. A dedicated data shipper gets you Kafka, Redis etc. support for free as well as tons of other capabilities that go beyond a simple client library (e.g. data transformations/pipelines, monitoring, fault-tolerance). Obviously with such an approach there would need to be a phase-out period.

predictiple commented 2 years ago

Keep Veloci "lean & mean" is what I'm saying. To quote Elon's 2nd Law: If you're not occasionally adding things back in, you're not deleting enough. The bias tends to be towards "let's add this in, in case we need it" 😄

scudette commented 2 years ago

I know that the AWS article states that their client library will be forward compatible with Elastic, but that was not my experience when I tried it. It actually gets even more complicated, because OpenSearch now has two different client libraries, one for OpenSearch version 1 and one for version 2, and they are not compatible with each other at all.

There are a number of options:

  1. Support all those via different plugins
  2. Make the plugins optional so they are not built in but they can be if needed (we can distribute a "sumo" build with everything enabled)
  3. Maintain an extra fork of the client libraries that does just what's needed and removes the overhead. That's what we do now, because the regular Elastic client is very heavy and not designed for size efficiency.

Most of those external tools use simple upload protocols like HTTP or REST, so the client libraries should be simple. But they are generally not designed well, so they contain a lot of bloat that the compiler cannot optimize away.

I am not in favor of delegating to an external uploader because that just makes it even more complicated to deploy with another dependency.

Maybe a sumo build is the way forward.
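One way option 2 and a "sumo" build could fit together is with Go build constraints: each backend registers itself from its own file, and that file is only compiled when a matching tag is set. The sketch below is an assumption about how this might look. The Uploader interface, the Register and Available helpers, and the tag names sumo and elastic are all hypothetical.

```go
package main

import "sort"

// Uploader is a hypothetical interface each optional backend implements.
type Uploader interface {
	Upload(rows []byte) error
}

// registry maps backend names to factories. It is filled in by init()
// functions in the backend files that were compiled into the binary.
var registry = map[string]func() Uploader{}

// Register makes a backend available under the given name.
func Register(name string, factory func() Uploader) {
	registry[name] = factory
}

// Available lists the backends compiled into this binary, sorted by name.
func Available() []string {
	names := make([]string, 0, len(registry))
	for name := range registry {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// In a real tree everything below would live in its own file, for
// example elastic.go, starting with a build constraint such as:
//
//	//go:build sumo || elastic
//
// so that it is only compiled with `go build -tags elastic` or
// `go build -tags sumo`.

// elasticUploader is a stub standing in for a real backend.
type elasticUploader struct{}

func (elasticUploader) Upload(rows []byte) error { return nil }

func init() {
	Register("elastic", func() Uploader { return elasticUploader{} })
}
```

With this layout a default build carries only the registry, while the sumo tag pulls in every backend file. The cost is that each build flavour would need to be tested and distributed separately.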

weslambert commented 2 years ago

I like the idea of a "sumo" build, or at least a way to enable what you would like. 👍