Open balintnadasi opened 8 months ago
Hi,
I would also be glad to see a DSL backend. In the long term we are considering switching to EQL or ES|QL; however, at the moment we are still using the sigmac converter with some customizations and the dsl output format. A DSL backend for pysigma would enable us to continue using aggregation/correlation queries while making it relatively easy to compare the output from sigmac with the new output of pysigma to ensure that the searches still work as expected. Additionally, as long as ES|QL is still in technical preview, we will probably avoid using that backend in production.
If you also still think that the DSL backend is a useful feature, I could offer to start working on a new DSL backend, since I don't see a dsl branch in this repo or in the forks indicating that somebody is already working on this. If you have already started, I can also maybe help with testing :)
I haven't started a DSL backend, and I don't know of anyone who has. So feel free 😉
Hello @Mat0vu !
I saw that you forked the project and I would be happy to help with my unit tests. Where can I get the JSONQueryBackend?
Hi @balintnadasi ,
Sorry for the late response. I've just updated my fork, where I've been working on the implementation of a DSL backend for Elasticsearch.
Since the DSL uses JSON queries, in contrast to EQL or ESQL, and because it was difficult to get the desired output using the variables provided by the TextQueryBackend, which passes the data to Python's str.format() function, I decided to create a new class, JsonQueryBackend, which could have been included in base.py.
However, in the end the code of my new JsonQueryBackend was almost identical to the TextQueryBackend, with only a few adjustments. That's why I thought it would probably be better to switch back to TextQueryBackend.
Now the DSLBackend is based on the TextQueryBackend again and completely overrides some functions (especially to get correlation rules working) that I could not get to work with JSON and the various str.format() calls within the default superclass.
So far I've managed to implement (hopefully) most of the basic use cases:
composite aggregations.
Currently not supported (only the stuff I know of, so probably not complete):
correlation_search_multi_rule is not implemented yet)
I'm not a specialist regarding Elastic mappings, and all of the fields we are searching in are mapped as keyword fields, for which regex and term queries work well. However, searching in text fields might require some changes to the search type (e.g. a match search)...
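As a rough sketch of the keyword vs. text distinction mentioned above: the helper below picks a query type from a field's mapping type. The function name `build_leaf_query` and its parameters are hypothetical and not part of the backend in the fork; only the term, regexp and match query shapes are standard Elasticsearch Query DSL.

```python
def build_leaf_query(field, value, mapping_type="keyword", is_regex=False):
    """Return a Query DSL leaf clause as a dict (hypothetical helper).

    Assumption: the caller knows the field's mapping type; nothing here
    looks it up in Elasticsearch.
    """
    if mapping_type == "text":
        # Text fields are analyzed, so a match query is usually the better fit.
        return {"match": {field: value}}
    if is_regex:
        # Keyword fields hold the exact value, so regexp sees the whole string.
        return {"regexp": {field: value}}
    # Exact comparison against the stored keyword value.
    return {"term": {field: value}}
```

For example, `build_leaf_query("message", "login failed", mapping_type="text")` yields a match clause, while the default call yields a term clause.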
I would say the backend is far from finished, but the current state seemed to work fine when translating some of our existing rules and comparing the hits with those of the rules that were translated with sigmac. Aggregations also seemed to work fine.
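The hit comparison could look roughly like the sketch below. It assumes both queries have already been run and that the responses are plain dicts in the usual Elasticsearch search response layout (`hits.hits[]._id`); the helper names `hit_ids` and `compare_hits` are made up for illustration.

```python
def hit_ids(response):
    """Collect the document IDs from a search response dict."""
    return {hit["_id"] for hit in response.get("hits", {}).get("hits", [])}


def compare_hits(old_response, new_response):
    """Report which documents only one of the two queries matched."""
    old_ids = hit_ids(old_response)
    new_ids = hit_ids(new_response)
    return {
        "only_old": sorted(old_ids - new_ids),  # matched by the sigmac query only
        "only_new": sorted(new_ids - old_ids),  # matched by the pysigma query only
        "common": sorted(old_ids & new_ids),
    }
```

Two translations can be treated as equivalent for a given data set when both `only_old` and `only_new` are empty. Note that this only compares the returned page of hits, so the result size has to be large enough to cover all matches.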
I will not be able to continue working on this topic for the next few weeks, and because ES|QL is going to be fully supported by Elastic >=8.14, we are currently considering switching to the new language. Anyway, you are very welcome to add unit tests and improve the code :) If I find time, I will also try to continue working on this; however, that won't be possible in the next few weeks...
@balintnadasi / @Mat0vu Is this still an issue, and would you like to prepare a pull request for ES-DSL in the future, or has EQL/ES|QL successfully superseded the need?
Hi @andurin, because my team is currently switching to ESQL, we do not need DSL support anymore. If @balintnadasi or anyone else still wants DSL, they can use the code from here as a starting point.
Hello @andurin !
I revised @Mat0vu's code to get a query that approximates the old sigmac output (unfortunately, regex filters seemed slower in some cases). For now, I'm facing some escaping issues, and I'm working on resolving them. If all goes well (and I have some time before Christmas), I hope to create the merge request in December.
Hi guys!
Is there any chance that this backend will support pure DSL query generation in the near future?