Closed: fearful-symmetry closed this issue 2 years ago.
Pinging @elastic/integrations (Team:Integrations)
Pinging @elastic/security-external-integrations (Team:Security-External Integrations)
Relates to #663, but we don't have a firm target yet. So if there's an alternative that unblocks you it would be worth considering.
The size of the content in question is about 110 KiB uncompressed.
I think part of the problem here is that we ship it down to ALL Agents, regardless of whether the Elastic Agent is running on Windows or not. So if a user has 10,000 Linux machines, 10,000 × ~100 KiB of traffic is used for each change, even though it is not relevant to any of those Elastic Agents.
The first thing that came to my mind when I read "packaging with Filebeat" was lightweight modules. It goes a bit against the goal of getting all the logic out of the Beats, but maybe it's a temporary middle step?
Before we add something like lightweight modules I'd like to look into moving parts of the javascript code to ingest pipeline. A quick look shows that many of the lines of code are taken up with mapping things like event codes to textual representations. I estimate we could go from 110KiB to around 38KiB, just by moving these tables and lookups. This is part of the work we would be doing for #663, but maybe by doing this first it would save enough space to give us time to finish #663 without lightweight modules or other options of splitting packaging.
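To make the proposal above concrete, here is a hedged sketch of what moving one of those code-to-text lookup tables into an Elasticsearch ingest pipeline could look like. The pipeline name, the `event_actions` map, and the Painless snippet are all hypothetical illustrations, not the actual pipeline that was eventually shipped; the event codes shown (4624/4625/4634) are standard Windows Security logon events.

```json
{
  "description": "Sketch: map Windows event codes to textual actions server-side instead of in agent-side JS",
  "processors": [
    {
      "script": {
        "lang": "painless",
        "params": {
          "event_actions": {
            "4624": "logged-in",
            "4625": "logon-failed",
            "4634": "logged-out"
          }
        },
        "source": "String code = ctx.winlog?.event_id?.toString(); String action = code != null ? params.event_actions[code] : null; if (action != null) { if (ctx.event == null) { ctx.event = [:]; } ctx.event.action = action; }"
      }
    }
  ]
}
```

Because the lookup table lives in the pipeline definition on the Elasticsearch side, it is installed once per cluster rather than being shipped to every agent with each policy change.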
I know we've talked briefly about it in the past, but this might be a good time to bring up running any JavaScript through a minifier as part of the package compilation process. A quick test of running the script through an online minifier plus `gzip -9` clocked it at 13,250 bytes, roughly 1/9 the original size. So, to @ruflin's point, doing that alone would save ~900 MB of bandwidth for a 10k-agent deployment. Regardless of how quickly we move things to ingest pipelines, I think we should still consider the minification route; it doesn't make sense to transfer pretty-printed scripts, complete with comments, over the wire for no reason.
+1 for the minifier. Minification alone, without gzip, brought that script down to 65 KiB. As for compression, is there any reason not to do that for the whole policy?
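The size savings discussed above are easy to reproduce. Here is a small Python sketch that measures the gzip ratio on a stand-in payload; the script body is hypothetical filler (repetition mimicking the lookup tables that dominate the real winlog processor script), not the actual file from the issue.

```python
import gzip

# Stand-in for the ~110 KiB JS processor script (hypothetical content);
# the repeated lookup-table lines compress very well, like the real file.
script_js = (
    "var eventCodes = { 4624: 'logged-in', 4625: 'logon-failed' };\n"
    "// map event codes to textual representations\n"
) * 1500

raw = script_js.encode("utf-8")
compressed = gzip.compress(raw, compresslevel=9)  # same level as `gzip -9`

print(f"raw: {len(raw)} bytes, gzip -9: {len(compressed)} bytes "
      f"({100 * (1 - len(compressed) / len(raw)):.1f}% smaller)")
```

The exact ratio depends on the content, but table-heavy JavaScript like this typically compresses far better than the ~40% a minifier alone achieves.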
I like the idea of the minifier. At what stage of the development would that happen?
Also a fan of the proposal from @leehinman to move the lookup tables first as it has a major impact.
For the compression of the policy, I think this is something we should look into anyways but independent from this issue (if we aren't already doing it). @nchaulet ?
> I like the idea of the minifier. At what stage of the development would that happen?
Unrelated to this specific use case (JS asset minification), but along similar lines, I've been discussing with @mtojek using the `elastic-package build` step to produce "compiled" packages. Today this process is mostly a no-op, with a couple of exceptions:

- It generates the `docs/README.md` file from the `_dev/build/docs/README.md` file.
- It removes `**/_dev` folders from the "compiled" package contents, since those are meant to contain development-time-only files, e.g. test files.

We could add minification of JS assets as another step in addition to the above.
As always, this would run automatically in the `integrations` repo as part of CI. But package authors who maintain their packages outside of the `integrations` repo will need to take care to run `elastic-package build` before bringing their package into the `package-storage` repo (side note: this doesn't seem to be the case today with some packages).
Definite +1 on the minifier. Shouldn't be too hard to implement?
For the compression, the Kibana response should be gzipped; it's maybe something we want to implement in Fleet Server too.
@blakerouse @urso Should we have a config option in the fleet-server for whether policies are shipped down compressed or not?
Keep in mind that minification makes debugging harder (there won't be any source maps that can uncompress the content) and we'll face many errors similar to this: `Object.getClass undefined in line 1` (everything is in line 1 :)).
If this is only about the network traffic, I believe we should tune the protocol between Kibana and agents (enable gzip, deflate, GRPC compression, etc.) and we should gain similar results.
@ruflin I don't think it's something that would need to be configurable. I think returning a gzip-compressed result from an HTTP response is all that is needed.
> I don't think it's something that would need to be configurable. I think returning a gzip-compressed result from an HTTP response is all that is needed.
+1.
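For illustration, gzipping the policy response amounts to standard HTTP content negotiation. The sketch below is in Python for brevity (Fleet Server itself is written in Go), and `compress_policy_response` is a hypothetical helper, not actual Fleet Server code:

```python
import gzip
import json

def compress_policy_response(policy: dict, accept_encoding: str = "gzip"):
    """Hypothetical sketch: gzip the rendered policy body when the
    client advertises gzip support via Accept-Encoding."""
    body = json.dumps(policy).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if "gzip" in accept_encoding:
        body = gzip.compress(body, compresslevel=9)
        headers["Content-Encoding"] = "gzip"
    headers["Content-Length"] = str(len(body))
    return headers, body

headers, body = compress_policy_response({"inputs": [{"type": "winlog"}]})
# the agent side simply decompresses based on Content-Encoding:
original = json.loads(gzip.decompress(body))
```

Since this is transparent transport-level compression, the script text the agent ultimately sees is unchanged, which preserves the debuggability that minification would lose.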
> Keep in mind that minification makes debugging harder (there won't be any source maps that can uncompress the content) and we'll face many errors similar to this: `Object.getClass undefined in line 1` (everything is in line 1 :)).
Good point. I'm not sure how good the library we use is at error reporting, but it is a good reason not to minify the script. I'd rather opt for compression only. No minifier; being able to "debug" issues based on stack traces/logs is a FEATURE we must keep functioning.
Do we have other integrations with this much JavaScript? Ideally we want to get rid of the JavaScript here, but have we considered handling the JavaScript as a separate artifact that can be downloaded independently (in order to keep the policy "small")?
Just to offer some numbers on the gains of minification vs. compression vs. both, I took the JS from the file linked in the issue description and ran it through https://javascript-minifier.com/, through `gzip -9`, and then through both. Here are the results.
| File | Size (bytes) | Size reduction |
|---|---|---|
| script.js (original, raw JS file) | 112915 | 0% |
| script.min.js (minified) | 67223 | 40.46% |
| script.js.gz (compressed) | 20727 | 81.64% |
| script.min.js.gz (minified + compressed) | 16421 | 85.45% |
Obviously, minified + compressed gives us the most size reduction but just compressed is not that far off. Given the benefit of not minifying (easier to debug), I think just compression is good enough?
> Given the benefit of not minifying (easier to debug), I think just compression is good enough?
+1
++, great to have the numbers.
Hi! We just realized that we haven't looked into this issue in a while. We're sorry! We're labeling this issue as `Stale` to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:. Thank you for your contribution!
Closing, as Windows events now leverage ingest pipelines (since 8.0) and no longer rely on local JavaScript processing.
This was a concern originally raised by @ruflin, but I'm the one filing the issue.
With 7.11, the system module comes packaged with the Windows logging data streams, and they're enabled by default. The problem is that a lot of them are pretty large. Security, for example: https://github.com/elastic/integrations/blob/master/packages/system/data_stream/security/agent/stream/winlog.yml.hbs

Most of this isn't config at all, it's JavaScript:

For the sake of scalability, we need to find another place to put these, presumably packaging them with Agent/Winlogbeat rather than with the config itself, since we don't want Kibana sending out a huge flood of JS every time a user updates a config across a large cluster. We're already adding to these scripts, and that's probably not sustainable in the long term. It also doesn't help debugging: we sometimes need users to provide us with a "rendered" agent config, which ends up thousands of lines long because all of this is packaged into it. Is there a way we can package this JS with Winlogbeat or Agent?
CC @narph @urso