Closed jsoriano closed 2 months ago
> with a dependency to the integration package that collects the data

Just the ability to define such a dependency would be a very useful feature on its own. We have a similar use case in APM, where we would like a Java attacher integration to be able to declare a dependency on the APM integration.
cc @eyalkoren @Mpdreamz
> Java attacher integration

What kind of content would these attacher integrations contain? For this proposal I am excluding in principle anything related to data ingestion.
Also, we don't want to open the door to dependencies in a general sense; the dependencies here would only be between the content package and the package it extends.
In our case it would be a binary that elastic-agent needs to supervise. See https://github.com/elastic/ingest-dev/issues/982 for a discussion on why we feel this responsibility should not fall on the main package.
> Also, we don't want to open the door to dependencies in a general sense; the dependencies here would only be between the content package and the package it extends.

++ fully agree, package management is hard :smile:
That's why the design doc for our feature request around this topic explicitly named them `subpackages`. A subpackage should only ever belong to one main package.
Well, I think that distributing binaries that elastic-agent executes is a different and broader discussion :slightly_smiling_face: I would leave this out of this proposal.
I guess that packages with collector binaries will also need to include information, like mappings, about the additional data these binaries collect; this is something I am explicitly excluding here.
Also, the direction of the dependencies for this kind of package could be different. If we have packages to distribute collector binaries, we could have a "collector" package for Metricbeat, and all integration packages with metrics collected by Metricbeat would depend on it.
This topic popped up in the context of a potential `stream` command in elastic-package that allows you to ship synthetics data for any package in the registry to your cluster: https://github.com/elastic/elastic-package/issues/1541 The problem: the files under `_dev` are not part of the package served by the registry, and the same goes for `testdata` directories and others. These files are removed during the build process (couldn't find the code quickly), which makes sense as they would increase the package size.
The simplest scenario I could see here is that during the publishing process there is an `elastic-package build` and an `elastic-package build -raw` that copies over all the files. This would result in a package like `kubernetes-1.55.0-raw.zip`, which could also be downloaded from the registry and is not directly used by Fleet itself (at least not by default) but is useful for development.
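
To make the difference between the two builds concrete, here is a minimal Python sketch. It is not elastic-package code: `select_files`, `archive_name`, and the `DEV_ONLY_DIRS` set are hypothetical names, the `-raw` flag is only a proposal in this thread, and the unsuffixed archive name for the regular build is an assumption. The only things taken from the discussion are that `_dev` and `testdata` content is dropped from the served package and that the raw archive would be named like `kubernetes-1.55.0-raw.zip`.

```python
from pathlib import PurePosixPath

# Assumption: only the directory names mentioned in the thread are treated as
# development-only. The real build logic is not shown in this discussion.
DEV_ONLY_DIRS = {"_dev", "testdata"}


def select_files(paths, raw=False):
    """Return the files that would end up in the built package.

    raw=False mimics a regular build (development-only directories dropped).
    raw=True mimics the proposed raw build (every file is kept).
    """
    if raw:
        return list(paths)
    return [
        p for p in paths
        if not DEV_ONLY_DIRS.intersection(PurePosixPath(p).parts)
    ]


def archive_name(package, version, raw=False):
    """Build an archive name; the raw variant gets a `-raw` suffix."""
    suffix = "-raw" if raw else ""
    return f"{package}-{version}{suffix}.zip"
```

With a source tree containing `manifest.yml` and `_dev/build/build.yml`, `select_files` keeps only `manifest.yml` by default and both files with `raw=True`.
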
> The simplest scenario I could see here is that during the publishing process there is an `elastic-package build` and an `elastic-package build -raw` that copies over all the files. This would result in a package like `kubernetes-1.55.0-raw.zip`, which could also be downloaded from the registry and is not directly used by Fleet itself (at least not by default) but is useful for development.

I think it could make sense to build and publish source packages, but I would consider this a different issue. This is a common practice in other open source packaging systems, and I have created an issue to support it: https://github.com/elastic/elastic-package/issues/1577
Another option would be to include the build information, something we also want to do (https://github.com/elastic/package-spec/issues/446). This would make it possible to find the source files in the source repository, but it would indeed be an additional step.
Bumping this as it'll be quite important as part of the OTel integrations project, but also for things like https://github.com/elastic/integrations/tree/main/packages/security_detection_engine which has its own set of issues with installation due to its size. Having it designated as a content-only integration and going through a different installation mechanism more optimized for large integrations with many assets will be a positive change overall for Fleet's memory footprint and the overall UX of integrations like this.
Tweaking the title + labels so this appears properly on the OTel board as a milestone
Thinking about the OTel use case, we should probably relax the restrictions on dependencies with integration packages. With OTel there may be no integration package that collects the data.
We should look to implement a separate installation path in Kibana that's optimized for content-only integrations. We already have a use case for this with large packages like https://github.com/elastic/integrations/tree/main/packages/security_detection_engine that run into memory issues when bulk deleting/importing assets during the existing installation process.
@xcrzx has been doing great work on improving the memory pressure during package installation for the rules package here https://github.com/elastic/kibana/issues/187969, but these are only incremental improvements that don't necessarily address the root cause in the long term.
For content-only integrations, I think we could optimize Fleet's installation code to avoid potentially expensive operations in a situation where a package has many assets.
Initial definition for `content` packages merged in the spec: https://github.com/elastic/package-spec/pull/777. Planned to be released as beta in 3.4.0.
Next steps will be to prepare support in elastic-package and the Package Registry. And eventually add support to distribute more kinds of assets and resources.
I think we can consider this done because the UI work is being tracked separately in https://github.com/elastic/kibana/issues/192484.
> Support for discovery features in package-registry.

@mrodm is there anything left to do here to expose the `discovery` properties in EPR? An issue to implement discovery in Kibana is part of the requirements in the above UI issue, and I don't think we have anything left to implement here.
> Next steps will be to prepare support in elastic-package and the Package Registry. And eventually add support to distribute more kinds of assets and resources.

I created https://github.com/elastic/package-spec/issues/803 as a follow-up to support more asset types.
With follow-up issues created for the remaining scope I'm closing this. Our initial support for content packages is implemented and well tested. Thanks all!
> > Support for discovery features in package-registry.
>
> @mrodm is there anything left to do here to expose the `discovery` properties in EPR? An issue to implement discovery in Kibana is part of the requirements in the above UI issue, and I don't think we have anything left to implement here.

@kpollich What is still missing is support in the Elastic Package Registry to show/search packages according to their discovery features, that is, support for requests in EPR like `GET /search?discovery=fields:process.pid,user.id` that return packages that can leverage documents that include the `process.pid` or the `user.id` fields.
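
As a rough illustration of the matching described above, the following Python sketch parses the `discovery` query parameter and returns the packages that declare at least one of the requested fields. This is not the Package Registry implementation: the function names, the in-memory `packages` mapping, and the package names in the usage note are all hypothetical, and the "any field matches" rule is simply one reading of "the `process.pid` or the `user.id` fields".

```python
from urllib.parse import parse_qs, urlsplit


def parse_discovery_fields(url):
    """Extract field names from a `discovery=fields:a,b` query parameter."""
    query = parse_qs(urlsplit(url).query)
    fields = set()
    for value in query.get("discovery", []):
        kind, _, names = value.partition(":")
        if kind == "fields":
            fields.update(name for name in names.split(",") if name)
    return fields


def search(packages, url):
    """Return names of packages declaring at least one requested field.

    `packages` maps a package name to the list of fields it declares for
    discovery (a hypothetical in-memory stand-in for the registry index).
    """
    wanted = parse_discovery_fields(url)
    return [name for name, declared in packages.items() if wanted & set(declared)]
```

For example, given hypothetical packages `process_dashboards` (declaring `process.pid`) and `nginx_content` (declaring `nginx.access.status`), a search for `/search?discovery=fields:process.pid,user.id` would return only `process_dashboards`.
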
cc @jsoriano
Got it - thanks for clarifying + creating that issue, Mario. I'm not sure yet what the priority is on the discovery feature here, but we will clarify that soon 🙂
> Got it - thanks for clarifying + creating that issue, Mario. I'm not sure yet what the priority is on the discovery feature here, but we will clarify that soon 🙂

@kpollich just created an issue for that https://github.com/elastic/package-registry/issues/1229
After a conversation about #346 with @ruflin, we came to the conclusion that we may want to have a package type for additional content, something like DLCs for integration packages.
This package type would only contain assets and references to an integration package. Packages of this type could only be installed if the referenced integration package is installed.
Use cases for these packages:
Data streams, or anything that alters ingestion of data would be excluded from these packages, to avoid complex dependencies and interactions.
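
The two rules proposed here (no data streams, and installation only when the referenced integration package is present) could be checked along these lines. This is a minimal Python sketch of the proposal as originally written, not Fleet or package-spec code; the manifest keys `extends` and `data_streams` and the function name `can_install` are hypothetical.

```python
def can_install(content_package, installed_packages):
    """Return (ok, reason) for a hypothetical content package manifest.

    `content_package` is a dict with the hypothetical keys `extends` (the
    integration package it references) and `data_streams`.
    `installed_packages` is a set of installed package names.
    """
    # Rule 1: anything that alters ingestion is excluded from content packages.
    if content_package.get("data_streams"):
        return False, "content packages must not define data streams"
    # Rule 2: the referenced integration package must already be installed.
    required = content_package.get("extends")
    if required not in installed_packages:
        return False, f"integration package {required!r} is not installed"
    return True, "ok"
```

For instance, a content package with `extends` set to `nginx` would be accepted only when `nginx` is in the set of installed packages.
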
Some notes about this:
`nginx` package view, you can also discover "content" packages for nginx.
Tasks
- `content` package type in package spec.
- `content` packages in package-registry.
- `discovery` features in package-registry https://github.com/elastic/package-registry/issues/1229