Closed djaglowski closed 2 years ago
Parsing log data which is not generated from OpenTelemetry seems critical to me and to this contribution. I would like to see us get the schema incorporated before this happens: https://github.com/open-telemetry/oteps/pull/199 This makes the logging far more useful and relevant when it comes to log correlation.
Parsing log data which is not generated from OpenTelemetry seems critical to me and to this contribution.
This isn't proposing a contribution - only a change in code management. In any case, parsing log data which is not generated from OpenTelemetry is exactly what this code base does, aside from collecting log data. This capability is currently built into every log receiver that uses the log-collection library.
I would like to see us get the schema incorporated before this happens: open-telemetry/oteps#199 This makes the logging far more useful and relevant when it comes to log correlation.
As far as I can tell, your concern is orthogonal to this proposal. I 100% would like to see technology-specific parsing built into the collector, but there's no reason this should be a prerequisite for the generic capabilities that already exist. Can you elaborate on how this relates to whether the log collection code resides within the collector itself, vs being imported as a dependency that is separately managed within the OpenTelemetry project?
I support this proposal
+1
This makes sense to me 👍
+1
+1
Are there consumers outside of the collector contrib repo that we know of? Do we need to leave forwarding type aliases?
observIQ has some builds that use the log-collection repo, but pinned to older releases so no forwarding is necessary in this case. I'm fairly certain this is only outside consumer of the library.
+1 also from me. Would you be open to demoing the log-collector during a SIG call?
+1 This makes sense to me
@jpkrohling Would you mind clarifying what you'd like to see in a demo? There's no longer a standalone executable to demo, so a user-facing demo could focus on configuring the existing log receivers, or I can explain how the log receivers work internally.
or I can explain how the log receivers work internally
That's more like what I had in mind.
We missed the opportunity to do the demo (maybe later), but I think we should move forward with this. @djaglowski happy to review a PR.
@jpkrohling I'm happy to give a brief overview in next week's SIG.
@bogdandrutu I'm wrapping up a few more changes for the final release of log-collection, then will have a couple PRs to execute the merge.
The initial code transfer has been completed in #9922. Further steps will be tracked in #9960.
Background
opentelemetry-log-collection provides substantial functionality for several log receivers in the contrib collector. It exists as a separate repository based on the original proposal to contribute stanza to OpenTelemetry.
Since the contribution, the repository has been reshaped from a standalone agent into a simplified library containing only elements relevant to the OpenTelemetry Collector. Along the way, many enhancements have been made according to feedback from the community. As of this writing, the library's internal data model is in the final stages of an overhaul that will bring it into close alignment with the OpenTelemetry Log Data Model.
Motivation
As a maintainer of the log-collection repository, there have been some benefits to independent maintenance of the project. However, I believe we are nearly at the point where its existence as a standalone repository is both unnecessary and, from a development standpoint, inefficient.
Benefits of merging would include, at least:
Detailed Proposal
Development on the log-collection repository should be wound down. A final release, v0.29.0, should include all meaningful in-progress work. Therefore, any PRs that will not be included in v0.29.0 should be closed and captured as issues on the contrib repo. Likewise, any issues that will not be resolved in v0.29.0 should be moved to contrib. All log receivers using the log-collection library should be updated to v0.29.0, so that the switchover of the codebase will be an isolated action with no expected change to functionality.
A new module should be created at either
pkg/stanza
orpkg/logcollection
. (I slightly prefer the former but defer to the community on whether it is appropriate to reflect the legacy of the codebase in this way.) All packages from the log-collection library should be copied into this new module and updated to reflect their new package paths. Log receivers dependent upon these packages should be updated accordingly. Likely later,internal/stanza
should be colocated by moving it to anadapter
package within the module.Myself and @jsirianni should be listed as code owners of the new module.
Tooling dependency and CI workflow changes can be proposed on a case-by-case basis. I have performed an audit and found only a few very minor differences - nothing that cannot reasonably be ported or dropped.
The log-collection README should be updated to indicate that the standalone library has been deprecated and should provide a link to the new contrib module.
The log-collection repo should be archived in order to preserve development and release history.