Open MadVikingGod opened 2 years ago
I dug into this and I don't think we can avoid this.
This is because the opentelemetry-proto-go only defines the services and the request message with the grpc service. So for us to make HTTP separate from GRPC the service request message would need to be moved.
I dug into this and I don't think we can avoid this.
Should this issue be closed?
I would close this if either we don't think we will ever change how the grpc is generated, or if we decide that this isn't a goal.
If the former, we should put an issue into proto-go to address this.
For the latter, I think we should make an issue to reduce the mods we have in the otlptrace
directory. If you will have the grpc imported no matter what I don't see the need to break HTTP vs grpc.
I will move what I have written on the Slack here. Thanks to Damien who has informed me about this.
grpc/protobuf dependency is bringing in lots of packages and increasing the size tremendously for a simple hello world application. Due to this, SDK becomes unreasonable for constraint environments.
I have a helloworld application instrumented with the API layer. When I link it with the SDK layer, the size is going off the charts. I compiled the application with go build -ldflags="-s -w"
The SDK is using console exporter, otlphttp exporter, resources, semconv
.
2.9M Feb 24 07:44 main.api
11M Feb 24 07:45 main.sdk
It is almost 8MB difference. I profiled both applications with goweight
and the top list is for grpc and protobuf stuff.
5.8 MB [google.golang.org/protobuf/internal/impl](http://google.golang.org/protobuf/internal/impl)
4.0 MB [golang.org/x/net/http2](http://golang.org/x/net/http2)
3.2 MB [google.golang.org/grpc](http://google.golang.org/grpc)
2.4 MB [google.golang.org/grpc/internal/transport](http://google.golang.org/grpc/internal/transport)
2.3 MB [google.golang.org/protobuf/internal/filedesc](http://google.golang.org/protobuf/internal/filedesc)
1.9 MB [golang.org/x/sys/unix](http://golang.org/x/sys/unix)
1.5 MB [github.com/grpc-ecosystem/grpc-gateway/v2/runtime](http://github.com/grpc-ecosystem/grpc-gateway/v2/runtime)
1.4 MB text/template/parse
1.4 MB text/template
1.3 MB [go.opentelemetry.io/otel/sdk/trace](http://go.opentelemetry.io/otel/sdk/trace)
1.3 MB [google.golang.org/protobuf/reflect/protoreflect](http://google.golang.org/protobuf/reflect/protoreflect)
1.3 MB [google.golang.org/protobuf/types/descriptorpb](http://google.golang.org/protobuf/types/descriptorpb)
1.2 MB encoding/xml
1.2 MB [github.com/golang/protobuf/proto](http://github.com/golang/protobuf/proto)
1.1 MB html/template
1.1 MB [google.golang.org/protobuf/encoding/protojson](http://google.golang.org/protobuf/encoding/protojson)
994 kB [google.golang.org/protobuf/reflect/protodesc](http://google.golang.org/protobuf/reflect/protodesc)
945 kB [golang.org/x/text/unicode/norm](http://golang.org/x/text/unicode/norm)
I don’t understand why grpc/protobuf dependencies need to be linked in. Am I missing a configuration or is there a bug in bringing in dependencies. Let me know if you want to see the helloworld application.
Exporters are
"go.opentelemetry.io/otel/exporters/otlp/otlptrace"
"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp"
"go.opentelemetry.io/otel/exporters/stdout/stdouttrace"
@utezduyar protobuf dependencies are needed as otlptracehttp
is basically profobuf over HTTP. The issue is that gRPC dependencies should not be present.
EDIT: TBH I am shocked that go build
does not get rid of the gRPC code 😨
@pellared I am a bit confused. I WireSharked my application and it posted the trace data as JSON to /v1/traces of my local collector. Where is the protobuf then?
I figured it out. It is the resource information that is posted as json but you are right, I saw the posts as application/x-protobuf on WireShark. Thanks for clarifying.
@MadVikingGod Are your findings from 1 year ago still valid? If so, is it a complex task to move the service request message?
@MadVikingGod Are your findings from 1 year ago still valid? If so, is it a complex tax to move the service request message?
Moving the service request is almost certainly not something the proto team want to do, but ultimately it is question for them.
I am not really understanding the problem or the discussion of moving or not due to my lack of decent understanding of the internals (yet at least). Can service request message not be duplicated instead of moving? If the service request message is requiring grpc, how is HTTP 1.1 trace export going to work? I checked C++ SDK and they don't have this dependency. What is really different with go?
@MrAlias I looked at it a bit more and I think I understand it better. I still think my questions are valid regarding for example how C++ can handle it but not go. Anyways. I am trying to figure out if this is not something it can be solved in the https://github.com/open-telemetry/opentelemetry-proto-go by having 2 generated code for the trace service, one with GRPC one without, as separate modules. Opinion?
@MadVikingGod (CC: @paivagustavo as the author of 2371bb0a0
)
I investigated this more and I don't believe it is due to the service message request. I think this is something we can solve it in the opentelemetry-go repo.
otlptracehttp exporter is not using the grpc. https://github.com/open-telemetry/opentelemetry-go/blob/v1.14.0/exporters/otlp/otlptrace/otlptracehttp/client.go#L129
It uses the data type coltracepb.ExportTraceServiceRequest
but that is it. The data type is defined in https://github.com/open-telemetry/opentelemetry-proto-go/blob/v0.19.0/otlp/collector/trace/v1/trace_service.pb.go
which is also not using grpc.
I don't know if go mod why is listing all the dependencies but what is written here is a culprit.
go mod why google.golang.org/grpc
...
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp
go.opentelemetry.io/otel/exporters/otlp/otlptrace/internal/otlpconfig
google.golang.org/grpc
The config data types internal/otlpconfig
is bundling GRPC and HTTP data fields together. Both NewHTTPConfig
and NewGRPCConfig
are returning Config data type which is this:
Config struct {
// Signal specific configurations
Traces SignalConfig
RetryConfig retry.Config
// gRPC configurations
ReconnectionPeriod time.Duration
ServiceConfig string
DialOptions []grpc.DialOption
GRPCConn *grpc.ClientConn
}
go mod why
will show a shortest path in the import graph from the module, it does not show all.
There have been prior attempts to separate otlpconfig
to address this. Ultimately, it will not address the issue because go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp
imports go.opentelemetry.io/proto/otlp
and go.opentelemetry.io/proto/otlp
imports google.golang.org/grpc
. There still exists an indirect dependency on go.opentelemetry.io/proto/otlp
after otlpconfig
was split.
grpc dependency is in otlp/collector/trace/v1/trace_service.pb.gw.go
and otlp/collector/trace/v1/trace_service_grpc.pb.go
but not in otlp/collector/trace/v1/trace_service.pb.go
which is used by otlptracehttp
. If the go compiler cannot figure out that it should not bring in the grpc dependency on an unused code (probably because of interface related decision) then I would say maybe https://github.com/open-telemetry/opentelemetry-proto-go
should be split to have generated files to different modules. At least I believe it is more possible than convincing the proto team.
I wanted to see what more things are dependent on grpc and experimented. otlpconfig and proto modules as we discussed before are the only dependencies.
otlpconfig split: go.opentelemetry.io/otel/exporters/otlp/otlptrace/internal/otlpconfig/options.go: Removed all the mentions of grpc (for the sake of moving foward).
proto files: go.opentelemetry.io/proto/otlp/otlp/collector/trace/v1: I deleted both the service and the gateway files (trace_service.pb.gw.go and trace_service_grpc.pb.go).
This way, grpc dependency was dropped and my binary size was down 20%.
I would like to discuss the possibility of splitting service and gateway files in a separate module to get around go compiler downside. Some contributors in this issue are also committers in opentelemetry-proto-go. What is the best way to discuss this topic with opentelemetry-proto-go?
I would like to discuss the possibility of splitting service and gateway files in a separate module to get around go compiler downside. Some contributors in this issue are also committers in opentelemetry-proto-go. What is the best way to discuss this topic with opentelemetry-proto-go?
Can you provide a proposal for how go.opentelemetry.io/proto/otlp
will be changed and how the opentelemetry-proto-go will be updated to support the changes?
I think the protobuf definition is already good enough and the code generator is doing it's best at splitting code based on different use cases. Therefore I do NOT believe we need a change from the go.opentelemetry.io/proto/otlp repo.
The collector/
under the opentelemetry-proto-go is having 3 logical go file separation for 3 telemetry types. One file for the data type, one file for the grpc client and one file for the grpc server.
My propose is to have grpc client and grpc server into it's own module in the opentelemetry-proto-go repo.
Again, if the go linker could have been able to figure out that it does not need to link the grpc libraries because the code is not used, we wouldn't need to break the generated go files.
The
collector/
under the opentelemetry-proto-go is having 3 logical go file separation for 3 telemetry types. One file for the data type, one file for the grpc client and one file for the grpc server.My propose is to have grpc client and grpc server into it's own module in the opentelemetry-proto-go repo.
Your proposal is to duplicate the code in go.opentelemetry.io/proto/otlp
into new modules? This is not a clear solution to me.
Not duplicate but split the files in the collector/
. https://github.com/open-telemetry/opentelemetry-proto-go/tree/main/otlp/collector/trace/v1
https://github.com/open-telemetry/opentelemetry-proto-go/tree/main/otlp/collector/metrics/v1
and https://github.com/open-telemetry/opentelemetry-proto-go/tree/main/otlp/collector/logs/v1
.
Let's take the tracing as example:
Module 1: https://github.com/open-telemetry/opentelemetry-proto-go/blob/main/otlp/collector/trace/v1/trace_service.pb.go
Module 2: https://github.com/open-telemetry/opentelemetry-proto-go/blob/main/otlp/collector/trace/v1/trace_service.pb.gw.go
and https://github.com/open-telemetry/opentelemetry-proto-go/blob/main/otlp/collector/trace/v1/trace_service_grpc.pb.go
Let's take the tracing as example: Module 1:
https://github.com/open-telemetry/opentelemetry-proto-go/blob/main/otlp/collector/trace/v1/trace_service.pb.go
Module 2:https://github.com/open-telemetry/opentelemetry-proto-go/blob/main/otlp/collector/trace/v1/trace_service.pb.gw.go
andhttps://github.com/open-telemetry/opentelemetry-proto-go/blob/main/otlp/collector/trace/v1/trace_service_grpc.pb.go
How do you plan to split go.opentelemetry.io/proto/otlp/collector/trace
and have two packages with exactly the same package name? Modules, and packages, are not split by files, they are split by directories.
If you move something out of go.opentelemetry.io/proto/otlp/collector
it will be a breaking change to the exiting package. This will be backwards incompatible.
I see your point. If we cannot break backwards compatibility, then not sure what options we have left but to duplicate go.opentelemetry.io/proto/otlp
into new modules.
It is not going to help to make changes in the proto files since all the generated code goes into one module, go.opentelemetry.io/proto/otlp
, that the linker is going to pick up grpc code, even when they are unused.
Does this have a known performance impact, or is it about disk space?
For my use case it is disk space. I would expect minimum to 0 impact on the performance. Maybe extra decompression if UPX is used and wasted memory when UPX is unpacked to the memory. If UPX is not used, I believe the pages of the grpc code will never be brought in to the memory if they are not called and not impact anything.
Wasting some megabytes in persistent space is probably not a problem for cloud services but I believe any megabyte saved in constraint environments is important, especially for something that is not in use.
I would close this if either we don't think we will ever change how the grpc is generated, or if we decide that this isn't a goal.
If the former, we should put an issue into proto-go to address this. For the latter, I think we should make an issue to reduce the mods we have in the
otlptrace
directory. If you will have the grpc imported no matter what I don't see the need to break HTTP vs grpc.
Even after 1 year, I believe @MadVikingGod 's comments are still relevant (sorry, it took me a while to come to the same point). I have created an issue in the opentelemetry-proto-go repo (above).
If we would like to solve this issue, I cannot come up with a better plan than below. A) Split proto-go files in modules in a backwards incompatible way and publish both major versions. B) Work on the internal/otlpconfig to split grpc dependencies.
If we don't want to split the dependencies, then maybe we should close this ticket and open a new one to reduce the mods in the otlptrace
The last option is to mark the ticket as we will not make any change. If we pick this option, I am considering to carry a downstream patch that removes the grpc dependencies and use the patch.
The slim/otlp
module was just released. This issue should be resolvable now.
Huge thanks @pellared for this. I cannot wait to test it to see that the binary size has gone down a lot.
@utezduyar, I will try to get it done for next release.
Unfortunately, it is not possible (without making a breaking change) to remove the dependency on google.golang.org/grpc
in otlptracehttp
because of https://pkg.go.dev/go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp#NewClient as https://pkg.go.dev/go.opentelemetry.io/otel/exporters/otlp/otlptrace#Client exposes the proto package which depends on google.golang.org/grpc
.
The fix would require creating v2 modules. For otlptracehttp
and otlptracegrpc
. After these are created we still need to support the v1 modules for at least a year. I think for ease of review we should create the new v2 modules of otlptracehttp
and otlptracegrpc
"from scratch" (similar to otlploghttp
).
More: https://github.com/open-telemetry/opentelemetry-go/pull/5221
@pellared are you planning to split the otlptracehttp work to its own issue so we can close this with #5222?
@pellared are you planning to split the otlptracehttp work to its own issue so we can close this with #5222?
Done:
Fixed for metrics. For tracing we need v2 modules (tracked as separate issues).
FYI. it looks like its no longer possible to have both otelmetrichttp and otelmetricgrpc in the same binary. I'm not sure how common this is, but go.opentelemetry.io/contrib/exporters/autoexport is an example.
$ git clone https://github.com/open-telemetry/opentelemetry-go-contrib.git
$ cd opentelemetry-go-contrib/exporters/autoexport/
$ go get go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp@main
go: upgraded go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp v1.25.0 => v1.25.1-0.20240422142149-b34cfc47c4e0
go: added go.opentelemetry.io/proto/slim/otlp v1.2.0
$ go get go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc@main
go: downloading go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc v1.25.1-0.20240422142149-b34cfc47c4e0
go: upgraded go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc v1.25.0 => v1.25.1-0.20240422142149-b34cfc47c4e0
$ go test ./...
panic: proto: file "opentelemetry/proto/common/v1/common.proto" is already registered
previously from: "go.opentelemetry.io/proto/otlp/common/v1"
currently from: "go.opentelemetry.io/proto/slim/otlp/common/v1"
See https://protobuf.dev/reference/go/faq#namespace-conflict
He had to revert the changes because of the comment above.
More on the issue:
If a .proto file does not specify a package name or uses an overly generic package name (for example, “my_service”), then there is a high probability that declarations within that file will conflict with other declarations elsewhere in the universe.
A .proto file's name (as understood by protoc) is a primary key for its file descriptor. If multiple files have the same name, then looking up files by name breaks. The protobuf maintainers consider this to be sufficient reason to forbid naming conflicts.
It is possible to look up a file descriptor by file name. In Go, this is protoregistry.FindFileByPath. FindFileByPath does not allow for multiple files with the same name; it assumes that every file name is unique.
While protoregistry.FindFileByPath is a Go API, there are equivalents in C++, Java, and other languages. None of them, so far as I know, allow for multiple files with the same name.
We would need to use the same packages for HTTP and gRPC OTLP exporters.
For me it looks like hard to fix (it would require changes in google.golang.org/grpc
). I think we could only provide some OTLP Slim HTTP exporters which would not work with together (in the same process) with OTLP gRPC exporters. I do not find it feasible.
Unfortunately, I am leaning towards closing it as "won't fix" leaving the issue open.
Could this be fixed with a v2 of proto-go where the protobufs wouldn't be in a slim package, but always in separate packages from the grpc service?
AFAIU it won't help as as they will still have the same "file descriptors".
Removing this out of the v1.26.0
milestone
Could this be fixed with a v2 of proto-go where the protobufs wouldn't be in a slim package, but always in separate packages from the grpc service?
We need to separate the output folder for the grpc method struct and the grpc server, as the HTTP uses the grpc method. Not sure it is feasible for protoc
.
To fix it we would need a fix in https://github.com/golang/protobuf and probably in protoc generation as well.
Description
This was surfaced in the discussion:https://github.com/open-telemetry/opentelemetry-go/discussions/2509
The otlptracehttp client and otlptrace depend on grpc. This is a transitive dependency from internal/otlpconfig. This is counter to why we split these packages into separate mods in the first place.
Environment
Steps To Reproduce
cd exporters/otlp/otlptrace/otlptracehttp
go mod graph | grep grpc
this should be emptygo mod why google.golang.org/grpc
this should report that module does not need grpc.Expected behavior
The HTTP client should not depend on grpc.
Aditional information
otlpconfig
has a config struct that is shared between HTTP client and grpc. This has logic around extracting the variables from the environment. Because the config has settings for grpc this means importingotlpconfig
will import grpc (even if they are later pruned).