open-telemetry / opentelemetry-go

OpenTelemetry Go API and SDK
https://opentelemetry.io/
Apache License 2.0
5.13k stars 1.04k forks source link

otlp*http modules import grpc #2579

Open MadVikingGod opened 2 years ago

MadVikingGod commented 2 years ago

Description

This was surfaced in the discussion:https://github.com/open-telemetry/opentelemetry-go/discussions/2509

The otlptracehttp client and otlptrace depend on grpc. This is a transitive dependency from internal/otlpconfig. This is counter to why we split these packages into separate mods in the first place.

Environment

Steps To Reproduce

  1. cd exporters/otlp/otlptrace/otlptracehttp
  2. go mod graph | grep grpc this should be empty
  3. go mod why google.golang.org/grpc this should report that module does not need grpc.

Expected behavior

The HTTP client should not depend on grpc.

Aditional information

otlpconfig has a config struct that is shared between HTTP client and grpc. This has logic around extracting the variables from the environment. Because the config has settings for grpc this means importing otlpconfig will import grpc (even if they are later pruned).

MadVikingGod commented 2 years ago

I dug into this and I don't think we can avoid this.

This is because the opentelemetry-proto-go only defines the services and the request message with the grpc service. So for us to make HTTP separate from GRPC the service request message would need to be moved.

MrAlias commented 2 years ago

I dug into this and I don't think we can avoid this.

Should this issue be closed?

MadVikingGod commented 2 years ago

I would close this if either we don't think we will ever change how the grpc is generated, or if we decide that this isn't a goal.

If the former, we should put an issue into proto-go to address this. For the latter, I think we should make an issue to reduce the mods we have in the otlptrace directory. If you will have the grpc imported no matter what I don't see the need to break HTTP vs grpc.

utezduyar commented 1 year ago

I will move what I have written on the Slack here. Thanks to Damien who has informed me about this.

grpc/protobuf dependency is bringing in lots of packages and increasing the size tremendously for a simple hello world application. Due to this, SDK becomes unreasonable for constraint environments.

I have a helloworld application instrumented with the API layer. When I link it with the SDK layer, the size is going off the charts. I compiled the application with go build -ldflags="-s -w" The SDK is using console exporter, otlphttp exporter, resources, semconv.

2.9M Feb 24 07:44 main.api
11M Feb 24 07:45 main.sdk

It is almost 8MB difference. I profiled both applications with goweight and the top list is for grpc and protobuf stuff.

5.8 MB [google.golang.org/protobuf/internal/impl](http://google.golang.org/protobuf/internal/impl)
4.0 MB [golang.org/x/net/http2](http://golang.org/x/net/http2)
3.2 MB [google.golang.org/grpc](http://google.golang.org/grpc)
2.4 MB [google.golang.org/grpc/internal/transport](http://google.golang.org/grpc/internal/transport)
2.3 MB [google.golang.org/protobuf/internal/filedesc](http://google.golang.org/protobuf/internal/filedesc)
1.9 MB [golang.org/x/sys/unix](http://golang.org/x/sys/unix)
1.5 MB [github.com/grpc-ecosystem/grpc-gateway/v2/runtime](http://github.com/grpc-ecosystem/grpc-gateway/v2/runtime)
1.4 MB text/template/parse
1.4 MB text/template
1.3 MB [go.opentelemetry.io/otel/sdk/trace](http://go.opentelemetry.io/otel/sdk/trace)
1.3 MB [google.golang.org/protobuf/reflect/protoreflect](http://google.golang.org/protobuf/reflect/protoreflect)
1.3 MB [google.golang.org/protobuf/types/descriptorpb](http://google.golang.org/protobuf/types/descriptorpb)
1.2 MB encoding/xml
1.2 MB [github.com/golang/protobuf/proto](http://github.com/golang/protobuf/proto)
1.1 MB html/template
1.1 MB [google.golang.org/protobuf/encoding/protojson](http://google.golang.org/protobuf/encoding/protojson)
994 kB [google.golang.org/protobuf/reflect/protodesc](http://google.golang.org/protobuf/reflect/protodesc)
945 kB [golang.org/x/text/unicode/norm](http://golang.org/x/text/unicode/norm)

I don’t understand why grpc/protobuf dependencies need to be linked in. Am I missing a configuration or is there a bug in bringing in dependencies. Let me know if you want to see the helloworld application.

Exporters are

"go.opentelemetry.io/otel/exporters/otlp/otlptrace"
"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp"
"go.opentelemetry.io/otel/exporters/stdout/stdouttrace"
pellared commented 1 year ago

@utezduyar protobuf dependencies are needed as otlptracehttp is basically profobuf over HTTP. The issue is that gRPC dependencies should not be present.

EDIT: TBH I am shocked that go build does not get rid of the gRPC code 😨

utezduyar commented 1 year ago

@pellared I am a bit confused. I WireSharked my application and it posted the trace data as JSON to /v1/traces of my local collector. Where is the protobuf then?

pellared commented 1 year ago

@utezduyar The payload send via HTTP is serialized using protobuf .

utezduyar commented 1 year ago

I figured it out. It is the resource information that is posted as json but you are right, I saw the posts as application/x-protobuf on WireShark. Thanks for clarifying.

utezduyar commented 1 year ago

@MadVikingGod Are your findings from 1 year ago still valid? If so, is it a complex task to move the service request message?

MrAlias commented 1 year ago

@MadVikingGod Are your findings from 1 year ago still valid? If so, is it a complex tax to move the service request message?

Moving the service request is almost certainly not something the proto team want to do, but ultimately it is question for them.

utezduyar commented 1 year ago

I am not really understanding the problem or the discussion of moving or not due to my lack of decent understanding of the internals (yet at least). Can service request message not be duplicated instead of moving? If the service request message is requiring grpc, how is HTTP 1.1 trace export going to work? I checked C++ SDK and they don't have this dependency. What is really different with go?

utezduyar commented 1 year ago

@MrAlias I looked at it a bit more and I think I understand it better. I still think my questions are valid regarding for example how C++ can handle it but not go. Anyways. I am trying to figure out if this is not something it can be solved in the https://github.com/open-telemetry/opentelemetry-proto-go by having 2 generated code for the trace service, one with GRPC one without, as separate modules. Opinion?

utezduyar commented 1 year ago

@MadVikingGod (CC: @paivagustavo as the author of 2371bb0a0)

I investigated this more and I don't believe it is due to the service message request. I think this is something we can solve it in the opentelemetry-go repo.

otlptracehttp exporter is not using the grpc. https://github.com/open-telemetry/opentelemetry-go/blob/v1.14.0/exporters/otlp/otlptrace/otlptracehttp/client.go#L129 It uses the data type coltracepb.ExportTraceServiceRequest but that is it. The data type is defined in https://github.com/open-telemetry/opentelemetry-proto-go/blob/v0.19.0/otlp/collector/trace/v1/trace_service.pb.go which is also not using grpc.

I don't know if go mod why is listing all the dependencies but what is written here is a culprit.

go mod why google.golang.org/grpc
...
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp
go.opentelemetry.io/otel/exporters/otlp/otlptrace/internal/otlpconfig
google.golang.org/grpc

The config data types internal/otlpconfig is bundling GRPC and HTTP data fields together. Both NewHTTPConfig and NewGRPCConfig are returning Config data type which is this:

Config struct {
        // Signal specific configurations
        Traces SignalConfig
        RetryConfig retry.Config
        // gRPC configurations
        ReconnectionPeriod time.Duration
        ServiceConfig      string
        DialOptions        []grpc.DialOption
        GRPCConn           *grpc.ClientConn
    }
MrAlias commented 1 year ago

go mod why will show a shortest path in the import graph from the module, it does not show all.

There have been prior attempts to separate otlpconfig to address this. Ultimately, it will not address the issue because go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp imports go.opentelemetry.io/proto/otlp and go.opentelemetry.io/proto/otlp imports google.golang.org/grpc. There still exists an indirect dependency on go.opentelemetry.io/proto/otlp after otlpconfig was split.

utezduyar commented 1 year ago

grpc dependency is in otlp/collector/trace/v1/trace_service.pb.gw.go and otlp/collector/trace/v1/trace_service_grpc.pb.go but not in otlp/collector/trace/v1/trace_service.pb.go which is used by otlptracehttp. If the go compiler cannot figure out that it should not bring in the grpc dependency on an unused code (probably because of interface related decision) then I would say maybe https://github.com/open-telemetry/opentelemetry-proto-go should be split to have generated files to different modules. At least I believe it is more possible than convincing the proto team.

utezduyar commented 1 year ago

I wanted to see what more things are dependent on grpc and experimented. otlpconfig and proto modules as we discussed before are the only dependencies.

otlpconfig split: go.opentelemetry.io/otel/exporters/otlp/otlptrace/internal/otlpconfig/options.go: Removed all the mentions of grpc (for the sake of moving foward).

proto files: go.opentelemetry.io/proto/otlp/otlp/collector/trace/v1: I deleted both the service and the gateway files (trace_service.pb.gw.go and trace_service_grpc.pb.go).

This way, grpc dependency was dropped and my binary size was down 20%.

I would like to discuss the possibility of splitting service and gateway files in a separate module to get around go compiler downside. Some contributors in this issue are also committers in opentelemetry-proto-go. What is the best way to discuss this topic with opentelemetry-proto-go?

MrAlias commented 1 year ago

I would like to discuss the possibility of splitting service and gateway files in a separate module to get around go compiler downside. Some contributors in this issue are also committers in opentelemetry-proto-go. What is the best way to discuss this topic with opentelemetry-proto-go?

Can you provide a proposal for how go.opentelemetry.io/proto/otlp will be changed and how the opentelemetry-proto-go will be updated to support the changes?

utezduyar commented 1 year ago

I think the protobuf definition is already good enough and the code generator is doing it's best at splitting code based on different use cases. Therefore I do NOT believe we need a change from the go.opentelemetry.io/proto/otlp repo.

The collector/ under the opentelemetry-proto-go is having 3 logical go file separation for 3 telemetry types. One file for the data type, one file for the grpc client and one file for the grpc server.

My propose is to have grpc client and grpc server into it's own module in the opentelemetry-proto-go repo.

Again, if the go linker could have been able to figure out that it does not need to link the grpc libraries because the code is not used, we wouldn't need to break the generated go files.

MrAlias commented 1 year ago

The collector/ under the opentelemetry-proto-go is having 3 logical go file separation for 3 telemetry types. One file for the data type, one file for the grpc client and one file for the grpc server.

My propose is to have grpc client and grpc server into it's own module in the opentelemetry-proto-go repo.

Your proposal is to duplicate the code in go.opentelemetry.io/proto/otlp into new modules? This is not a clear solution to me.

utezduyar commented 1 year ago

Not duplicate but split the files in the collector/. https://github.com/open-telemetry/opentelemetry-proto-go/tree/main/otlp/collector/trace/v1 https://github.com/open-telemetry/opentelemetry-proto-go/tree/main/otlp/collector/metrics/v1 and https://github.com/open-telemetry/opentelemetry-proto-go/tree/main/otlp/collector/logs/v1.

Let's take the tracing as example: Module 1: https://github.com/open-telemetry/opentelemetry-proto-go/blob/main/otlp/collector/trace/v1/trace_service.pb.go Module 2: https://github.com/open-telemetry/opentelemetry-proto-go/blob/main/otlp/collector/trace/v1/trace_service.pb.gw.go and https://github.com/open-telemetry/opentelemetry-proto-go/blob/main/otlp/collector/trace/v1/trace_service_grpc.pb.go

MrAlias commented 1 year ago

Let's take the tracing as example: Module 1: https://github.com/open-telemetry/opentelemetry-proto-go/blob/main/otlp/collector/trace/v1/trace_service.pb.go Module 2: https://github.com/open-telemetry/opentelemetry-proto-go/blob/main/otlp/collector/trace/v1/trace_service.pb.gw.go and https://github.com/open-telemetry/opentelemetry-proto-go/blob/main/otlp/collector/trace/v1/trace_service_grpc.pb.go

How do you plan to split go.opentelemetry.io/proto/otlp/collector/trace and have two packages with exactly the same package name? Modules, and packages, are not split by files, they are split by directories.

If you move something out of go.opentelemetry.io/proto/otlp/collector it will be a breaking change to the exiting package. This will be backwards incompatible.

utezduyar commented 1 year ago

I see your point. If we cannot break backwards compatibility, then not sure what options we have left but to duplicate go.opentelemetry.io/proto/otlp into new modules. It is not going to help to make changes in the proto files since all the generated code goes into one module, go.opentelemetry.io/proto/otlp, that the linker is going to pick up grpc code, even when they are unused.

dmathieu commented 1 year ago

Does this have a known performance impact, or is it about disk space?

utezduyar commented 1 year ago

For my use case it is disk space. I would expect minimum to 0 impact on the performance. Maybe extra decompression if UPX is used and wasted memory when UPX is unpacked to the memory. If UPX is not used, I believe the pages of the grpc code will never be brought in to the memory if they are not called and not impact anything.

Wasting some megabytes in persistent space is probably not a problem for cloud services but I believe any megabyte saved in constraint environments is important, especially for something that is not in use.

utezduyar commented 1 year ago

I would close this if either we don't think we will ever change how the grpc is generated, or if we decide that this isn't a goal.

If the former, we should put an issue into proto-go to address this. For the latter, I think we should make an issue to reduce the mods we have in the otlptrace directory. If you will have the grpc imported no matter what I don't see the need to break HTTP vs grpc.

Even after 1 year, I believe @MadVikingGod 's comments are still relevant (sorry, it took me a while to come to the same point). I have created an issue in the opentelemetry-proto-go repo (above).

If we would like to solve this issue, I cannot come up with a better plan than below. A) Split proto-go files in modules in a backwards incompatible way and publish both major versions. B) Work on the internal/otlpconfig to split grpc dependencies.

If we don't want to split the dependencies, then maybe we should close this ticket and open a new one to reduce the mods in the otlptrace

The last option is to mark the ticket as we will not make any change. If we pick this option, I am considering to carry a downstream patch that removes the grpc dependencies and use the patch.

MrAlias commented 5 months ago

The slim/otlp module was just released. This issue should be resolvable now.

utezduyar commented 5 months ago

Huge thanks @pellared for this. I cannot wait to test it to see that the binary size has gone down a lot.

pellared commented 5 months ago

@utezduyar, I will try to get it done for next release.

pellared commented 4 months ago

Unfortunately, it is not possible (without making a breaking change) to remove the dependency on google.golang.org/grpc in otlptracehttp because of https://pkg.go.dev/go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp#NewClient as https://pkg.go.dev/go.opentelemetry.io/otel/exporters/otlp/otlptrace#Client exposes the proto package which depends on google.golang.org/grpc.

The fix would require creating v2 modules. For otlptracehttp and otlptracegrpc. After these are created we still need to support the v1 modules for at least a year. I think for ease of review we should create the new v2 modules of otlptracehttp and otlptracegrpc "from scratch" (similar to otlploghttp).

More: https://github.com/open-telemetry/opentelemetry-go/pull/5221

MrAlias commented 4 months ago

@pellared are you planning to split the otlptracehttp work to its own issue so we can close this with #5222?

pellared commented 4 months ago

@pellared are you planning to split the otlptracehttp work to its own issue so we can close this with #5222?

Done:

pellared commented 4 months ago

Fixed for metrics. For tracing we need v2 modules (tracked as separate issues).

gregoryfranklin commented 4 months ago

FYI. it looks like its no longer possible to have both otelmetrichttp and otelmetricgrpc in the same binary. I'm not sure how common this is, but go.opentelemetry.io/contrib/exporters/autoexport is an example.

$ git clone https://github.com/open-telemetry/opentelemetry-go-contrib.git
$ cd opentelemetry-go-contrib/exporters/autoexport/
$ go get go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp@main
go: upgraded go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp v1.25.0 => v1.25.1-0.20240422142149-b34cfc47c4e0
go: added go.opentelemetry.io/proto/slim/otlp v1.2.0
$ go get go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc@main
go: downloading go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc v1.25.1-0.20240422142149-b34cfc47c4e0
go: upgraded go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc v1.25.0 => v1.25.1-0.20240422142149-b34cfc47c4e0
$ go test ./...
panic: proto: file "opentelemetry/proto/common/v1/common.proto" is already registered
    previously from: "go.opentelemetry.io/proto/otlp/common/v1"
    currently from:  "go.opentelemetry.io/proto/slim/otlp/common/v1"
See https://protobuf.dev/reference/go/faq#namespace-conflict
pellared commented 4 months ago

He had to revert the changes because of the comment above.

More on the issue:

If a .proto file does not specify a package name or uses an overly generic package name (for example, “my_service”), then there is a high probability that declarations within that file will conflict with other declarations elsewhere in the universe.

A .proto file's name (as understood by protoc) is a primary key for its file descriptor. If multiple files have the same name, then looking up files by name breaks. The protobuf maintainers consider this to be sufficient reason to forbid naming conflicts.

It is possible to look up a file descriptor by file name. In Go, this is protoregistry.FindFileByPath. FindFileByPath does not allow for multiple files with the same name; it assumes that every file name is unique.

While protoregistry.FindFileByPath is a Go API, there are equivalents in C++, Java, and other languages. None of them, so far as I know, allow for multiple files with the same name.

We would need to use the same packages for HTTP and gRPC OTLP exporters.

For me it looks like hard to fix (it would require changes in google.golang.org/grpc). I think we could only provide some OTLP Slim HTTP exporters which would not work with together (in the same process) with OTLP gRPC exporters. I do not find it feasible.

Unfortunately, I am leaning towards closing it as "won't fix" leaving the issue open.

dmathieu commented 4 months ago

Could this be fixed with a v2 of proto-go where the protobufs wouldn't be in a slim package, but always in separate packages from the grpc service?

pellared commented 4 months ago

AFAIU it won't help as as they will still have the same "file descriptors".

XSAM commented 4 months ago

Removing this out of the v1.26.0 milestone

XSAM commented 4 months ago

Could this be fixed with a v2 of proto-go where the protobufs wouldn't be in a slim package, but always in separate packages from the grpc service?

We need to separate the output folder for the grpc method struct and the grpc server, as the HTTP uses the grpc method. Not sure it is feasible for protoc.

pellared commented 4 months ago

To fix it we would need a fix in https://github.com/golang/protobuf and probably in protoc generation as well.