googleapis / api-linter

A linter for APIs defined in protocol buffers.
https://linter.aip.dev/
Apache License 2.0
599 stars 144 forks source link

Invalid memory address when using google.longrunning.Operation without operation_info #1399

Closed arjca closed 3 months ago

arjca commented 5 months ago

Hi,

One method of my Protobuf service is expected to return early, and proceed to computations after the response. Google Cloud guidelines say this method must return a google.longrunning.Operation.

When following this rule, api-linter returns an error: runtime error: invalid memory address or nil pointer dereference. There is no error if I replace google.longrunning.Operation by google.protobuf.Empty, for instance.

Stack trace with --debug is the following:

goroutine 1 [running]:
runtime/debug.Stack()
        /usr/lib/go/src/runtime/debug/stack.go:24 +0x5e
runtime/debug.PrintStack()
        /usr/lib/go/src/runtime/debug/stack.go:16 +0x13
github.com/googleapis/api-linter/lint.(*Linter).runAndRecoverFromPanics.func1()
       [...]/api-linter/lint/lint.go:127 +0x52
panic({0xa6c9a0?, 0x10a5ab0?})
        /usr/lib/go/src/runtime/panic.go:770 +0x132
github.com/jhump/protoreflect/desc.(*MessageDescriptor).GetName(...)
        [...]/go/pkg/mod/github.com/jhump/protoreflect@v1.16.0/desc/descriptor.go:398
github.com/googleapis/api-linter/rules/internal/utils.LintMethodHasMatchingResponseName(0xc0002fe140)
       [...]/api-linter/rules/internal/utils/common_lints.go:191 +0x115
github.com/googleapis/api-linter/rules/aip0136.init.func9(0xc0002fe140)
        [...]/api-linter/rules/aip0136/response_message_name.go:44 +0x27
github.com/googleapis/api-linter/lint.(*MethodRule).Lint(0x10b44c0, 0x0?)
       [...]/api-linter/lint/rule.go:213 +0x11b
github.com/googleapis/api-linter/lint.(*Linter).runAndRecoverFromPanics(0xc0001db080?, {0xc26a80?, 0x10b44c0?}, 0xc0001372c0?)
        [...]/api-linter/lint/lint.go:137 +0x85
github.com/googleapis/api-linter/lint.(*Linter).lintFileDescriptor(0xc000690210, 0xc0001990a0)
        [...]/api-linter/lint/lint.go:97 +0x1e7
github.com/googleapis/api-linter/lint.(*Linter).LintProtos(0xc000690210, {0xc00011a920, 0x1, 0x0?})
        [...]/api-linter/lint/lint.go:71 +0xbd
main.(*cli).lint(0xc000163bc0, 0xc0002f92f0, {0x112e0a0?, 0x0?, 0x0?})
        [...]/api-linter/cmd/api-linter/cli.go:195 +0xa98
main.runCLI({0xc00012e010?, 0xc00010e058?, 0x104bfe0?})
        [...]/api-linter/cmd/api-linter/main.go:46 +0x39
main.main()
        [...]/api-linter/cmd/api-linter/main.go:39 +0x49

The error is produced in rule AIP-136 when using a method from utils: using GetResponseType on the desc.MethodDescriptor of the method returning google.longrunning.Operation returns nil.

Environment details

Steps to reproduce

api-linter returns the error with the following code:

syntax = "proto3";

package arjca.prj.v1;

import "google/api/annotations.proto";
import "google/longrunning/operations.proto";

service AwesomeService {
  rpc Process(ProcessRequest) returns (google.longrunning.Operation);
}

message ProcessRequest {}

I uploaded a repository with this code and dependencies (googleapis Protobuf files): https://github.com/arjca/api-linter-bug-longrunning. To reproduce: clone it, fetch the submodule (make init) and finally run api-linter (make lint).

arjca commented 5 months ago

I investigated a bit and found something odd.

In the function GetOperationInfo of file rules/internal/utils/extension.go, there is a check for nil value at line 47. If the value is not nil, it is then cast as a pointer to lrpb.OperationInfo. However, after the cast, value is now nil!

See:

opts := m.GetMethodOptions()
if x := proto.GetExtension(opts, lrpb.E_OperationInfo); x != nil {
    fmt.Println(x == nil) // => false
    fmt.Println(x.(*lrpb.OperationInfo) == nil) // => true (!)

    return x.(*lrpb.OperationInfo)
}

I got this case because I did not put the option operation_info in my method. I then modified my example:

syntax = "proto3";

package arjca.prj.v1;

import "google/api/annotations.proto";
import "google/longrunning/operations.proto";

service AwesomeService {
  rpc Process(ProcessRequest) returns (google.longrunning.Operation) {
    option (google.longrunning.operation_info) = {
      response_type: "ProcessResponse"
    };
  };
}

message ProcessRequest {
}

message ProcessResponse {
}

And it works!

Still, I think that the behavior with the missing google.longrunning.operation_info option is not desired. The error message should be more compelling, so that users understand what they have to modify. Or perhaps the linter should not crash at all, as the absence of this option should be covered by AIP-151.

noahdietz commented 3 months ago

as the absence of this option should be covered by AIP-151.

Indeed - if we cannot derive the LRO response type, we shouldn't crash, we should just exit. Thanks for the report (and for doing all of that debugging). I'll send a patch