google / starlark-go

Starlark in Go: the Starlark configuration language, implemented in Go
BSD 3-Clause "New" or "Revised" License
2.3k stars 209 forks source link

lib/proto: a Starlark package for protobuf processing #318

Closed alandonovan closed 3 years ago

alandonovan commented 3 years ago

This change defines an optional Starlark module for encoding and decoding protocol buffers (https://developers.google.com/protocol-buffers/).

It adds a dependency on google.golang.org/protobuf.

Fixes https://github.com/google/starlark-go/issues/309 Updates https://github.com/stripe/skycfg/issues/23

alandonovan commented 3 years ago

Hi Jay, this PR publishes our internal starlarkproto package. Nothing material has changed besides the build rules and the file names that appear in proto.file(...) calls.

pims commented 3 years ago

@alandonovan author of issue #309 here.

This is great! I tried to run tests on this branch and got a failing case:

~/codebin/go/src/go.starlark.net (starlarkproto) ⚽️  $ go test -mod=readonly ./...
?       go.starlark.net/cmd/starlark    [no test files]
?       go.starlark.net/internal/chunkedfile    [no test files]
ok      go.starlark.net/internal/compile    0.044s
?       go.starlark.net/internal/spell  [no test files]
--- FAIL: TestProto (0.00s)
    starlarktest.go:117: Traceback (most recent call last):
          /Users/tim/codebin/go/src/go.starlark.net/lib/proto/testdata/example.star:89:10: in <toplevel>
          /Users/tim/codebin/go/src/go.starlark.net/starlarktest/assert.star:14:14: in _eq
        Error: "id: 12345\nstatus: ASSIGNED\nreporter: \"adonovan\"\ntype: FEATURE\nnote: \"A note\"\nnote: \"Another note\"\nmetadata: {\n  key: \"greeting\"\n  value: \"hello\"\n}\n[bugtracker.Outer.ext]: 1\n[bugtracker.ext]: 2\n" != "<64 bytes>"
    proto_test.go:92: Traceback (most recent call last):
          /Users/tim/codebin/go/src/go.starlark.net/lib/proto/testdata/example.star:107:22: in <toplevel>
          <builtin>: in proto.unmarshal
        Error: proto.unmarshal: for parameter 2: got string, want bytes
FAIL
FAIL    go.starlark.net/lib/proto   0.033s
?       go.starlark.net/lib/proto/cmd/star2proto    [no test files]
?       go.starlark.net/repl    [no test files]
ok      go.starlark.net/resolve 0.031s
ok      go.starlark.net/starlark    1.276s
?       go.starlark.net/starlarkjson    [no test files]
ok      go.starlark.net/starlarkstruct  0.030s
?       go.starlark.net/starlarktest    [no test files]
ok      go.starlark.net/syntax  0.023s
FAIL

The issue comes from the marshal function:

fn.Name() == "marshal"

is never true and thus the type of the return value is starlark.String and not Bytes. My understanding is that fn.Name() == "proto.marshal"

Applying the following diff:

diff --git a/lib/proto/proto.go b/lib/proto/proto.go
index fec3365..f038eb0 100644
--- a/lib/proto/proto.go
+++ b/lib/proto/proto.go
@@ -225,7 +225,7 @@ func marshal(_ *starlark.Thread, fn *starlark.Builtin, args starlark.Tuple, kwar
        if err := starlark.UnpackPositionalArgs(fn.Name(), args, kwargs, 1, &m); err != nil {
                return nil, err
        }
-       if fn.Name() == "marshal" {
+       if fn.Name() == "proto.marshal" {
                data, err := proto.Marshal(m.Message())
                if err != nil {
                        return nil, fmt.Errorf("%s: %v", fn.Name(), err)

makes all tests pass:

go test -mod=readonly ./...
?       go.starlark.net/cmd/starlark    [no test files]
?       go.starlark.net/internal/chunkedfile    [no test files]
ok      go.starlark.net/internal/compile    (cached)
?       go.starlark.net/internal/spell  [no test files]
ok      go.starlark.net/lib/proto   0.028s
?       go.starlark.net/lib/proto/cmd/star2proto    [no test files]
?       go.starlark.net/repl    [no test files]
ok      go.starlark.net/resolve (cached)
ok      go.starlark.net/starlark    (cached)
?       go.starlark.net/starlarkjson    [no test files]
ok      go.starlark.net/starlarkstruct  (cached)
?       go.starlark.net/starlarktest    [no test files]
ok      go.starlark.net/syntax  (cached)

Since this is a branch that is still in review, submitting a PR didn't feel right, hence this comment.

Related: would you be open to a PR to use GitHub actions to run tests on each PR?

alandonovan commented 3 years ago

Thanks for the bug report, and diagnosis and fix. I've updated my PR to use the correct name---it was a last-minute renaming. As you point out, the tests in this CL did catch the problem but I appear to have forgotten to run them.

I would welcome a patch to make them run as part of the GitHub process, but we already have a Travis CI setup https://travis-ci.org/github/google/starlark-go that should do exactly that. Perhaps I just need to expand the list of packages to test? I will investigate.

Please do give the new package a critical evaluation. It is not too late to change anything.

alandonovan commented 3 years ago

@jayconrod ping for review

pims commented 3 years ago

A short experience report on attempting to use the new lib/proto:

I first tried to use star2proto without any modifications. I created foo.proto

syntax = "proto3";

option go_package = "github.com/pims/proto-experiments/search";

message SearchRequest {
  string query = 1;
  int32 page_number = 2;
  int32 result_per_page = 3;
}
# foo.star
foo = proto.file("foo.proto")

result = foo.SearchRequest(
     query = "cute animals",
     page_number = 2,
     result_per_page = 3,
)
$ star2proto --output=json foo.star
star2proto: Traceback (most recent call last):
  foo.star:2:17: in <toplevel>
  <builtin>: in proto.file
Error: proto: not found

I guess I need to use the --descriptors flag. A quick google search and I now have a descriptor set for foo.proto:

 protoc --descriptor_set_out=foo.fds foo.proto
$ star2proto --descriptors=foo.fds --output=json foo.star
{
  "query": "cute animals",
  "pageNumber": 2,
  "resultPerPage": 3
}

Off to a good start 🎉

Reading through the code in proto_test.go, I realize that if I import a go package containing a foo.pb.go and foo.proto, and copy the package directory where my foo.star script currently resides, I can skip the --descriptors flag and get the same output:

--- a/lib/proto/cmd/star2proto/star2proto.go
+++ b/lib/proto/cmd/star2proto/star2proto.go
@@ -18,6 +18,7 @@ import (
        "os"
        "strings"

@@ -29,6 +30,13 @@ import (
        "google.golang.org/protobuf/reflect/protoreflect"
        "google.golang.org/protobuf/reflect/protoregistry"
        "google.golang.org/protobuf/types/descriptorpb"
+
+       _ "gitHub.com/pims/proto-experiments/search"
$ star2proto --output=json foo.star
{
  "query": "cute animals",
  "pageNumber": 2,
  "resultPerPage": 3
}

At this point, I get really excited about all the tooling I can build around Envoy/Kubernetes, inspired by http://github.com/stripe/skycfg/

And this is where I run into some frustrating issues. Let's import a few kubernetes-api packages:

+
+       _ "k8s.io/api/apps/v1"
+       _ "k8s.io/api/core/v1"
+       _ "k8s.io/apimachinery/pkg/apis/meta/v1"
+       _ "k8s.io/apimachinery/pkg/runtime"
+       _ "k8s.io/apimachinery/pkg/runtime/schema"
+       _ "k8s.io/apimachinery/pkg/util/intstr"

Let's clone https://github.com/kubernetes/api to have all *.proto files next to my foo.star script, which I updated to contain the following:

core_v1 = proto.file("kubernetes-api/api/core/v1/generated.proto")

result = core_v1.AWSElasticBlockStoreVolumeSource(
     volumeID="10",
)
├── foo.fds
├── foo.pb.go
├── foo.proto
├── foo.star
└── kubernetes-api
    ├── core
    │   └── v1
    │       └── generated.proto
$ star2proto --output=json foo.star
star2proto: Traceback (most recent call last):
  foo.star:2:21: in <toplevel>
  <builtin>: in proto.file
Error: proto: not found

Let's add some logs in star2proto.go to list the protobuf packages auto-registered after being imported:

@@ -59,6 +67,11 @@ func main() {
        // (very few in star2proto, e.g. descriptorpb itself).
        pool := protoregistry.GlobalFiles

+       pool.RangeFiles(func(fd protoreflect.FileDescriptor) bool {
+               fmt.Println(fd.FullName())
+               return true
+       })

Let's run the same script and see what gets printed:

$ star2proto --output=json foo.star
google.protobuf
❓ why am I not seeing the k8s packages here?
star2proto: Traceback (most recent call last):
  foo.star:2:21: in <toplevel>
  <builtin>: in proto.file
Error: proto: not found

and this is where I'm currently stuck. The kubernetes *.proto files have a larger set of dependencies/imports, so it's possible that I'm missing a _ "k8s.io/missing-package-name/here" import in star2proto.go.

Next step is to regenerate the k8s-api from the proto files, to see if it's a protobuf version mismatch.

On the positive side: syntax = "proto3"; works and oneOf too 😃