stripe / skycfg

Skycfg is an extension library for the Starlark language that adds support for constructing Protocol Buffer messages.
Apache License 2.0
643 stars 54 forks source link

another starlark / proto binding #23

Open alandonovan opened 5 years ago

alandonovan commented 5 years ago

Hi, author of go.starlark.net here. Just wanted to let you know that I have an implementation of Starlark support for protocol messages that we use inside Google. It presents a slightly different API that what you have in skycfg. See the example below for comparison.

I haven't open-sourced it yet because it depends on Google's internal implementation of dynamic message and message reflection. The Go protobuf team is hard at work adding support for these features to the open-source proto.Message API (https://github.com/golang/protobuf/issues/364). Once that's done I will port our starlarkproto package to use it and then make it available.

proto:

syntax = "proto2";

package bugtracker;

enum Status {
  NEW      = 0;
  ASSIGNED = 1;
  ACCEPTED = 2;
  FIXED    = 3;
  WONTFIX  = 5;
}

enum Type {
  BUG     = 0;
  FEATURE = 1;
  OTHER   = 2;
}

message Issue {
  required int64 id = 1;
  optional string category = 2;
  optional Status status = 3;
  optional string reporter = 4;
  optional string assignee = 5;
  optional string verifier = 6;
  repeated string cc = 7;
  optional Type type = 8;
  optional int32 priority = 9;
  optional uint32 severity = 10;
  repeated string note = 11;
  repeated KeyValuePair metadata = 12;

  message Nested {
    optional string x = 1;
    optional Nested y = 2;
  }
  optional Nested nested = 13;
  optional float score = 14;
}

Starlark:

# Calling a message Descriptor instantiates a message.
issue = bugtracker.Issue(
    id = 12345,
    status = bugtracker.Status.ASSIGNED,
    reporter = "adonovan",
    note = ["A note", "Another note"],
    metadata = [bugtracker.KeyValuePair(key = "greeting", value = "hello")],
    ext1 = 1,
    ext2 = 2,
)

# Fields may be updated:
issue.type = bugtracker.Type.FEATURE

# Unset fields act like an immutable default value.
assert.eq(issue.nested.y.y.y.y.y.x, "")

# Calling a descriptor with a single positional argument makes a shallow copy.
issue2 = bugtracker.Issue(issue)
assert.eq(str(issue2), str(issue))
issue2.reporter = "nobody"
assert.eq(issue2.reporter, "nobody")
assert.eq(issue.reporter, "adonovan")  # unchanged

# Submessages can be expressed using dicts.
assert.eq(str(bugtracker.Issue(nested = {"x": "x"})), 'bugtracker.Issue(nested={x="x"})')

# The str function prints the entire structure.
assert.eq(str(issue), 'bugtracker.Issue(id=12345, status=Status.ASSIGNED, reporter="adonovan", type=Type.FEATURE, note=["A note", "Another note"], metadata=[bugtracker.KeyValuePair(key="greeting", value="hello")], ext1=1, ext2=2)')

# marshal, marshal_text encode a message.
data = proto.marshal(issue)
text = proto.marshal_text(issue)
assert.eq(data, "\b\xb9`\x18\x01\"\badonovan@\x01Z\x06A noteZ\fAnother noteb\x11\n\bgreeting\x12\x05hello\xa0\x06\x01\xa8\x06\x02")
assert.eq(text, '''id: 12345
status: ASSIGNED
reporter: "adonovan"
type: FEATURE
note: "A note"
note: "Another note"
metadata {
  key: "greeting"
  value: "hello"
}
[bugtracker.Outer.ext2]: 2
[bugtracker.ext1]: 1
''')

# unmarshal, unmarshal_text decode a message.
msg = proto.unmarshal(bugtracker.Issue, data)
assert.eq(str(msg), str(issue))  # messages are equal
jmillikin-stripe commented 5 years ago

Thanks! I was planning to upstream some of the Protobuf code from Skycfg into upstream once the go-protobuf reflection API landed, but it looks like that won't be necessary.

How stabilized is your API? There are a couple parts of it that are a bit awkward to work with in the context of config data, which is why Skycfg diverges. If possible, I would love to resolve the API differences in favor of a single unified starlarkproto package that both codebases could share[0].

Specifically:

[0] Skycfg would need some hooks to handle weird bits of Kubernetes' pseudo-protos.

alandonovan commented 5 years ago

I've copied the API of the package below; as you can see, it's pretty straightforward. (descpool is Google's internal descriptor pool.)

If an unset field is an immutable default value, then if issue.nested == None becomes difficult/impossible to express. I assume you have a proto.HasField(issue, "nested") or similar helper? I found users without a background in *_pb2.py generated code got very confused when I tried that approach here.

Yes, you need to use proto.has(msg, "fieldname"). Sorry I didn't show that in the example, but it's part of the Starlark module, along with {,un}marshal{,_text}.

Do you also do shallow copies when assigning to fields? The Ruby generated code does and it's caused some confusion here, so Skycfg does a bit of magic to make meta = ObjectMeta(); job = CronJob(meta=meta); meta.name = "hellocron" work.

A proto field is either a scalar, a list (RepeatedField), or a reference to a message. Assigning to a scalar field is trivial. Assigning an iterable to a repeated field causes allocation of a new RepeatedField value and a check that each element of the iterable can be assigned to T. Assigning to an element of a repeated field causes an array element update; RepeatedFields are basically mutable lists of a particular element type. Assigning to a message field x.f=y field causes x.f to alias y. Mutations to one will be observed by the other.

We allow strings to be assigned to enums, so status = bugtracker.Status.ASSIGNED and status = "ASSIGNED" both work. Does your package support this, or if not, would you be willing to have it do so?

Yes, strings, numbers and EnumValueDesciptor values can all be assigned directly to enum-typed message fields.

package starlarkproto

type  Descriptor          struct{Desc descpool.Descriptor}
method (Descriptor) Attr(name string) (starlark.Value, error)
method (Descriptor) AttrNames() []string
method (Descriptor) CallInternal(thread *starlark.Thread, args starlark.Tuple, kwargs []starlark.Tuple) (starlark.Value, error)
method (Descriptor) Field(name string) (starlark.Value, error)
...boilerplate Value methods skipped...

type  EnumDescriptor      struct{Desc descpool.EnumDescriptor}
method (EnumDescriptor) Attr(name string) (starlark.Value, error)
method (EnumDescriptor) AttrNames() []string
method (EnumDescriptor) CallInternal(_ *starlark.Thread, args starlark.Tuple, kwargs []starlark.Tuple) (starlark.Value, error)

type  EnumValueDescriptor struct{Desc descpool.EnumValueDescriptor}
method (EnumValueDescriptor) Attr(name string) (starlark.Value, error)
method (EnumValueDescriptor) AttrNames() []string
method (EnumValueDescriptor) CompareSameType(op syntax.Token, y_ starlark.Value, depth int) (bool, error)

type  FileDescriptor      struct{Desc descpool.FileDescriptor}
method (FileDescriptor) Attr(name string) (starlark.Value, error)
method (FileDescriptor) AttrNames() []string

type  Message             struct{...}
method (*Message) Attr(name string) (starlark.Value, error)
method (*Message) AttrNames() []string
method (*Message) Descriptor() descpool.Descriptor
method (*Message) Marshal() ([]byte, error)
method (*Message) MarshalText(w io.Writer) error
method (*Message) SetField(name string, v starlark.Value) error

type  RepeatedField       struct{...}
method (*RepeatedField) Index(i int) starlark.Value
method (*RepeatedField) Iterate() starlark.Iterator
method (*RepeatedField) Len() int
method (*RepeatedField) SetIndex(i int, v starlark.Value) error

func  SetPool             func(thread *starlark.Thread, pool descpool.Pool)
func  Pool                func(thread *starlark.Thread) descpool.Pool

func  Load                func(thread *starlark.Thread, name string) (starlark.StringDict, error)
func  LoadProtoModule     func() (starlark.StringDict, error)

func  Unmarshal           func(pool descpool.Pool, desc descpool.Descriptor, data []byte) (*Message, error)
func  UnmarshalText       func(pool descpool.Pool, desc descpool.Descriptor, data []byte) (*Message, error)
TassoKarkanisAGMT commented 4 years ago

@alandonovan, any update on this issue? I see that work seems to be progressing on golang/protobuf#364. Is everything on track for open-sourcing your starlark/protobuf bindings? Also, I assume that those bindings will be dynamic (ie. based on descriptors, not on Golang reflection against the generated protobuf code). Could you please confirm that this is correct? Thanks!

emcfarlane commented 4 years ago

I've started to look at applying these API bindings here: https://github.com/afking/starlarkproto . Only have basic message construction but will update to support the full API soon.

fenollp commented 3 years ago

@emcfarlane Thanks! Do you use it somewhere?

@alandonovan Any plans/chance that such bindings be available for Bazel? I'd prefer to write Starlark code rather than pbtxt :D

alandonovan commented 3 years ago

FWIW, the starlarkproto package was committed to go.starlark.net back in November: https://github.com/google/starlark-go/pull/318

If there's something that it should support, but does not, please open an issue at go.starlark.net.