protocolbuffers / protobuf

Protocol Buffers - Google's data interchange format
http://protobuf.dev
Other
65.92k stars 15.53k forks source link

No way to use custom options in Ruby #1198

Closed kailan closed 1 year ago

kailan commented 8 years ago

I created a custom option extension that I use in my proto files, like this:

service Fishtank {
  rpc GetFish(FishQuery) returns (Fish) {
    option (protapi.http).get = "/fish/{id}";
  }
}

It doesn't seem that once it is compiled into Ruby code, I can access the value of the protapi.http option. I believe this can be done in other languages by accessing the descriptor.

Am I missing something, or is this not possible right now?

haberman commented 8 years ago

You are correct that Ruby does not yet support custom options. Hopefully we can add this before too long.

kailan commented 8 years ago

@haberman Awesome! It'll really help out a project of mine.

Any way I can help out with implementing this?

skipperguy12 commented 8 years ago

This would be helpful for me as well

haberman commented 8 years ago

This is becoming a more and more common request. I have been thinking through how the implementation ought to work on this. It's slightly tricky to modify the DSL for this, because custom options let you declare the option and use it both in the same file. This is a challenge to accommodate in a clean way.

kailan commented 8 years ago

Any progress on this? @haberman

haberman commented 8 years ago

Ok there are a couple parts to making this work.

First of all we need to support proto2 field presence. Right now the code generator fails immediately if it sees a file with syntax="proto2";. Step 1 would be a PR that supports proto2 files that don't contain any extensions or custom options. Proto2 "optional" fields should be supported by adding hasbits to the in-memory representation to indicate whether a field is present or not. I'm happy to explain this in more detail for anyone who wants to implement this.

Secondly we need to support proto2 extensions. This means extending the DSL with something like add_extension and making it so you can read and write extension fields of extended messages with something like msg[my_extension] = 5.

Finally we need a way of extending the DSL to support custom options. We can talk more about this part once #1 and #2 are done.

I'd be happy to see pull requests for any/all of these, and I'd be happy to help guide people who want to implement this to the right places in the code.

kailan commented 8 years ago

I'd love to work on this but I have one question: Why would proto2 need to be supported? Surely implementing this for proto3 would be enough.

haberman commented 8 years ago

Custom options are defined as extensions of messages descriptor.proto, which is a syntax="proto2"; file. Extensions don't exist at all in proto3.

So even if your messages are only proto3, we need proto2 support to be able to represent the options themselves.

kailan commented 8 years ago

Ah, makes sense. I'll give it a shot later on :+1:

nerdrew commented 7 years ago

Any status updates on proto2 support?

djudd-stripe commented 7 years ago

Any progress since last year? I might have a few days to push this forward.

nerdrew commented 7 years ago

Funny. We were thinking of looking at this soon too. cc @zanker

zanker commented 7 years ago

Hah, @djudd-stripe I was actually talking to @nerdrew about this last week and we assumed that Stripe didn't use protobufs. Are you using the "protobuf" gem currently?

I was able to hack something to support defaults in a few days, so I don't think it's too hard to at least get basic protobuf2 support at least. It looks like most of the work is building out the Ruby <-> C bridges, as UPB handles everything already.

I'm still working on proving this out internally to make sure we can feasibly swap to this, but right now we're interested in adding at least basic synta2 support.

@haberman it sounds like on your end, you would be interested in merging syntax2 support to the Ruby gem, as long as it meets code standards and is done properly?

djudd-stripe commented 7 years ago

Not a lot of Protobuf at Stripe right now but we are exploring adding more. We're using this code so far but are interested in custom annotations.

djudd-stripe commented 7 years ago

@zanker it sounds like you have Proto2 support work in-progress? Anything more useful I can do than wait for you to get it merged, then maybe add custom option support on top (unless you've included that already)?

zanker commented 7 years ago

Thanks for the offer, I don't think there's anything right now. I was able to get a more proper one (using hasbits) working, I need a few days to sort it out and clean/test things and I'll submit a PR.

I'm just starting with the syntax2/default parts initially. If you wanted to add custom option support on top of that then 👍 . Will end up doing it otherwise.

zanker commented 7 years ago

@djudd-stripe This is still very much WIP (hence the random debug statements), but for reference: https://github.com/google/protobuf/compare/master...zanker:zanker/syntax2 is what things are looking like. I don't think wiring in extensions would be too hard overall.

pcj commented 6 years ago

Any progress on this? AFAIK there is no way to work with descriptor.proto in ruby.

syntax = "proto3";
package foo;
import "google/protobuf/descriptor.proto";

:cry:

zanker commented 6 years ago

Sorry! I left Square back in October and the team who was going to take it over got reshuffled so they're not going to have time to pick it back up.

The PR is at https://github.com/google/protobuf/pull/3606, and all the proto side code is pretty solid from the testing I did. The blocker was the proto syntax, which wasn't narrowed down until after I had already left.

If you want to pick up that part of the PR, I'd be happy to walk you through on whatever part of the PR isn't clear.

pechorin commented 5 years ago

Is where any plans to implement custom options in ruby? Custom options looks great for expressions some meta logic and it is so sad what ruby does not include this abilitites.

My primary interest is extending enum options.

ebenoist commented 4 years ago

It looks like there is support in upb.c. What is the thinking around what the syntax would look like? If it's straightforward I can put up a PR.

pechorin commented 4 years ago

Custom options currently cannot be compiled via protoc2, but can be compiled via protoc3 but options will be skipped in compiled ruby pb file. So it will be cool to provide meta options like this in ruby output file:

syntax = "proto2";

import "google/protobuf/descriptor.proto";

extend google.protobuf.EnumValueOptions {
   optional string name = 1001;
}
message Operation {

  enum ClassName {
    OPERATION = 0 [(name) = 'Operation'];
    PUSH_OPERATION = 1  [(name) = 'PushOperation'];
  }
}

So, you ask about how it will be implemented in ruby generated classes? Am i understand you right?

ebenoist commented 4 years ago

@pechorin Thats correct, just looking for guidance on the Ruby syntax/methods for the generated classes. It seems like upb.c already has support for reading out the options, they would just need to be exposed.

krak3n commented 3 years ago

Any news on this? I also wanted to add options to an enum and it works perfectly for go but not ruby 😞

syntax = "proto3";

package some.package;

import "google/protobuf/descriptor.proto";

extend google.protobuf.EnumOptions {
  optional int64 default_http_status_code = 500;
}

extend google.protobuf.EnumValueOptions {
  optional int64 http_status_code = 501;
}

message Error {
  enum Code {
      option (default_http_status_code) = 400;

      CODE_UNSPECIFIED = 0 [(http_status_code) = 500];
      CODE_INVALID = 1;
      CODE_EXPIRED = 2;
      CODE_NOT_FOUND = 3 [(http_status_code) = 404];
  }

  Code code = 1;
}
[libprotobuf WARNING ../../third_party/protobuf/src/google/protobuf/compiler/ruby/ruby_generator.cc:512] Omitting proto2 dependency 'google/protobuf/descriptor.proto' from proto3 output file 'banked/consent/v1/errors_pb.rb' because we don't support proto2 and no proto2 types from that file are being used.
i0jupiter commented 3 years ago

Hello, can we please get an update on plans for this? We, too, have a heterogenous stack with Go, Python, and Ruby and are blocked from using this feature because of this issue.

// metadata3.proto
syntax = "proto3";

import "google/protobuf/descriptor.proto";

package metas3;

message Metadata3 {
  optional string field1 = 1;
  optional string field2 = 2;
  optional string field3 = 3;
}

extend google.protobuf.MessageOptions {
  Metadata3 metadata3 = 50000;
}
// event.proto
syntax = "proto3";

import "metadata3.proto";

package events;

message Event {
  string id = 1;
  string name = 2;
  string someData = 3;
  option (metas3.metadata3).field1 = "placeholder1";
  option (metas3.metadata3).field2 = "placeholder2";
}
// No extensions in generated code
# Generated by the protocol buffer compiler.  DO NOT EDIT!
# source: metadata3.proto

require 'google/protobuf/descriptor_pb'
require 'google/protobuf'

Google::Protobuf::DescriptorPool.generated_pool.build do
  add_file("metadata3.proto", :syntax => :proto3) do
    add_message "metas3.Metadata3" do
      proto3_optional :field1, :string, 1
      proto3_optional :field2, :string, 2
      proto3_optional :field3, :string, 3
    end
  end
end

module Metas3
  Metadata3 = ::Google::Protobuf::DescriptorPool.generated_pool.lookup("metas3.Metadata3").msgclass
end
require_relative 'protos_generated/event_pb'

include Events

class MetadataTest
  def to_event
    Event.new(
      id: '1',
      name: 'name',
      someData: 'someData'
    )
  end

  def verify
    to_event = self.to_event
    binary_encoding = Event.encode(to_event)
    binary_decoding = Event.decode(binary_encoding)
    raise "Binary encoding error" unless binary_decoding == to_event
  end
end

Running SerDe:

bundle exec ruby -r "./metadata_test.rb" -e "MatadataTest.new.verify"

Traceback (most recent call last):
...
kernel_require.rb:54:in `require': cannot load such file -- google/protobuf/descriptor_pb (LoadError)
...
google-protobuf-3.18.1-universal-darwin/lib/google/protobuf/descriptor_pb.rb:5:in `<top (required)>': uninitialized constant Google (NameError)

Edit: What confuses me, though, is that this error doesn't exist in bundle console!

bundle console
2.5.5 :001 > load 'metadata_test.rb'
 => true
2.5.5 :002 > MetadataTest.new.verify
 => nil
2.5.5 :003 >
i0jupiter commented 3 years ago

Also, is there any other mechanism in proto3 that we can use to get similar functionality for the time being?

i0jupiter commented 3 years ago

So I have been playing with this more and found that the code generated by protobuf/releases/download/v3.17.3/protoc-3.17.3-linux-x86_64.zip is different from protobuf/releases/download/v3.17.3/protoc-3.17.3-osx-x86_64.zip!

The linux one doesn't have the require 'google/protobuf/descriptor_pb' line, so the above error doesn't exist. Tracking this down has been an incredibly frustrating experience—I hope you choose to do something about it.

This also means that, while extensions are still unavailable in Ruby, at least proto definitions with extensions can be used with Ruby in a multi-lang stack.

haberman commented 3 years ago

I'm sorry this has been frustrating to track down.

I was not able to reproduce your report of getting different output from protoc-3.17.3-linux-x86_64.zip and protoc-3.17.3-osx-x86_64.zip. I tried them both just now and did not get the require 'google/protobuf/descriptor_pb' line in either.

If you are getting require 'google/protobuf/descriptor_pb' I think you must be using a build of protoc that includes https://github.com/protocolbuffers/protobuf/pull/9003, which was released in https://github.com/protocolbuffers/protobuf/releases/tag/v3.18.1.

I think the root cause of the issue you encountered is a bug in https://github.com/protocolbuffers/protobuf/pull/9003. Specifically, the generated file descriptor_pb.rbis throwing an error. This error occurs because google/protobuf has not yet been required:

google-protobuf-3.18.1-universal-darwin/lib/google/protobuf/descriptor_pb.rb:5:in `<top (required)>': uninitialized constant Google (NameError)

We need to change the code generator such that descriptor_pb.rb starts with the line require 'google/protobuf'. A short-term workaround would be to manually require 'google/protobuf' before importing any other foo_pb generated file.

All of this is unrelated to whether Ruby supports custom options or not (just noting this for future readers who come across this bug).

Custom options for Ruby are now within reach; the upb library which powers Ruby supports extensions as of https://github.com/protocolbuffers/upb/pull/421 and custom options as of https://github.com/protocolbuffers/upb/pull/426. Full custom option support for Ruby should be possible now.

i0jupiter commented 3 years ago

Hello @haberman, want to confirm that I was using 3.17.3 linux but 3.18.1 osx, so the bug from #9003 makes sense (and is unfortunate). 3.17.3 is what our official schema repo is using, but I'd recently upgraded my personal setup to the latest protoc. Thank you for pointing this out!

I know that this is a public forum, but would you be comfortable predicting when extensions might work in Ruby? :) Also happy to chat offline.

I have another question about how extensions themselves work. Why do we have to provide a placeholder value for every option in the message using options (example)? This beats the benefit of encapsulating all the custom options in a single message. If the options message adds a new field, we'd need to update every message using the options before they can use it. This is very counterintuitive. Would love to know the design principle behind this.

haberman commented 3 years ago

I opened a separate issue for the problem with google/protobuf/descriptor_pb: https://github.com/protocolbuffers/protobuf/issues/9122

This will be fixed in https://github.com/protocolbuffers/protobuf/pull/9121

I misspoke; this was not a bug in https://github.com/protocolbuffers/protobuf/pull/9003, this was a bug in my previous PR https://github.com/protocolbuffers/protobuf/pull/8850

@i0jupiter I'll address your other questions momentarily.

haberman commented 3 years ago

I know that this is a public forum, but would you be comfortable predicting when extensions might work in Ruby? :)

At this point all the underlying technical pieces are in place, we just need to design the API and implement it.

The team is working on other priorities at the moment, and I'm not sure when this will be prioritized. We would welcome a contribution, but the API design could take several rounds of back and forth before we arrive at something we're comfortable with.

Why do we have to provide a placeholder value for every option in the message using options (example)?

I'm not following why you are needing to do this. What is wrong with just not setting the option when you don't have a value for it?

message Event {
  string id = 1;
  string name = 2;
  string someData = 3;
  option (metas3.metadata3).field1 = "some_real_value";
  // not set because we have no value for it: 
  // option (metas3.metadata3).field2 = "placeholder2";
}

By the way, you can also use object syntax for this:

message Event {
  string id = 1;
  string name = 2;
  string someData = 3;
  option (metas3.metadata3) = {field1: "some_real_value", field2: "another_real_value"};
}
i0jupiter commented 3 years ago

Thanks for the follow-up, @haberman.

Re. setting the values of the options, I'd just assumed that values had to be provided at the usage site—that's how I saw in all the examples in the guide but never got a chance to play with cuz Ruby. Thanks for the tip on the object syntax too! Recommend including both of these in the guide itself :)

I will try to play with Go and Python too for the actual experience.

Understand the contributions part, but I, personally, already have my finger in too many pies. That said, our org is a very heavy user of both schemas and Ruby and this is posing real limitations for us (happy to provide examples offline). Is there anything I can do to add more weight to the usefulness of this feature to help your team prioritize it higher?

seanhuang-stripe commented 2 years ago

@haberman Hey! I've been following along this thread for a while and I might have a personal interest in taking this up, especially given that you mentioned the underlying foundations are ready. What is the best way to receive some guidance from you to get started on this?

haberman commented 2 years ago

@seanhuang-stripe thank you for the interest!

There are two main pieces to supporting custom options:

  1. Extending the generated code to embed a serialized descriptor, so that all custom options are plumbed to the upb layer. The simplest solution to this would be to simply have all protos use GenerateBinaryDescriptor() instead of GenerateDslDescriptor(), eg. https://github.com/protocolbuffers/protobuf/blob/125dfd343e823e0b16209520d0c6d2c63f713bb2/src/google/protobuf/compiler/ruby/ruby_generator.cc#L525-L531 However that will also make the generated code less readable, since it would no longer use the DSL, which users may object to. I think we could accommodate that readability concern by generating separate RBI files for the generated code. This could get the best of both worlds: small, fast-loading _pb.rb files that preserve all custom options, and larger, readable _pb.rbi files that give IDEs, type checkers, and human readers something that represents the fully-expanded interface. cc @JasonLunn who I've discussed this idea with previously.
  2. Extending the runtime to surface the custom options. This would be reasonably straightforward I think, you could adapt the Python implementation which currently supports this eg. https://github.com/protocolbuffers/upb/blob/0e8772fc20e5a0a2fa1f326c79d494374871ef94/python/descriptor.c#L103-L106 There is a question of whether we should return a new instance each time or whether we should return a shared instance that is frozen, we can figure out those details once the basic implementation is in place.

I would be very happy to review a design doc and PRs for this feature.

JasonLunn commented 2 years ago

Marking this as blocked on #9495. Until we can make the generated code even uglier without degrading human readability (by adopting RBS with will allow us to separate those concerns), we don't want to do the work in point 1 above.

haberman commented 1 year ago

This is unblocked now that https://github.com/protocolbuffers/protobuf/issues/10492 is fixed. More info: https://protobuf.dev/news/2023-04-20/

jsteinberg commented 1 year ago

@haberman I am very interested in Custom Options (and descriptor options) for Ruby. I am even willing to help implement them. I had some time last week and started looking into it.

Note: I haven't written any c code in a quite a while. So my proposed initial solution will be focused mostly on the top level c class in the ruby library defs.c.

Adding support for descriptor options seems relatively strait forward given the helpers in the upb library.

I had a question on what sort of interface would be acceptable for options? There isn't currently a ruby/ubp equivalent for the various options classes, so exposing them from the descriptor classes seems like the easiest lift.

# ruby
d = ::Google::Protobuf::DescriptorPool.generated_pool.lookup("the.greatest.and.best.proto.in.the.world")

d.deprecated?
=> true | false

This interface seems to be a bit different from some of the other languages, but feels more ruby-like to me.

# python-ish
d.getOptions().deprecated

Some of the Options have quite a few fields on them (Looking at you, FileOptions). So if we went with the top level descriptor methods, it could greatly expand the Ruby interface on the descriptors. Given that the underlying Option classes are relatively stable, I personally dont think that is a big concern. But I do think matching the interfaces of the other languages is important.

So, before I go too far down any path, I had a couple of questions:

1. Which interface to options do you think makes the most sense?

2. If descriptor.options.deprecated? is the preferred route: What would be the correct way to add intermediate Ruby wrapper classes for the google_protobuf_XxxxxxOptions classes?

ubp has nicely defined structs in the reflection API for the currently exposed Ruby classes (ex: Google::Protobuf::Descriptor -> upb_MessageDef, full list). I have personally found these easy to work with when attempting to extend the Ruby interface.

DescriptorProto -> Descriptor doesn't work as well with the more inner classes (MessageOptions -> ?????)

3. How do you actually expose custom options? I had trouble figuring this out. It looks like most of the "fields" are looked up by passing a upb_MiniTableField into the various _upb_MiniTable_GetXxxxxField methods. I am a little lost on how that should work with custom options.

There are also the methods FindByFieldNumber and GetFieldByIndex.

Either of these approaches would require returning to Ruby some sort of Option definition or potentially the raw value without type information.

Thanks in advance for any guidance and all the work to make this possible.

haberman commented 1 year ago

Hi @jsteinberg, thanks for the interest!

Fortunately, I think there is a solution that would be far easier than what you are describing.

Recall that options are just protos. The Ruby Protobuf library already is capable of defining proto message classes without requiring that every field is bound into C manually.

You can already construct a FieldOptions class in Ruby like so:

require 'google/protobuf/descriptor_pb'

field_options = Google::Protobuf::FieldOptions.new

All we need is a way of doing something like this:

# We could add this maybe in ruby/lib/google/protobuf.rb
class Descriptor
  if not @options
    @options = Google::Protobuf::MessageOptions.decode(serialized_options)
    @options.freeze   # We actually need to freeze recursively.
  end
  @options
end

# Now a user can write something like this:
field = ::Google::Protobuf::DescriptorPool.generated_pool.lookup("pkg.Foo")
field.options.deprecated

The only trick is how to get the serialized_options above. In C, it would look something like this:

/*
 * call-seq:
 *     Descriptor.serialized_options => options
 *
 * Returns a binary string containing the serialized options for this message.
 */
static VALUE Descriptor_serialized_options(VALUE _self) {
  Descriptor* self = ruby_to_Descriptor(_self);
  const google_protobuf_MessageOptions* opts = upb_MessageDef_Options(self->msgdef);
  upb_Arena* arena = upb_Arena_New();
  size_t size;
  char* serialized = google_protobuf_MessageOptions_serialize(opts, arena, &size);
  if (serialized) {
    VALUE ret = rb_str_new(serialized, size);
    rb_enc_associate(ret, rb_ascii8bit_encoding());
    upb_Arena_Free(arena);
    return ret;
  } else {
    upb_Arena_Free(arena);
    rb_raise(rb_eRuntimeError, "Error encoding");
  }
}

That's all it should take! You can repeat this for each kind of option.

jsteinberg commented 1 year ago

Thank you for the quick response. Also, that way is much simpler. I should have some time this week to finish this up and get a PR submitted.

I'm still struggling with reading the custom options. I'm hoping you can point me in the right direction.

Given the following proto, I am able to get deprecated as you showed.

extend google.protobuf.MessageOptions {
  string custom_message_opt = 7739036;
}

message Event {
  option deprecated = true;
  option (custom_message_opt) = "CUSTOM MESSAGE OPTION";

  string id = 1;
}
d = ::Google::Protobuf::DescriptorPool.generated_pool.lookup("Event")

d.serialized_options
=> "\x18\x01\xE2\xE9\xC2\x1D\x15CUSTOM MESSAGE OPTION" 

# I'll add to the ruby interface like you suggested.
options = Google::Protobuf::MessageOptions.decode(d.serialized_options)
=> <Google::Protobuf::MessageOptions: deprecated: true, uninterpreted_option: []>

options.deprecated
=> true

options.custom_message_opt
NoMethodError: undefined method `custom_message_opt' for <Google::Protobuf::MessageOptions: deprecated: true, uninterpreted_option: []>:Google::Protobuf::MessageOptions

options.class.encode(options)
=> "\x18\x01\xE2\xE9\xC2\x1D\x15CUSTOM MESSAGE OPTION"

Also, I'm happy to PR your suggested implementation WITHOUT custom options to isolate the diff if that is easier. deprecated by itself would be nice for some of our workflows.

Thanks again

haberman commented 1 year ago

That is a good question. It looks like we haven't added Ruby APIs to support extensions yet. But the good news is, that should be pretty easy to do also at this point.

We should add the following code to DescriptorPool_lookup():

  fielddef = upb_DefPool_FindExtensionByName(self->symtab, name_str);
  if (fielddef) {
    return get_fielddef_obj(_self, enumdef);
  }

Then you could write something like this:

ext = ::Google::Protobuf::DescriptorPool.generated_pool.lookup("custom_message_opt")
ext.get(options)

We might also want to make Message.[] accept a FieldDescriptor, so you could get slightly nicer syntax:

options[ext]

The final thing we would want to do is add these lookup calls to the generated code, so a user could just reference the extension by name, instead of having to call lookup first.

But all of that should probably go into a separate PR, which is to add extensions support to Ruby.

By the way, both PRs will need to go through a round of API review internally. But I expect that the APIs I proposed above should be relatively uncontroversial.