Closed kailan closed 1 year ago
You are correct that Ruby does not yet support custom options. Hopefully we can add this before too long.
@haberman Awesome! It'll really help out a project of mine.
Any way I can help out with implementing this?
This would be helpful for me as well
This is becoming a more and more common request. I have been thinking through how the implementation ought to work on this. It's slightly tricky to modify the DSL for this, because custom options let you declare the option and use it both in the same file. This is a challenge to accommodate in a clean way.
Any progress on this? @haberman
Ok there are a couple parts to making this work.
First of all we need to support proto2 field presence. Right now the code generator fails immediately if it sees a file with syntax="proto2";
. Step 1 would be a PR that supports proto2 files that don't contain any extensions or custom options. Proto2 "optional" fields should be supported by adding hasbits to the in-memory representation to indicate whether a field is present or not. I'm happy to explain this in more detail for anyone who wants to implement this.
Secondly we need to support proto2 extensions. This means extending the DSL with something like add_extension
and making it so you can read and write extension fields of extended messages with something like msg[my_extension] = 5
.
Finally we need a way of extending the DSL to support custom options. We can talk more about this part once #1 and #2 are done.
I'd be happy to see pull requests for any/all of these, and I'd be happy to help guide people who want to implement this to the right places in the code.
I'd love to work on this but I have one question: Why would proto2 need to be supported? Surely implementing this for proto3 would be enough.
Custom options are defined as extensions of messages descriptor.proto
, which is a syntax="proto2";
file. Extensions don't exist at all in proto3.
So even if your messages are only proto3, we need proto2 support to be able to represent the options themselves.
Ah, makes sense. I'll give it a shot later on :+1:
Any status updates on proto2 support?
Any progress since last year? I might have a few days to push this forward.
Funny. We were thinking of looking at this soon too. cc @zanker
Hah, @djudd-stripe I was actually talking to @nerdrew about this last week and we assumed that Stripe didn't use protobufs. Are you using the "protobuf" gem currently?
I was able to hack something to support defaults in a few days, so I don't think it's too hard to at least get basic protobuf2 support at least. It looks like most of the work is building out the Ruby <-> C bridges, as UPB handles everything already.
I'm still working on proving this out internally to make sure we can feasibly swap to this, but right now we're interested in adding at least basic synta2 support.
@haberman it sounds like on your end, you would be interested in merging syntax2 support to the Ruby gem, as long as it meets code standards and is done properly?
Not a lot of Protobuf at Stripe right now but we are exploring adding more. We're using this code so far but are interested in custom annotations.
@zanker it sounds like you have Proto2 support work in-progress? Anything more useful I can do than wait for you to get it merged, then maybe add custom option support on top (unless you've included that already)?
Thanks for the offer, I don't think there's anything right now. I was able to get a more proper one (using hasbits) working, I need a few days to sort it out and clean/test things and I'll submit a PR.
I'm just starting with the syntax2/default parts initially. If you wanted to add custom option support on top of that then 👍 . Will end up doing it otherwise.
@djudd-stripe This is still very much WIP (hence the random debug statements), but for reference: https://github.com/google/protobuf/compare/master...zanker:zanker/syntax2 is what things are looking like. I don't think wiring in extensions would be too hard overall.
Any progress on this? AFAIK there is no way to work with descriptor.proto
in ruby.
syntax = "proto3";
package foo;
import "google/protobuf/descriptor.proto";
:cry:
Sorry! I left Square back in October and the team who was going to take it over got reshuffled so they're not going to have time to pick it back up.
The PR is at https://github.com/google/protobuf/pull/3606, and all the proto side code is pretty solid from the testing I did. The blocker was the proto syntax, which wasn't narrowed down until after I had already left.
If you want to pick up that part of the PR, I'd be happy to walk you through on whatever part of the PR isn't clear.
Is where any plans to implement custom options in ruby? Custom options looks great for expressions some meta logic and it is so sad what ruby does not include this abilitites.
My primary interest is extending enum options.
It looks like there is support in upb.c
. What is the thinking around what the syntax would look like? If it's straightforward I can put up a PR.
Custom options currently cannot be compiled via protoc2, but can be compiled via protoc3 but options will be skipped in compiled ruby pb file. So it will be cool to provide meta options like this in ruby output file:
syntax = "proto2";
import "google/protobuf/descriptor.proto";
extend google.protobuf.EnumValueOptions {
optional string name = 1001;
}
message Operation {
enum ClassName {
OPERATION = 0 [(name) = 'Operation'];
PUSH_OPERATION = 1 [(name) = 'PushOperation'];
}
}
So, you ask about how it will be implemented in ruby generated classes? Am i understand you right?
@pechorin Thats correct, just looking for guidance on the Ruby syntax/methods for the generated classes. It seems like upb.c
already has support for reading out the options, they would just need to be exposed.
Any news on this? I also wanted to add options to an enum and it works perfectly for go but not ruby 😞
syntax = "proto3";
package some.package;
import "google/protobuf/descriptor.proto";
extend google.protobuf.EnumOptions {
optional int64 default_http_status_code = 500;
}
extend google.protobuf.EnumValueOptions {
optional int64 http_status_code = 501;
}
message Error {
enum Code {
option (default_http_status_code) = 400;
CODE_UNSPECIFIED = 0 [(http_status_code) = 500];
CODE_INVALID = 1;
CODE_EXPIRED = 2;
CODE_NOT_FOUND = 3 [(http_status_code) = 404];
}
Code code = 1;
}
[libprotobuf WARNING ../../third_party/protobuf/src/google/protobuf/compiler/ruby/ruby_generator.cc:512] Omitting proto2 dependency 'google/protobuf/descriptor.proto' from proto3 output file 'banked/consent/v1/errors_pb.rb' because we don't support proto2 and no proto2 types from that file are being used.
Hello, can we please get an update on plans for this? We, too, have a heterogenous stack with Go, Python, and Ruby and are blocked from using this feature because of this issue.
// metadata3.proto
syntax = "proto3";
import "google/protobuf/descriptor.proto";
package metas3;
message Metadata3 {
optional string field1 = 1;
optional string field2 = 2;
optional string field3 = 3;
}
extend google.protobuf.MessageOptions {
Metadata3 metadata3 = 50000;
}
// event.proto
syntax = "proto3";
import "metadata3.proto";
package events;
message Event {
string id = 1;
string name = 2;
string someData = 3;
option (metas3.metadata3).field1 = "placeholder1";
option (metas3.metadata3).field2 = "placeholder2";
}
// No extensions in generated code
# Generated by the protocol buffer compiler. DO NOT EDIT!
# source: metadata3.proto
require 'google/protobuf/descriptor_pb'
require 'google/protobuf'
Google::Protobuf::DescriptorPool.generated_pool.build do
add_file("metadata3.proto", :syntax => :proto3) do
add_message "metas3.Metadata3" do
proto3_optional :field1, :string, 1
proto3_optional :field2, :string, 2
proto3_optional :field3, :string, 3
end
end
end
module Metas3
Metadata3 = ::Google::Protobuf::DescriptorPool.generated_pool.lookup("metas3.Metadata3").msgclass
end
require_relative 'protos_generated/event_pb'
include Events
class MetadataTest
def to_event
Event.new(
id: '1',
name: 'name',
someData: 'someData'
)
end
def verify
to_event = self.to_event
binary_encoding = Event.encode(to_event)
binary_decoding = Event.decode(binary_encoding)
raise "Binary encoding error" unless binary_decoding == to_event
end
end
Running SerDe:
bundle exec ruby -r "./metadata_test.rb" -e "MatadataTest.new.verify"
Traceback (most recent call last):
...
kernel_require.rb:54:in `require': cannot load such file -- google/protobuf/descriptor_pb (LoadError)
...
google-protobuf-3.18.1-universal-darwin/lib/google/protobuf/descriptor_pb.rb:5:in `<top (required)>': uninitialized constant Google (NameError)
Edit: What confuses me, though, is that this error doesn't exist in bundle console
!
bundle console
2.5.5 :001 > load 'metadata_test.rb'
=> true
2.5.5 :002 > MetadataTest.new.verify
=> nil
2.5.5 :003 >
Also, is there any other mechanism in proto3 that we can use to get similar functionality for the time being?
So I have been playing with this more and found that the code generated by protobuf/releases/download/v3.17.3/protoc-3.17.3-linux-x86_64.zip
is different from protobuf/releases/download/v3.17.3/protoc-3.17.3-osx-x86_64.zip
!
The linux one doesn't have the require 'google/protobuf/descriptor_pb'
line, so the above error doesn't exist. Tracking this down has been an incredibly frustrating experience—I hope you choose to do something about it.
This also means that, while extensions are still unavailable in Ruby, at least proto definitions with extensions can be used with Ruby in a multi-lang stack.
I'm sorry this has been frustrating to track down.
I was not able to reproduce your report of getting different output from protoc-3.17.3-linux-x86_64.zip
and protoc-3.17.3-osx-x86_64.zip
. I tried them both just now and did not get the require 'google/protobuf/descriptor_pb'
line in either.
If you are getting require 'google/protobuf/descriptor_pb'
I think you must be using a build of protoc that includes https://github.com/protocolbuffers/protobuf/pull/9003, which was released in https://github.com/protocolbuffers/protobuf/releases/tag/v3.18.1.
I think the root cause of the issue you encountered is a bug in https://github.com/protocolbuffers/protobuf/pull/9003. Specifically, the generated file descriptor_pb.rb
is throwing an error. This error occurs because google/protobuf
has not yet been required:
google-protobuf-3.18.1-universal-darwin/lib/google/protobuf/descriptor_pb.rb:5:in `<top (required)>': uninitialized constant Google (NameError)
We need to change the code generator such that descriptor_pb.rb
starts with the line require 'google/protobuf'
. A short-term workaround would be to manually require 'google/protobuf' before importing any other foo_pb
generated file.
All of this is unrelated to whether Ruby supports custom options or not (just noting this for future readers who come across this bug).
Custom options for Ruby are now within reach; the upb library which powers Ruby supports extensions as of https://github.com/protocolbuffers/upb/pull/421 and custom options as of https://github.com/protocolbuffers/upb/pull/426. Full custom option support for Ruby should be possible now.
Hello @haberman, want to confirm that I was using 3.17.3 linux but 3.18.1 osx, so the bug from #9003 makes sense (and is unfortunate). 3.17.3 is what our official schema repo is using, but I'd recently upgraded my personal setup to the latest protoc. Thank you for pointing this out!
I know that this is a public forum, but would you be comfortable predicting when extensions might work in Ruby? :) Also happy to chat offline.
I have another question about how extensions themselves work. Why do we have to provide a placeholder value for every option in the message using options (example)? This beats the benefit of encapsulating all the custom options in a single message
. If the options message adds a new field, we'd need to update every message using the options before they can use it. This is very counterintuitive. Would love to know the design principle behind this.
I opened a separate issue for the problem with google/protobuf/descriptor_pb
: https://github.com/protocolbuffers/protobuf/issues/9122
This will be fixed in https://github.com/protocolbuffers/protobuf/pull/9121
I misspoke; this was not a bug in https://github.com/protocolbuffers/protobuf/pull/9003, this was a bug in my previous PR https://github.com/protocolbuffers/protobuf/pull/8850
@i0jupiter I'll address your other questions momentarily.
I know that this is a public forum, but would you be comfortable predicting when extensions might work in Ruby? :)
At this point all the underlying technical pieces are in place, we just need to design the API and implement it.
The team is working on other priorities at the moment, and I'm not sure when this will be prioritized. We would welcome a contribution, but the API design could take several rounds of back and forth before we arrive at something we're comfortable with.
Why do we have to provide a placeholder value for every option in the message using options (example)?
I'm not following why you are needing to do this. What is wrong with just not setting the option when you don't have a value for it?
message Event {
string id = 1;
string name = 2;
string someData = 3;
option (metas3.metadata3).field1 = "some_real_value";
// not set because we have no value for it:
// option (metas3.metadata3).field2 = "placeholder2";
}
By the way, you can also use object syntax for this:
message Event {
string id = 1;
string name = 2;
string someData = 3;
option (metas3.metadata3) = {field1: "some_real_value", field2: "another_real_value"};
}
Thanks for the follow-up, @haberman.
Re. setting the values of the options, I'd just assumed that values had to be provided at the usage site—that's how I saw in all the examples in the guide but never got a chance to play with cuz Ruby. Thanks for the tip on the object syntax too! Recommend including both of these in the guide itself :)
I will try to play with Go and Python too for the actual experience.
Understand the contributions part, but I, personally, already have my finger in too many pies. That said, our org is a very heavy user of both schemas and Ruby and this is posing real limitations for us (happy to provide examples offline). Is there anything I can do to add more weight to the usefulness of this feature to help your team prioritize it higher?
@haberman Hey! I've been following along this thread for a while and I might have a personal interest in taking this up, especially given that you mentioned the underlying foundations are ready. What is the best way to receive some guidance from you to get started on this?
@seanhuang-stripe thank you for the interest!
There are two main pieces to supporting custom options:
GenerateBinaryDescriptor()
instead of GenerateDslDescriptor()
, eg. https://github.com/protocolbuffers/protobuf/blob/125dfd343e823e0b16209520d0c6d2c63f713bb2/src/google/protobuf/compiler/ruby/ruby_generator.cc#L525-L531 However that will also make the generated code less readable, since it would no longer use the DSL, which users may object to. I think we could accommodate that readability concern by generating separate RBI files for the generated code. This could get the best of both worlds: small, fast-loading _pb.rb
files that preserve all custom options, and larger, readable _pb.rbi
files that give IDEs, type checkers, and human readers something that represents the fully-expanded interface. cc @JasonLunn who I've discussed this idea with previously.I would be very happy to review a design doc and PRs for this feature.
Marking this as blocked on #9495. Until we can make the generated code even uglier without degrading human readability (by adopting RBS with will allow us to separate those concerns), we don't want to do the work in point 1 above.
This is unblocked now that https://github.com/protocolbuffers/protobuf/issues/10492 is fixed. More info: https://protobuf.dev/news/2023-04-20/
@haberman I am very interested in Custom Options (and descriptor options) for Ruby. I am even willing to help implement them. I had some time last week and started looking into it.
Note: I haven't written any c code in a quite a while. So my proposed initial solution will be focused mostly on the top level c class in the ruby library defs.c
.
Adding support for descriptor options seems relatively strait forward given the helpers in the upb library.
I had a question on what sort of interface would be acceptable for options? There isn't currently a ruby/ubp equivalent for the various options classes, so exposing them from the descriptor classes seems like the easiest lift.
# ruby
d = ::Google::Protobuf::DescriptorPool.generated_pool.lookup("the.greatest.and.best.proto.in.the.world")
d.deprecated?
=> true | false
This interface seems to be a bit different from some of the other languages, but feels more ruby-like to me.
# python-ish
d.getOptions().deprecated
Some of the Options have quite a few fields on them (Looking at you, FileOptions). So if we went with the top level descriptor methods, it could greatly expand the Ruby interface on the descriptors. Given that the underlying Option classes are relatively stable, I personally dont think that is a big concern. But I do think matching the interfaces of the other languages is important.
So, before I go too far down any path, I had a couple of questions:
1. Which interface to options do you think makes the most sense?
descriptor.deprecated?
descriptor.options.deprecated?
descriptor.get_option("deprecated").value
(Some sort of Ruby representation of an Option would need to be returned)descriptor.get_option_value("deprecated")
2. If descriptor.options.deprecated?
is the preferred route: What would be the correct way to add intermediate Ruby wrapper classes for the google_protobuf_XxxxxxOptions
classes?
ubp
has nicely defined structs in the reflection API for the currently exposed Ruby classes (ex: Google::Protobuf::Descriptor
-> upb_MessageDef
, full list). I have personally found these easy to work with when attempting to extend the Ruby interface.
upb_MiniTable
directly make better align with the ubp
design?DescriptorProto -> Descriptor
doesn't work as well with the more inner classes (MessageOptions
-> ?????
)
3. How do you actually expose custom options?
I had trouble figuring this out. It looks like most of the "fields" are looked up by passing a upb_MiniTableField
into the various _upb_MiniTable_GetXxxxxField
methods. I am a little lost on how that should work with custom options.
There are also the methods FindByFieldNumber
and GetFieldByIndex
.
Either of these approaches would require returning to Ruby some sort of Option definition or potentially the raw value without type information.
Thanks in advance for any guidance and all the work to make this possible.
Hi @jsteinberg, thanks for the interest!
Fortunately, I think there is a solution that would be far easier than what you are describing.
Recall that options are just protos. The Ruby Protobuf library already is capable of defining proto message classes without requiring that every field is bound into C manually.
You can already construct a FieldOptions
class in Ruby like so:
require 'google/protobuf/descriptor_pb'
field_options = Google::Protobuf::FieldOptions.new
All we need is a way of doing something like this:
# We could add this maybe in ruby/lib/google/protobuf.rb
class Descriptor
if not @options
@options = Google::Protobuf::MessageOptions.decode(serialized_options)
@options.freeze # We actually need to freeze recursively.
end
@options
end
# Now a user can write something like this:
field = ::Google::Protobuf::DescriptorPool.generated_pool.lookup("pkg.Foo")
field.options.deprecated
The only trick is how to get the serialized_options
above. In C, it would look something like this:
/*
* call-seq:
* Descriptor.serialized_options => options
*
* Returns a binary string containing the serialized options for this message.
*/
static VALUE Descriptor_serialized_options(VALUE _self) {
Descriptor* self = ruby_to_Descriptor(_self);
const google_protobuf_MessageOptions* opts = upb_MessageDef_Options(self->msgdef);
upb_Arena* arena = upb_Arena_New();
size_t size;
char* serialized = google_protobuf_MessageOptions_serialize(opts, arena, &size);
if (serialized) {
VALUE ret = rb_str_new(serialized, size);
rb_enc_associate(ret, rb_ascii8bit_encoding());
upb_Arena_Free(arena);
return ret;
} else {
upb_Arena_Free(arena);
rb_raise(rb_eRuntimeError, "Error encoding");
}
}
That's all it should take! You can repeat this for each kind of option.
Thank you for the quick response. Also, that way is much simpler. I should have some time this week to finish this up and get a PR submitted.
I'm still struggling with reading the custom options. I'm hoping you can point me in the right direction.
Given the following proto, I am able to get deprecated
as you showed.
extend google.protobuf.MessageOptions {
string custom_message_opt = 7739036;
}
message Event {
option deprecated = true;
option (custom_message_opt) = "CUSTOM MESSAGE OPTION";
string id = 1;
}
d = ::Google::Protobuf::DescriptorPool.generated_pool.lookup("Event")
d.serialized_options
=> "\x18\x01\xE2\xE9\xC2\x1D\x15CUSTOM MESSAGE OPTION"
# I'll add to the ruby interface like you suggested.
options = Google::Protobuf::MessageOptions.decode(d.serialized_options)
=> <Google::Protobuf::MessageOptions: deprecated: true, uninterpreted_option: []>
options.deprecated
=> true
options.custom_message_opt
NoMethodError: undefined method `custom_message_opt' for <Google::Protobuf::MessageOptions: deprecated: true, uninterpreted_option: []>:Google::Protobuf::MessageOptions
options.class.encode(options)
=> "\x18\x01\xE2\xE9\xC2\x1D\x15CUSTOM MESSAGE OPTION"
Also, I'm happy to PR your suggested implementation WITHOUT custom options to isolate the diff if that is easier. deprecated
by itself would be nice for some of our workflows.
Thanks again
That is a good question. It looks like we haven't added Ruby APIs to support extensions yet. But the good news is, that should be pretty easy to do also at this point.
We should add the following code to DescriptorPool_lookup()
:
fielddef = upb_DefPool_FindExtensionByName(self->symtab, name_str);
if (fielddef) {
return get_fielddef_obj(_self, enumdef);
}
Then you could write something like this:
ext = ::Google::Protobuf::DescriptorPool.generated_pool.lookup("custom_message_opt")
ext.get(options)
We might also want to make Message.[]
accept a FieldDescriptor, so you could get slightly nicer syntax:
options[ext]
The final thing we would want to do is add these lookup
calls to the generated code, so a user could just reference the extension by name, instead of having to call lookup
first.
But all of that should probably go into a separate PR, which is to add extensions support to Ruby.
By the way, both PRs will need to go through a round of API review internally. But I expect that the APIs I proposed above should be relatively uncontroversial.
I created a custom option extension that I use in my proto files, like this:
It doesn't seem that once it is compiled into Ruby code, I can access the value of the
protapi.http
option. I believe this can be done in other languages by accessing the descriptor.Am I missing something, or is this not possible right now?