Closed iMacTia closed 2 years ago
Hmm... I feel like this is something that probably should be configurable. I can see there being value in having a top-level enum constant (e.g. when you know that the enums actually are identical). On the other hand, when they aren't identical, it might be better to generate the enum class inside the top-level class instead and therefore it would be namespaced (so you'd get Schema1::RadiusUnit
and Schema2::RadiusUnit
.
The problem is that it's kind of hard to configure this behavior on a per-schema basis. Probably to start with we'd want it to be either on or off.
I totally appreciate this is a complex issue to solve, and I'm happy to help brainstorm about it. Below I'm pouring my thoughts to spark some discussion about possible ways forward.
The current solution still generates duplicated classes for each time the enum is used, but it also introduces an issue when the classes are different despite being called the same way. Moreover, this is also an issue with top-level classes, as nothing technically stops you from having two schemas called the same, as far as they're under separate namespaces. The principle of least surprise would advocate for defaulting to the namespaced solution, as that would avoid the case where a naming conflict ends up causing an unexpected class override (which would be hard to spot and would be allowed by the Ruby interpreter despite causing issues down the line).
That said, I also understand two possible holdbacks here:
com.company.sub-company.namespace.class
) and having this level of nesting on a Ruby class can be annoyingA possible solution to (1) could be to allow for a common_prefix
config, which allows to exclude part of the namespace from the generated classes. So for example, com.company.sub-company.namespace.my-schema
today would generate a Schemas::MySchema
class, but by configuring common_prefix: 'com.company.sub-company'
the generated class would be Schemas::Namespace::MySchema
or Namespace::Schemas::MySchema
.
Issue (2) is a different issue, but would need to be addressed only if/when we decide to make this new configuration the default, which I suggest doing but would definitely require a v2.0 release. The configuration can be added as non-default in 1.x versions
I like this idea! I'm wondering if we can even make it relatively smarter and just automatically clip everything but the last part of the namespace. At least the way I've been using schemas, it's almost always the last part that's important. But I'm good with the idea of a common prefix - or even a namespace mapping, where you could do something like this:
config.schema_namespace_map = {
'com.company.subcompany.namespace' => 'Namespace',
'com.company.subcompany2.namespace' => 'SubCompany2',
'com.whatever' => nil # no namespace
}
We could take a convention-over-configuration approach (latest config override the previous ones):
common_prefix
optionconfig.schema_namespace_map
And this behaviour would be disabled by default, and activated by setting config.schema.namespace_generated_class = true
. Then in a future major release you may decide to make this the new default.
If that sounds good, we could have this split into 2 or even three steps, to avoid dropping too many changes at once. I can see points 1 and 2 being done first (together or individually), and then step 3 (namespace mapping) can be introduced later down the line based on feedback.
How does that sound?
That sounds great to me! I agree with the approach.
@iMacTia have you had a chance to look into this yet? We're starting to use this feature more and definitely feeling the pain of how we have it set up now. :) I can grab this but wanted to know if you'd started work yet.
@dorner this is still sitting on my todo list, wrestling with a bunch of other stuff I need to prioritize. I need to drive the acquisition of Kafka across the company before I can justify spending more time on this. Although I'd love to work on this, I can't really say when that will happen, so I'm happy with you tackling this if you have some time to work on it 😃
Fixed with #154.
I'm trying to understand what's the best way to manage schemas with similar/identical sub-schemas. For example, imagine to have 2 (or more!) separate schemas, both of which use an enum like the following:
Now if I run
bundle exec rake deimos:generate_schema_classes
, I'll get a different class per schema, each including the sub-schema for the enum:Since all schemas are created inside the same namespace (
Schemas
), that means the two auto-generated Enum schema classes will have the exact Ruby class path (Schemas::RadiusUnit
). Effectively, this is causing one to reopen and override the other when Ruby loads both classes.That is already an issue by itself (needless repetition), but it could get even worse if you consider the case where the two enums are called the same, but not exactly the same. Imagine the second schema has a different
symbols
value, like"symbols": ["kilometers", "miles"]
.Now we end up with 2 ruby classes called exactly the same, but with a different implementation of the
symbols
method. And since the last loaded class overrides the former(s), that means one of the two classes is now inconsistent with the corresponding schema, causing it to prevent sending messages.cc @DeeChau