Closed mcenkar closed 2 years ago
Thanks for the report @mcenkar. Unfortunately, my time-frame for a fix is very far out. PR's welcome of course, or else, if possible, you might try renaming the field to avoid Java keywords.
I can't recall why avrohugger adopted avro's field mangling strategy (it seems likely that only Specific format would need mangling, but I'm not certain): https://github.com/apache/avro/blob/9a378ffab2b49d95edcd2dba36691096301b64ee/lang/java/avro/src/main/java/org/apache/avro/specific/SpecificData.java#L97
related: https://github.com/julianpeeters/sbt-avrohugger/issues/85
Hi,
so had a quick look and this file:
@namespace( "foo.bar" )
protocol AProtocol {
record A {
union { null, string } return = null;
union { null, int } abstract = null;
union { null, int } assert = null;
union { null, int } break = null;
union { null, int } byte = null;
union { null, int } case = null;
union { null, int } catch = null;
union { null, int } char = null;
union { null, int } class = null;
union { null, int } const = null;
union { null, int } continue = null;
union { null, int } default = null;
union { null, int } do = null;
union { null, int } else = null;
union { null, int } extends = null;
union { null, int } final = null;
union { null, int } finally = null;
union { null, int } for = null;
union { null, int } goto = null;
union { null, int } if = null;
union { null, int } implements = null;
union { null, int } instanceof = null;
union { null, int } interface = null;
union { null, int } native = null;
union { null, int } new = null;
union { null, int } package = null;
union { null, int } private = null;
union { null, int } protected = null;
union { null, int } public = null;
union { null, int } short = null;
union { null, int } static = null;
union { null, int } strictfp = null;
union { null, int } super = null;
union { null, int } switch = null;
union { null, int } synchronized = null;
union { null, int } this = null;
union { null, int } throw = null;
union { null, int } transient = null;
union { null, int } try = null;
union { null, int } volatile = null;
union { null, int } while = null;
}
}
Compiles just fine using java -jar avro-tools-1.11.0.jar idl src/main/avro/a.avdl ./lol.avpr
to:
{
"protocol" : "AProtocol",
"namespace" : "foo.bar",
"types" : [ {
"type" : "record",
"name" : "A",
"fields" : [ {
"name" : "return",
"type" : [ "null", "string" ],
"default" : null
}, {
"name" : "abstract",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "assert",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "break",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "byte",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "case",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "catch",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "char",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "class",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "const",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "continue",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "default",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "do",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "else",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "extends",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "final",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "finally",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "for",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "goto",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "if",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "implements",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "instanceof",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "interface",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "native",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "new",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "package",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "private",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "protected",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "public",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "short",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "static",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "strictfp",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "super",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "switch",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "synchronized",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "this",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "throw",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "transient",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "try",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "volatile",
"type" : [ "null", "int" ],
"default" : null
}, {
"name" : "while",
"type" : [ "null", "int" ],
"default" : null
} ]
} ],
"messages" : { }
}
Those properties are coming from your RESERVED_WORDS. Only some of them don't compile:
boolean
double
enum
false
float
import
int
long
null
throws
true
void
For example boolean
:
Exception in thread "main" org.apache.avro.compiler.idl.ParseException: Encountered " "boolean" "boolean "" at line 45, column 25.
Was expecting one of:
<IDENTIFIER> ...
"@" ...
"`" ...
at org.apache.avro.compiler.idl.Idl.generateParseException(Idl.java:1729)
What's your take on:
return$
-> `return`
)Found place where 2nd set is blocked: https://github.com/apache/avro/blob/release-1.11.0/lang/java/compiler/src/main/javacc/org/apache/avro/compiler/idl/idl.jj#L231-L265
Summary:
boolean
, or throws
you don't really have to handle, it's not valid avro anyway,`return`
which I need works better with ticks, where $ breaks it,public class A extends org.apache.avro.specific.SpecificRecordBase implements org.apache.avro.specific.SpecificRecord {
private java.lang.CharSequence return$;
private java.lang.Integer abstract$;
private java.lang.Integer assert$;
private java.lang.Integer break$;
But in scala we can always do backticks. So probably best way is to do backticks around scala keywords (different set than avro), and just drop $ mangle?
names like boolean, or throws you don't really have to handle, it's not valid avro anyway,
return
which I need works better with ticks, where $ breaks it, do you remember any case where $ is needed?
IIRC, the $ was needed for java avro interop. I wonder what happens at runtime? I suspect org.apache.avro.specific
might require the $ at runtime? And it also looks like vulcan is using org.apache.avro.generic
under the hood.
My other concern is breaking changes.
So I might end up favoring the addition of some configuration, maybe for Standard format at least.
Hi,
I had a quick look and I think we don't need to do any config for this. It would be breaking change, but it would break a bug that's there right now. I added PR here, right now it's failing.
The only places where I found we may need $ is scavro interop, but that's actually using different mangling strategy: list of keywords, and here calling avro-compiler.
So in this PR instead I switched to mangling by backticks, and doing that only on scala keywords, as we don't need to do that on Java keywords.
I didn't test thorougly scavro part but everything still looks OK, and it fixes my problem.
(edit: Also #85 would be fixed)
Happy to hear your feedback.
Cheers
Awesome, thanks very much for the eyes and the fix. Released as avrohugger 1.0.0 and sbt-avrohugger 2.0.0
Hi,
Given:
addSbtPlugin("com.julianpeeters" % "sbt-avrohugger" % "2.0.0-RC17")
It gets compiled to (correct and working):
addSbtPlugin("com.julianpeeters" % "sbt-avrohugger" % "2.0.0-RC18")
It gets compiled to:
In my case I eventually get following exception:
Only workaround I've found is sticking to
RC17
version.