thriftrw / thriftrw-node

A thrift binary encoding library using bufrw
MIT License
57 stars 25 forks source link

Period should not be allowed in identifier names #149

Open Matt-Esch opened 7 years ago

Matt-Esch commented 7 years ago

Following from https://github.com/thriftrw/thriftrw-go/issues/292

We have a bunch of specs that use . in the enum identifier, which is tolerated by thriftrw-node but not thriftrw-go. It is claimed to be ok in the thrift idl docs https://thrift.apache.org/docs/idl, but the implementation of apache thrift doesn't tolerate it, and it is claimed to be an error in the spec.

This loose definition is allowing for the creation of specs that are not compatible across languages.

See: https://github.com/apache/thrift/blob/2df9c20dc76c044e502861a2111b90cbdcbbb957/compiler/cpp/src/thrift/main.cc#L854

https://issues.apache.org/jira/browse/THRIFT-667

peg grammar in thriftrw-node:

IdentifierPart
  = IdentifierStart
  / UnicodeCombiningMark
  / UnicodeDigit
  / UnicodeConnectorPunctuation
  / '\u200C'
  / '\u200D'
  / '.'

cc @Raynos @kriskowal

abhinav commented 7 years ago

Note that periods in annotation names are completely acceptable and we can't drop support for that anyway (go.name, js.type, etc.).

kriskowal commented 7 years ago

@Matt-Esch for services that do not require cross-language compatibility, this would be a breaking change. Do you advise that we roll this out in a major version? We can validate enum names post-parsing as an opt-in (that we advise turned on by default).

We may have enough "advised options" now to warrant a major version to flush.

Matt-Esch commented 7 years ago

We are relying so much on this behavior at this point, it might actually do more harm than good to fix it. It would certainly warrant a major version bump, and it would increase the likelihood of things getting stale.

kriskowal commented 7 years ago

In the interim, you could begin migrating IDLs away from this misbehavior by adding (js.name = 'name.with.dots') annotations and revising the IDL name to nameWithoutDots. That would remain backward compatible for all JavaScript users, which are presumably the only users depending on the current behavior.

We can start linting, and also add warnings to this.

Matt-Esch commented 7 years ago

We have to remember that the consumers of our interface are Java and Objective C, and that any renaming should take into consideration whether or not these implementations respect the annotations when they send JSON down the wire. Presumably for enums it is not a problem as string literals are sent?

kriskowal commented 7 years ago

Yeah, and one of those is not using ThriftRW-node.