Open gruns opened 6 years ago
Why, by default, are capital letters (e.g. for camelCase) not allowed in registered URIs?
Because AutobahnPython follows the WAMP spec and enforces strict URIs by default: http://wamp-proto.org/static/rfc/draft-oberstet-hybi-crossbar-wamp.html#rfc.section.5.1.1.2
I seem to remember there are some knobs to change the default behavior, allowing "loose URIs" (see above)
Thanks for chiming in.
While the above rules MUST be followed, following a stricter URI
rule is recommended: URI components SHOULD only contain
lower-case letters, digits and _.
Why does WAMP not permit capital letters in URIs, though? Every major programming language permits capital letters in identifiers, including those of the various WAMP implementations http://wamp-proto.org/implementations/. Why don't WAMP URIs?
The forced change from CamelCaseFunctions to camel_case_functions is confusing and unintuitive, especially for RPC. Doubly so when the common code formatting for languages with WAMP implementations is camelCase (JavaScript, Java, etc).
Why does WAMP not permit capital letters in URIs, though?
whitespace, funny characters and capital letters don't add any gain, but only hassles.
whitespace, funny characters and capital letters don't add any gain, but only hassles.
Whitespace and funny characters: no. Whitespace and funny characters aren't valid identifier characters in mainstream programming languages (Python, JavaScript, Go, etc) and thus I agree and see little reason to allow them in WAMP URIs.
But capital letters? Not only are capital letters valid identifier characters in every single WAMP implementation language enumerated here:
But hell: in Erlang, variables have to start with a capital letter (or an underscore).
http://erlang.org/doc/reference_manual/expressions.html#id80984
Thus, support for capital letters engenders an immediate and tangible gain: WAMP URIs, e.g. exported RPC functions, no longer have to needlessly and confusingly mangle camelCase identifiers and function names.
That's exactly the problem I ran headlong into. I have to confusingly register this function
async def constructClientPayloadOnConnect(self, conn):
    pass
under a different, snake_case name.
@register('construct_client_payload_on_connect')
async def constructClientPayloadOnConnect(self, conn):
    pass
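Until the restriction changes, one way to avoid maintaining the second name by hand is to derive it. The following is only a sketch: `camel_to_snake` is a hypothetical helper, not part of Autobahn, and it assumes simple camelCase names without acronym runs (a name like `HTTPServer` would be split letter by letter).

```python
import re


def camel_to_snake(name):
    """Insert "_" before every capital letter except a leading one, then lowercase."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


# Derive the strict-URI-compatible name from the function name.
uri = camel_to_snake("constructClientPayloadOnConnect")
print(uri)  # construct_client_payload_on_connect
```

The derived string could then be passed to the registration call in place of a hand-written snake_case name, though the RPC is still exposed under a name that differs from the function name.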
I had one function name. Now I have two. Because of this restriction, the number of function names every developer has to remember and reason about doubles. And I'm hardly the first one to be bitten by this unintuitive restriction:
https://groups.google.com/forum/#!topic/autobahnws/mkjF21Fb8ow
So there's an immediate gain from allowing capital letters: camelCase identifiers and functions aren't mangled.
WAMP should encourage consistent code. Unfortunately this restriction does the opposite.
Let's fix it!
Little bump: poor, innocent camelCase functions still suffer needlessly at the hands of this restriction.
Let's fix it!
If you'll merge it, I'm happy to file a pull request that does just that.
Ok, I agree: this issue has been raised more than once - I am reopening this:
Autobahn does have 2 levels of URI checking (internally), but this should be:
The 2 levels of URI checking are "strict" vs "loose" and the respective regular expressions are here: https://github.com/crossbario/autobahn-python/blob/master/autobahn/wamp/message.py#L72
# strict URI check allowing empty URI components
_URI_PAT_STRICT_EMPTY = re.compile(r"^(([0-9a-z_]+\.)|\.)*([0-9a-z_]+)?$")
# loose URI check allowing empty URI components
_URI_PAT_LOOSE_EMPTY = re.compile(r"^(([^\s\.#]+\.)|\.)*([^\s\.#]+)?$")
# strict URI check disallowing empty URI components
_URI_PAT_STRICT_NON_EMPTY = re.compile(r"^([0-9a-z_]+\.)*([0-9a-z_]+)$")
# loose URI check disallowing empty URI components
_URI_PAT_LOOSE_NON_EMPTY = re.compile(r"^([^\s\.#]+\.)*([^\s\.#]+)$")
# strict URI check disallowing empty URI components in all but the last component
_URI_PAT_STRICT_LAST_EMPTY = re.compile(r"^([0-9a-z_]+\.)*([0-9a-z_]*)$")
# loose URI check disallowing empty URI components in all but the last component
_URI_PAT_LOOSE_LAST_EMPTY = re.compile(r"^([^\s\.#]+\.)*([^\s\.#]*)$")
There are 6 regular expressions, because there are 2 levels (strict vs loose), and 3 types: plain/exact URI, wildcard URI pattern and prefix URI pattern.
The other option would be to redefine the "strict" pattern:
[0-9a-z_]+ => [0-9A-Za-z_]+
I think this makes sense anyway, as we already allow a leading digit, and a name with a leading digit is not a valid identifier in most programming languages.
so above would allow for:
but it still doesn't allow for e.g. base64:
[^-A-Za-z0-9+/=]|=[^=]|={3,}$
or
^(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?$
source: https://stackoverflow.com/questions/475074/regex-to-parse-or-validate-base64-data
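As a quick check of what the widened "strict" class would and would not accept, here is a sketch. It assumes the widened class is simply dropped into the existing non-empty strict pattern; the name `URI_PAT_STRICT_WIDENED` and the sample URIs are made up.

```python
import re

# Assumption: the existing strict non-empty pattern with [0-9a-z_]
# replaced by [0-9A-Za-z_], as proposed above.
URI_PAT_STRICT_WIDENED = re.compile(r"^([0-9A-Za-z_]+\.)*([0-9A-Za-z_]+)$")


def is_strict_widened(uri):
    return URI_PAT_STRICT_WIDENED.match(uri) is not None


print(is_strict_widened("example.camelCaseWorks"))  # True: capitals now pass
print(is_strict_widened("example.1st_item"))        # True: leading digit still passes
print(is_strict_widened("example.aGVsbG8="))        # False: "=" padding is rejected
print(is_strict_widened("example.a+b/c"))           # False: "+" and "/" are rejected
```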
Actually, I think a better user experience could be:
by default, use the "loose" patterns, which only exclude whitespace, # and . (URI parts can be anything matching [^\s.#]*)
let the user switch to "strict", and have that use a pattern that results in valid identifiers for most programming languages
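A rough sketch of how that could look, for discussion only. The `is_valid_uri` function and its `strict` flag are hypothetical and not Autobahn's API; the loose pattern is the existing one from the listing, and the strict pattern is my assumption of what "valid identifiers for most programming languages" means (ASCII letters, digits and _, with no leading digit).

```python
import re

# Existing loose pattern: components exclude only whitespace, "#" and ".".
URI_PAT_LOOSE = re.compile(r"^([^\s\.#]+\.)*([^\s\.#]+)$")

# Assumed identifier-style strict pattern: no leading digit in a component.
URI_PAT_STRICT_IDENT = re.compile(
    r"^([A-Za-z_][A-Za-z0-9_]*\.)*([A-Za-z_][A-Za-z0-9_]*)$"
)


def is_valid_uri(uri, strict=False):
    """Hypothetical check: loose by default, identifier-style when strict=True."""
    pattern = URI_PAT_STRICT_IDENT if strict else URI_PAT_LOOSE
    return pattern.match(uri) is not None


print(is_valid_uri("example.camelCaseWorks"))               # True
print(is_valid_uri("example.aGVsbG8="))                     # True: base64 passes loosely
print(is_valid_uri("example.has space"))                    # False: whitespace
print(is_valid_uri("example.camelCaseWorks", strict=True))  # True
print(is_valid_uri("example.1st_item", strict=True))        # False: leading digit
print(is_valid_uri("example.aGVsbG8=", strict=True))        # False: "=" not allowed
```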
Sounds great.
This solves the original problem (support for CamelCase) and opens the door for other URI schemes, too (e.g. base64). For example, if a user wants to use emojis in their URIs, I see no compelling reason to stop them.
How can I help implement this?
+1. Having camelCased URIs would be nice.
Capital letters are not allowed in registered URIs. For example,
@register('example.camelCaseFails')
fails:Raises
Googling, the only related discussion I found was
https://groups.google.com/forum/#!topic/autobahnws/mkjF21Fb8ow