Open matrixbot opened 8 years ago
Jira watchers: @richvdh
Hrm; there are encoding difficulties here.
Some of these IDs end up in JSON strings, which means that they must be interpreted as a sequence of Unicode characters; they are not just byte sequences. Likewise, because our URIs are %-encoded UTF-8, having opaque byte sequences in our URIs would require part of a URI to be parsed as UTF-8 and part as 8-bit data, which most URI parsers would not be happy with.
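To make the two encodings concrete, here is a minimal Python sketch. The ID value and the `filter_id` key are made up for illustration; the sketch only shows how one and the same Unicode string is represented in a JSON body versus a URI path segment, and why an arbitrary byte sequence fits neither.

```python
import json
from urllib.parse import quote, unquote

# Hypothetical ID containing non-ASCII characters (not taken from the spec).
example_id = "caf\u00e9-\u2603"

# In JSON the ID is a string of Unicode characters; by default json.dumps
# emits the non-ASCII ones as \uXXXX escapes.
json_body = json.dumps({"filter_id": example_id})

# In a URI the same ID is encoded as UTF-8 and each non-unreserved byte is
# %-encoded. safe="" means even "/" would be escaped in this path segment.
uri_segment = quote(example_id, safe="")


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte sequence can be read as UTF-8 text."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# An arbitrary byte sequence need not be valid UTF-8, so it has no direct
# representation as a JSON string or as %-encoded UTF-8.
opaque_bytes = b"\xff\xfe"
```

Both forms decode back to the same string (`json.loads(json_body)["filter_id"]` and `unquote(uri_segment)`), whereas `is_valid_utf8(opaque_bytes)` is false, which is the difficulty described above.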
As I see it there are two options here:
`\uXXXX` sequences in the JSON response to `POST /user/$id/filter`, and then encode it as %-encoded UTF-8 in subsequent URI parameters.
Postel's law should guide us here. My inclination is to restrict these IDs to unreserved URI characters (i.e., `[A-Za-z0-9._~-]`: see RFC 3986), but also to recommend that, if you receive such an ID, you parse it as a Unicode string and re-encode it correctly when sending it on. This has the advantage that if you're writing a hacky bash script, you don't need to worry about escaping at all, whilst those creating IDs can still use base-64 to encode whatever they want.
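As a rough sketch of how that restriction could sit alongside base-64, the Python below checks an ID against the unreserved character set and builds IDs from arbitrary bytes. The function names are hypothetical, and the choice of the URL-safe base-64 alphabet with padding stripped is an assumption of this sketch: standard base-64 output can contain `+`, `/` and `=`, none of which are unreserved characters.

```python
import base64
import re

# Unreserved URI characters per RFC 3986 section 2.3; at least one required.
UNRESERVED_ID = re.compile(r"[A-Za-z0-9._~-]+")


def is_unreserved_id(candidate: str) -> bool:
    """Return True if the ID consists only of unreserved URI characters."""
    return UNRESERVED_ID.fullmatch(candidate) is not None


def make_opaque_id(payload: bytes) -> str:
    """Encode arbitrary bytes as an ID that needs no escaping in a URI.

    Assumption: URL-safe base-64 ('-' and '_' instead of '+' and '/'),
    with the '=' padding stripped because '=' is not unreserved.
    """
    return base64.urlsafe_b64encode(payload).decode("ascii").rstrip("=")


def decode_opaque_id(opaque_id: str) -> bytes:
    """Recover the original bytes by restoring the stripped padding."""
    padding = "=" * (-len(opaque_id) % 4)
    return base64.urlsafe_b64decode(opaque_id + padding)
```

For example, `make_opaque_id(b"\xfb\xff")` yields `-_8`, which passes `is_unreserved_id`, while the standard base-64 form `+/8=` does not. A recipient can pass such an ID through JSON and URIs unchanged without understanding its contents.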
-- @richvdh
`*` is used as a wildcard for device ID, so it must be forbidden as a device ID.
-- @richvdh
Since the links are hard to find above:
Proposals:
Other tracking issues:
"Grammar" might be too strong a word, but we should probably make explicit that the following IDs are entirely implementation-specific byte sequences. The originators are allowed to create them however they like, and the recipient has to send them back as they arrived.
Call IDs (as exposed in `m.call...` events): fixed by MSC2746; now specced at https://spec.matrix.org/v1.7/client-server-api/#grammar-for-voip-ids
(Imported from https://matrix.org/jira/browse/SPEC-388)
(Reported by @richvdh)