Closed black-snow closed 1 year ago
Last year we changed that extension to just be a string and we got rid of the sequenceType stuff exactly because of the reasons you stated - it was just confusing. See: https://github.com/cloudevents/spec/blob/main/cloudevents/extensions/sequence.md for the latest - and yes it means you'll need to pad with zeros. However, notice that you could also use non-integers for the chars to reduce the number of bytes needed.
@duglin thanks for the quick reply. Did you consider string|number
?
I think that would then require some kind of "sequenceType" field again and (if my memory is correct) we decided to just keep it simple and only require strcmp to keep it easy and fast.
Why is that? If it's a number it's a number (well, you shouldn't use floats, I guess) and we know how to sort them - if it's a string it must be ordered lexicographically.
Now you'd have to zero left-pad with unknown length as the length is no longer defined ...
/edit: will go through your links
w/o a "type" you have to first scan the entire string to see if there are any non-int chars in there, or let the str2int logic error and then revert to strcmp.
I suspect people would be willing to reopen the discussion if someone wanted to PR a change that kept it simple :-)
@duglin Are you talking about different serializations? Haha, in my mental model it's all just JSON and I was wondering why you wouldn't have a type.
/edit: Thinking loudly: If some protocols (HTTP) don't support the type system, should this affect the core spec? But well, there's no union types defined so no String|Integer
anyway.
Leading zeros aren't allowed btw, according to RFC 7159. And the "canonical string-encoding" must be supported. So unless we indicate the intended type we can't allow both string and int - I see. Unless we say "try to parse it as int first - if no bueno take it as a string". But that's crap and wasted energy when you know it's a string anyway.
I guess string was chosen in the first place because of TBUUIDs etc.
Having another field just to say how to interpret another is clunky.
Having different versions of the spec just for this case would be insane.
I guess the easiest way for me now would be to roll my own format like always prefixing it with a type indicator. Or, well, just when I want denote an int like i:12345
. Or go with n leading zeros anyway with the excuse that it's somehow a doubly encoded int.
I just realized that the spec says that the sequence
field itself (well, its value) must be "a non-empty lexicographically-orderable string" - not the whole sequence. "[...] a simple string-compare type of operation" could also be "cut off the first n characters and compare the rest".
/edit2: Another way would be to put the (zero-padded) number of digits first. 2147483647 is the max so if I don't care about negative numbers a two-character prefix would suffice. 1
would become 011
, 100
become 03100
and 2147483647
become 102147483647
. Or one character in hex.
Guess, I'll close this for now.
I think in the end you landed at the right spot... it's just a string so issues like "leading zeros aren't allowed per the rfc" don't really matter if you always treat it as a string. Which is why I said you can use non-int chars - which could save you bytes on the wire if that's ever a concern.
I was stumbling over some issue when I evaluated if
sequence
was the right field for revisioning my payloads.The sequence extension requires for
sequence
:But then:
contiguous
seems kinda fuzzy when looking at strings. And worse, the only givensequencetype
is an (for some reason)unsigned
32-bit int, whichBut if we simply turn our numbers into strings like
"1"
,"2"
, ... and10
the lexicographical order would (arguably) be1, 10, 2
, which is not what we want. We'd have to zero left-pad everything like01, 02, 10, ...
but then we'd have to zero-pad to the full legth (always 10 characters in total).So is
sequence
required to be orderable but we'll just ignore the lexicographical order whensequencetype == Integer
because it makes no sense?What if
sequence
was of typestring|number
and just guaranteed a total order?Why do we waste half of the space with a
signed
int? (Who'd use negative seq values, especially if the seq must start at 1?)