I did a read-through this morning, and typed up my thoughts below!
Hashes
The spec notes that it uses 32-byte blake2b hashes. blake2b can produce digests anywhere up to 64 bytes. What fueled the decision for 32 bytes?
Would it make sense to have the spec permit a variable blake2b hash length (32+ bytes)? Thinking about cable in a very Deep Future sense, I'm wondering whether being able to use more bits in our hashes would (as I understand it) help offset faster hardware making it easier to reverse hashes in the years ahead.
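To make the digest-length question concrete, here's a minimal sketch using Python's standard `hashlib`, which exposes blake2b's digest size as a parameter. The `blake2b_digest` helper and the payload are mine, purely for illustration; nothing here is from the spec.

```python
import hashlib

def blake2b_digest(data: bytes, digest_size: int = 32) -> bytes:
    """Hash data with blake2b at the requested digest length (in bytes)."""
    # hashlib accepts blake2b digest sizes from 1 to 64 bytes.
    return hashlib.blake2b(data, digest_size=digest_size).digest()

post = b"example post bytes"  # placeholder payload, not a real cable post
short = blake2b_digest(post, 32)
long = blake2b_digest(post, 64)
print(len(short), len(long))  # 32 64
```

One thing worth keeping in mind: the digest length is an input to blake2b itself, so a 32-byte hash is not a prefix of the 64-byte hash of the same data. A variable-length scheme would mean peers need to agree on (or encode) the length somewhere.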
Timestamps
What are the consequences of ill-intentioned people being able to post messages to the past?
What are the consequences of ill-intentioned people being able to post messages to the future?
How do we want clients to handle these cases?
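As one possible answer to the client-handling question, here's a rough sketch of a check a client could run before displaying a post. Everything in it is hypothetical: the `timestamp_flags` name, the 5-minute skew tolerance, and the idea of comparing against linked posts are my assumptions, not anything the spec says.

```python
# Hypothetical tolerance for clock drift between peers; not a value from the spec.
MAX_FUTURE_SKEW_MS = 5 * 60 * 1000

def timestamp_flags(post_ts_ms, linked_ts_ms, now_ms,
                    max_future_skew_ms=MAX_FUTURE_SKEW_MS):
    """Return a list of reasons a post's timestamp looks suspicious."""
    flags = []
    # A post claiming to be from well past our own clock is "from the future".
    if post_ts_ms > now_ms + max_future_skew_ms:
        flags.append("future")
    # A post that claims to predate something it links to was likely backdated.
    if any(post_ts_ms < ts for ts in linked_ts_ms):
        flags.append("before-linked-post")
    return flags

now = 1_700_000_000_000
print(timestamp_flags(now + 3_600_000, [], now))  # ['future']
print(timestamp_flags(now - 10, [now - 5], now))  # ['before-linked-post']
print(timestamp_flags(now, [now - 5], now))       # []
```

What a client then does with a flagged post (hide it, sort it by arrival time, show a warning) is exactly the part I think the spec should weigh in on.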
channel time range Request, channel state Request
As I understand it, this means that if a user is in 30 channels, they'd be sending up to 30 * 2 * numPeers of these requests to track all of the channels they're in. They'd also have up to that many incoming requests when they join the swarm. I'm a bit worried this'll make joining the swarm pretty heavy & network intensive, and I'm still thinking about whether there are alternatives that'd make sense.
I also wasn't clear on whether the channel time range request will return info/state messages as they occur. Like, when someone joins a channel, will you get that post from both requests (if they're awaiting live data)?
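To put numbers on the request count above, here's the back-of-the-envelope arithmetic as a tiny function. The function name and the peer counts are made up for illustration; the factor of 2 is the time range request plus the state request per channel.

```python
def outgoing_requests(num_channels: int, num_peers: int,
                      requests_per_channel: int = 2) -> int:
    """Upper bound on requests one user sends to track all their channels."""
    return num_channels * requests_per_channel * num_peers

for peers in (1, 10, 50):
    print(peers, outgoing_requests(30, peers))
# 1 60
# 10 600
# 50 3000
```

If every peer in the swarm is in a similar number of channels, the incoming side looks about the same, so a 50-peer swarm could mean on the order of 3000 requests each way per user on join.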
Nitpick
s/length of the channel in/length of the channel name in
String Encoding
I think it would make sense to specify that strings are UTF-8 encoded across the board, e.g. channel names. If we allow arbitrary binary data here, clients wouldn't necessarily know how to interpret them to present them to the user.
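For illustration, this is roughly what the client side could look like if the spec required UTF-8: strict decoding, with anything invalid rejected rather than guessed at. The `decode_channel_name` helper is hypothetical, and returning `None` for invalid names is just one possible policy.

```python
from typing import Optional

def decode_channel_name(raw: bytes) -> Optional[str]:
    """Decode a channel name as strict UTF-8, or return None if it isn't valid."""
    try:
        return raw.decode("utf-8")  # strict error handling is the default
    except UnicodeDecodeError:
        return None

print(decode_channel_name("café".encode("utf-8")))  # café
print(decode_channel_name(b"\xff\xfe"))             # None
```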
channel state Request
This mentions that deletes get sent. Would ALL historic deletes be included every time the req is made? This could be LOTS, e.g. 1000s, for an old channel where someone decided to erase their whole history.
channel list Request
What do y'all think about having an offset param in addition to limit? I'm having a hard time thinking up a use-case where you'd want limit without offset -- you can't paginate.
Also thinking whether we'd want to specify that impls should try to produce a stable sort of channels to return to this query. It would be unhelpful imo to get different results every time you queried the list from a given peer.
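Here's a small sketch of what I have in mind for offset + limit with a stable order. The `list_channels` function is hypothetical, and sorting names by code point is just one choice of stable order; the point is only that two identical queries against the same data give the same page.

```python
from typing import Iterable, List, Optional

def list_channels(channels: Iterable[str], offset: int = 0,
                  limit: Optional[int] = None) -> List[str]:
    """Return one page of channel names in a deterministic order."""
    # Deduplicate, then sort so repeated queries see the same ordering.
    ordered = sorted(set(channels))
    if limit is None:
        return ordered[offset:]
    return ordered[offset:offset + limit]

known = ["dev", "art", "dev", "chat"]
print(list_channels(known, offset=0, limit=2))  # ['art', 'chat']
print(list_channels(known, offset=2, limit=2))  # ['dev']
```

Without the offset, the second call has no way to ask for "the next page", which is the pagination problem I mean.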
"post"
Suggestion: specify what a "post" is at top of this section.
Also, "post" == "data chunk" == "message"? They're used semi-interchangeably in the document. I think it'd be great if we could firm up the terminology and make sure we use it the same way everywhere.
link
How does an impl choose a "latest" message? Biggest timestamp? Choose randomly between heads?
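If the answer is "biggest timestamp", the spec would probably also want a tie-break so that every impl picks the same head. A minimal sketch of that option, assuming heads are available as (timestamp, hash) pairs; the `pick_latest` name and the tie-break on hash bytes are my own assumptions, not something from the spec.

```python
from typing import List, Tuple

Head = Tuple[int, bytes]  # (timestamp in ms, post hash)

def pick_latest(heads: List[Head]) -> Head:
    """Pick the head with the largest timestamp, breaking ties by hash bytes."""
    # Comparing (timestamp, hash) tuples makes the choice deterministic.
    return max(heads, key=lambda h: (h[0], h[1]))

heads = [(100, b"\x01"), (200, b"\x0a"), (200, b"\x0b")]
print(pick_latest(heads))  # (200, b'\x0b')
```

Choosing randomly between heads would work too, but then two peers looking at the same data could link to different posts.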
Blocks, Hides
The spec mentions 'blocks' and 'hides' but doesn't specify their meaning. I like that we're putting hides & blocks on the table (they'd be PERFECT for filtering out unwanted users' content from queries), but if we're talking about it here then it sounds like something the spec should explicitly address. Otherwise we should leave it out entirely.
post/topic
This should specify that topic is a string, unless there's some reason to allow non-string data here?
post/join
I was unclear on what "Peers can obtain a link to anchor their join message by requesting a list of channels." means. Does it mean you can get the latest msgs from a channel first, to find the latest, so you have something to link to? What happens if a post/join is made w/ no link set?
post/delete
I think it'd be great if post/delete could specify many hashes. This could reduce sync times & data storage space significantly for someone who wishes to delete a large amount of their message history.