Amending reservations against DEP 0006

martinheidegger commented 5 years ago

To me, this DEP is quite problematic. It could have a major impact on how we understand DAT and, depending on how it is used in practice, could significantly degrade the advantages DAT offers. This PR adds my reservations against the proposal.

pfrazee commented 5 years ago

Could servers return different data depending on the session-data? That would create a detectable hypercore split, wouldn’t it?

martinheidegger commented 5 years ago

I can think of only three ways a server can react to input. If none of these is done, the server doesn't need session data, imo.

- delaying or increasing data throughput (prioritizing)
- restricting access to immutable data (HAVE responds with different sections)
- changing the response

Changing the HAVE sections depending on the input seems like the obvious thing to attempt, and it can be done by simply returning different hypercore data to different sockets. If the client doesn't try to reshare the DAT (which is the default in beaker), no problem will occur. I believe it should even be possible to return two entirely different trees.
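The scenario above can be sketched as follows. This is a minimal illustration in Python, not the hypercore API: the `Server` class, the session strings, and the in-memory lists are hypothetical, and the sketch assumes the server is able to produce valid data for both variants.

```python
class Server:
    """Hypothetical peer that picks a feed variant based on session data."""

    def __init__(self, variants: dict[str, list[bytes]], default: list[bytes]):
        self.variants = variants  # session data -> feed entries to serve
        self.default = default    # served when the session is not recognized

    def get(self, session: str, index: int) -> bytes:
        # The same index can yield different content for different sockets.
        feed = self.variants.get(session, self.default)
        return feed[index]


server = Server(
    variants={"alice": [b"hello", b"offer for alice"]},
    default=[b"hello", b"generic offer"],
)

alice_view = [server.get("alice", i) for i in range(2)]
bob_view = [server.get("bob", i) for i in range(2)]

# Each client sees a complete, plausible feed on its own.
print(alice_view)
print(bob_view)
```

Neither client can notice anything unusual in isolation; the difference only surfaces if the two views are ever compared, which does not happen when clients do not reshare.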

bnewbold commented 5 years ago

I agree that this DEP changes the semantics of what the Dat/hypercore protocol is used for, and that it can have a "major impact on how we understand DAT", but the history here is that these changes were already happening (with Cabal), and this DEP was an attempt to adapt to them.

> servers can return different data depending on the session-data as such we must expect that they will

I don't see this as a strong argument against this DEP, as "servers" can already do or return whatever they want, based on things like "client" IP address, latency, peer ID, etc.

I think it would be pretty clearly against the dat/hypercore semantics to return different hypercore feed content (to be specific, feed entries with the same index number but different hashes, thus splitting/forking the feed) to different users (or user agents). While it is technically correct that "If the client doesn't try to reshare the DAT (which is the default in beaker), no problem will occur", this would be extremely fragile and I think against other presumptions in the ecosystem. We might be misunderstanding what you mean by "data per session" though.
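To make "same index number but different hashes" concrete, here is a small Python sketch of how two peers could notice such a split once they compare what they hold. The `find_fork` helper and the dictionaries are hypothetical; the real protocol verifies entries against signed Merkle tree data rather than comparing plain hashes like this.

```python
import hashlib


def entry_hash(data: bytes) -> str:
    """Stand-in for the per-entry hash of a feed."""
    return hashlib.sha256(data).hexdigest()


def find_fork(mine: dict[int, str], theirs: dict[int, str]) -> list[int]:
    """Return the indexes both peers hold, but with different hashes."""
    shared = mine.keys() & theirs.keys()
    return sorted(i for i in shared if mine[i] != theirs[i])


# Index -> hash, as held by two peers of the same feed (feeds may be sparse).
peer_a = {0: entry_hash(b"hello"), 1: entry_hash(b"offer for alice")}
peer_b = {
    0: entry_hash(b"hello"),
    1: entry_hash(b"generic offer"),
    2: entry_hash(b"more"),
}

print(find_fork(peer_a, peer_b))  # [1]
```

The check only works when the two peers actually exchange data about the feed, which is why a split can stay hidden as long as clients never reshare.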

I think this DEP is enabling three specific things:

- whether to return any feed content at all (aka, access control), either for the first feed connected, or any ancillary feeds (additional channels). This is related to your second bullet point, but at the FEED/channel level, not the HAVE/entry level
- whether the "server" should accept or request (via FEED message) additional feeds, for which the connecting "client" has the signing key. AKA, to allow the "client" to "push" content (entire feeds) to the "server"
- allowing the "server" to keep track of connection state, which is sort of a side-band to some use cases of Dat (synchronizing data and static content), but important for real-time/interactive/collaborative use cases
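The three points above could look roughly like this at the level of a single peer. This is a hedged Python sketch, not code from any Dat implementation: the `Peer` class, its method names, and the identity strings are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Hypothetical peer applying session data at the FEED/channel level."""
    readers: dict[str, set[str]]   # feed key -> identities allowed to read it
    writers: set[str]              # identities allowed to push their own feeds
    connected: set[str] = field(default_factory=set)  # connection state

    def on_handshake(self, identity: str) -> None:
        # Third point: keep track of who is currently connected.
        self.connected.add(identity)

    def on_close(self, identity: str) -> None:
        self.connected.discard(identity)

    def should_serve(self, identity: str, feed_key: str) -> bool:
        # First point: decide whether to return any content for a feed.
        return identity in self.readers.get(feed_key, set())

    def should_accept_feed(self, identity: str) -> bool:
        # Second point: decide whether to accept a feed pushed by this peer.
        return identity in self.writers


peer = Peer(readers={"chat-log": {"alice", "bob"}}, writers={"alice"})
peer.on_handshake("alice")
peer.on_handshake("carol")

print(peer.should_serve("alice", "chat-log"))  # True
print(peer.should_serve("carol", "chat-log"))  # False
print(peer.should_accept_feed("alice"))        # True
print(sorted(peer.connected))                  # ['alice', 'carol']
```

Note that every decision here is all-or-nothing per feed; nothing in the sketch alters the content of a feed entry.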

Note: I continued using the client/server language this thread was started with, but we usually call all computers "peers" in the network.

martinheidegger commented 5 years ago

> I don't see this as a strong argument against this DEP, as "servers" can already do or return whatever they want, based on things like "client" IP address, latency, peer ID, etc.

In practice, I do not believe that IP address, latency, or peer ID is reliable enough to identify a client, particularly if the use of proxies or backups becomes widespread. On top of that, user data practically begs for a different response per user.

> This would be extremely fragile and I think against other presumptions in the ecosystem.

Yes, I totally agree: such an implementation would break the DAT ecosystem. That is one of the reasons I added my reservations. "Because we can" ruins many systems, and I believe that is the case here: server implementers will use this approach because they can.

> whether to return any feed content at all (aka, access control), either for the first feed connected, or any ancillary feeds (additional channels).

I doubt that this will work in practice: once a user has downloaded a feed, the client software might very well share it on the network immediately, giving access to all other peers.

The only case where this will work is if the client does not re-share the DAT. In that case in particular, different feeds, or personalized data, become more likely and viable.
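The concern can be illustrated with a small Python simulation. The `Node` class and the peer names are hypothetical, and the sketch assumes a client that re-shares whatever it has downloaded without applying any access check of its own.

```python
class Node:
    """Hypothetical peer holding (or not yet holding) a copy of a feed."""

    def __init__(self, name, allowed=None, data=None):
        self.name = name
        self.allowed = allowed  # None means "serve everyone who asks"
        self.data = data

    def serve(self, requester: str):
        if self.data is None:
            return None
        if self.allowed is not None and requester not in self.allowed:
            return None  # access control based on the requester's identity
        return self.data

    def fetch(self, source: "Node") -> bool:
        got = source.serve(self.name)
        if got is not None:
            self.data = got
        return got is not None


origin = Node("origin", allowed={"alice"}, data=[b"restricted entry"])
alice = Node("alice")      # re-shares by default, with no access check
mallory = Node("mallory")  # not on the origin's allow list

print(mallory.fetch(origin))  # False: the origin refuses
print(alice.fetch(origin))    # True: alice is allowed
print(mallory.fetch(alice))   # True: alice re-shares to anyone
```

The restriction at the origin holds only as long as every peer that received the feed enforces the same rule or does not re-share at all.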

> Note: I continued using the client/server language this thread was started with, but we usually call all computers "peers" in the network.

I used the "server"/"client" distinction here because I cannot imagine user-facing software that can properly manage session data. I also wonder whether this DEP means that there will actually be DAT servers and clients in the future.

pfrazee commented 5 years ago

> I used the "server"/"client" distinction here because I cannot imagine user-facing software that can properly manage session data. I also wonder whether this DEP means that there will actually be DAT servers and clients in the future.

The example use-case for this DEP was for peers to identify themselves in a chat room.

martinheidegger commented 5 years ago

> The example use-case for this DEP was for peers to identify themselves in a chat room.

I assume this is for the use-case in which a user should be shown as "online/offline". How would this be implemented for the whole chat room? The user is connected to only a subset of the peers in the chat room. If the user sends an identification to their direct peers, the peers further away will not be notified and will show this user as offline even though they are actually connected to the chat. It seems to me that "online/offline" is a state that should be shared between the peers of the chat client, and as such it would be better suited as an example of content rather than session data.
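A small Python sketch of the reachability problem described above. The topology and the two helper functions are hypothetical; the sketch assumes that an identification sent as session data travels exactly one hop, while state written as content is replicated from peer to peer.

```python
from collections import deque

# Hypothetical chat-room topology: who holds a direct connection to whom.
links = {
    "user": {"a", "b"},
    "a": {"user", "c"},
    "b": {"user"},
    "c": {"a", "d"},
    "d": {"c"},
}


def informed_by_handshake(graph, origin):
    """Peers that see the identification if it only travels in the handshake."""
    return set(graph[origin])


def informed_by_replication(graph, origin):
    """Peers reached when the state is replicated hop by hop as content."""
    seen, queue = {origin}, deque([origin])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    seen.discard(origin)
    return seen


print(sorted(informed_by_handshake(links, "user")))    # ['a', 'b']
print(sorted(informed_by_replication(links, "user")))  # ['a', 'b', 'c', 'd']
```

In this topology, peers "c" and "d" would show the user as offline under the handshake-only approach, even though all five peers are part of the same chat room.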

dat-ecosystem-archive / DEPs

Amending reservations against DEP 0006 #49