Closed wleroux closed 3 years ago
Internet connection is offline If the client is currently offline, the plugin will wait until the client is online before trying to establish a connection.
Internet connection disconnected If the client managed to establish a connection but the client is no longer online, the client should wait until it is online to attempt to re-establish a connection.
Any outstanding changesets should be kept until the connection is re-established. No new changesets should be accepted while offline. Should the connection be restored, any outstanding changesets should be re-sent to the server.
Backend connection failed to connect If the client is online but the connection to the backend is never established, the client should attempt to connect immediately, 5s, and every 30 seconds thereafter.
Backend connection break If the client is still online but the connection to the backend is list, the client should attempt to retry in immediately, 5s, and every 30s thereafter until the connection is re-established.
Any outstanding changesets should be kept until the connection is re-established. No new changesets should be accepted while disconnected. Should the connection be restored, any outstanding changesets should be re-sent to the server.
Changeset failed to apply If the client receives a changeset and the client’s outstanding changeset can no longer be processed, the client should discard all outstanding changesets.
Hi Wayne,
Thanks for putting this together!
I'd like to propose a few changes so this will work with Issue #3 .
Note: This proposal turned out to be more complex than I first thought so my apologies. I feel like it is better if we get it right the first time though rather than having to rewrite it later.
We don't want to keep session data around indefinitely because we want the app's database to be the source of truth for the document, not the session data in Redis.
If a user can reconnect at any time, we can run across this issue:
We allow the developer to set a timeout on the editing session. This would be set on the plugin and on the collaboration server. The one on the client should be shorter than the time on the server. Let's call it CLIENT_SESSION_TIMEOUT
and SERVER_SESSION_TIMEOUT
where CLIENT_SESSION_TIMEOUT < SERVER_SESSION_TIMEOUT
. For example, we will say it is 4 minutes for CLIENT_SESSION_TIMEOUT
and 5 minutes for SERVER_SESSION_TIMEOUT
.
The purpose of the keep alive is to allow the user to reconnect in the case of a dropped connection for a limited time (maybe by default of 5 minutes).
One other thing we add to the proposal is a third state. Currently, we have ready
and !ready
. I propose we change this to connecting
(which can also mean reconnecting), ready
and no-session
(which can mean session ended, session timed out, or no connection to the Internet). If a user gets a no-session
, the correct course of action is to refresh the browser which can be done by the user or programmatically by the developer to make it seamless.
The general idea is that within the server session timeout, the session is preserved on the collaboration server, and we can confidently connect to to the collaboration server in the client session timeout timeout (which is less than the server session timeout) without having to refresh the document data from the application's database.
In a case where we can't make an initial connection (whether because we are offline or because the backend connection is failing), we are okay to try within the client session timeout. After the client session timeout, we want to indicate that there could have been an editing session by another user that started and then ended; this would mean that our starting document data would be wrong. We indicate this by setting the state to no-session
.
In a case where we were connected but then got disconnected, we try to reconnect. If there is no session existing, we discard data because we don't know if there have been any edits from other users while we were disconnected. If there is an existing session, we can update it with the changesets. If there is an existing session but it's a different session, we also discard the changesets.
With this in mind, let's review the scenarios. Word in italic are the same as the original proposal.:
CLIENT_SESSION_TIMEOUT
(4 minutes) and then we end up in no-session
. If the user connects within the 4 minutes, we are safe because the Redis session data was saved for 5 minutes. No opportunity for stale document data to be introduced.Internet connection disconnected: If the client managed to establish a connection but the client is no longer online, the client should wait until it is online to attempt to re-establish a connection.
Any outstanding changesets should be kept until the connection is re-established. No new changesets should be accepted while offline. Should the connection be restored, any outstanding changesets should be re-sent to the server.
There are two situations where the changesets should be rejected: (1) there is no session data in Redis. In other words, the session timed out by reaching the SERVER_SESSION_TIMEOUT
(2) there is session data but it is from a different session than the one we were previously on.
CLIENT_SESSION_TIMEOUT
(4 minutes) and then we end up in no-session
. If the user connects within the 4 minutes, we are safe because the Redis session data was saved for 5 minutes. No opportunity for stale document data to be introduced.Backend connection break
If the client is still online but the connection to the backend is list, the client should attempt to retry in immediately, 5s, and every 10 (not 30)s thereafter until the connection is re-established.
Any outstanding changesets should be kept until the connection is re-established. No new changesets should be accepted while disconnected. Should the connection be restored, any outstanding changesets should be re-sent to the server.
There are two situations where the changesets should be rejected: (1) there is no session data in Redis. In other words, the session timed out by reaching the SERVER_SESSION_TIMEOUT
(2) there is session data but it is from a different session than the one we were previously on.
Changeset failed to apply
If the client receives a changeset and the client’s outstanding changeset can no longer be processed, the client should discard all outstanding changesets.
If the session on the server exists, resync to the session that exists on the server. If there is no session on the server, set no-session
.
Note: I think in the long run, there should be a way to do this:
I'm trying to see if I can come up with a design that works, but I feel like I need time to think through the scenarios. The above design should work though IMO.
DRAFT:
IGNORE THIS FOR NOW. JUST THINKING OUT LOUD AND KEPT FOR REFERENCE/IMPROVEMENT.
This maybe a crazy idea but might solve the source of truth problem so that the developer can choose:
The basic idea is that it works like this.
When a user connects or reconnects, on the callback to some method which I'll call initSessionState
, we return a Slate document and an array of changesets like { document: { ... }, changeSets: [ ... ] }
. If, when the user connects, there is already a session state, the initSessionState
method doesn't get called and we are good to go. One of the options that gets passed in is staleDuration
which is how long the document has been sitting (i.e. stale) since it got loaded into memory last.
Here's a sample of an initSessionState
:
function initSessionState({params, staleDuration}) {
if (staleDuration > 5 * MINUTES) {
throw new NoSessionError("")
}
return {
document: params.document, // client passes the document through
changeSets: [], // no changesets
}
}
Another thing that could happen is that the client passes in a database id
async function initSessionState({articleId}) {
/* In this scenario, there is no need to test for staleDuration because we grab the data from from db */
const article = await db.table('articles').getById(articleId)
return {
document: article.body,
changeSets: [], // no changesets
}
}
But let's say the the developer wanted to support an offline mode. They can then save the change sets to their database and repopulate it into Redis.
async function initSessionState({articleId}) {
/* In this scenario, we grab the data fresh and we also populate with 24 hours of changesets */
const article = await db.table('articles').getById(articleId)
const changeSets = await db.table('changesets').query({articleId, insertedAt: AFTER_24_HOURS_AGO })
return {
document: article.body,
changeSets: [], // no changesets
}
}
When a session ends, we can return a retention time.
// forgot the actal name of method
async function allSessionsInDocumentEnd({params, documentId, document, changeSets}) {
await db.table('articles').set({document}).where({id: params.articleI})
// this query is not real, but you get the idea. The changeSets may have ids and
// some may already be in db. Just assume this does the right thing.
await db.table('changeSets').upsert(changeSets)
// keep the sesion alive for 5 minutes after the last user leaves
return {
sessionKeepAlive: 60 * 5 // how long to keep session data alive for in Redis
}
}
The general idea here is that the developers can choose how long to keep sessions around in Redis. When getting started, developers can just set a sessionKeepAlive for 5 minutes or so. In the onInitSessionState
the developer can just make sure that the current document isn't more than 5 minutes stale.
Later, the user can choose to save part of the changesets. This would allow for offline as long as the developers wants to keep it. It would also solve the problem of source of truth. It would still be in the developer's database but let's say that they did a programmatic update of a document. They would have a lot of freedom to do what they want. They could:
article.body
and delete the changeSets
for that article.article.body
and insert the corresponding changeSets
manually.It becomes a very flexible and powerful system that can grow with developer's needs. It's also interesting because the developer can prune the changeSets as they desire or keep them forever if they wanted to. And it makes sense not to fill Redis up with all the history from old editing sessions.
This change does not include the changes RE: CLIENT_SESSION_TIMEOUT
and SERVER_SESSION_TIMEOUT
as these are concerns that will be covered in #3.
The changes this includes are the following:
In some situations, if editors get out of sync with each other, Slate Collaborate does not recover gracefully resulting in lost edits.
The robust recovery will allow the editors to resync quickly. In the case of a sync failure, the collaborative editor will sync up to the last state in the collaboration server. Edit loss (if any) should be kept minimal like a few keystrokes.