Open dlqqq opened 7 months ago
- Use other file metadata from
os.stat()
to detect OOB changes more accurately. Changes in file size should be the deciding factor in whether a file was changed, not changes in mtime.
I agree that file modification time is not optimal to check if the file was saved by Jupyter or externally. But as I mentioned here, I think the best way to improve the situation would be to use the new hash
field of the contents model.
2. Replace the redundant signaling of OOB changes with the
FileLoader.observe()
signaling API.
Let me explain what you call "redundant signaling of OOB". There is a polling at regular intervals to check for file changes (regardless of whether Jupyter is saving), and a check that happens when Jupyter saves a file. The former is needed because we want users to have their document updated if an OOB happens. But since it's done by polling, we could easily miss an OOB change when saving. That's why the latter is also needed.
When you say "the block here does not call self._on_outofband_change()
", if you look closely the code here is the same as here, so it's equivalent. But I agree that code should not be duplicated.
Happy to review PRs for both points.
@davidbrochart Yes, I agree. Thanks for clarifying! 👍
I won't work on a PR for this just yet. I'll be focusing on opening issues to track our work, as requested by Brian & Sylvain.
Introduction
An out-of-band (abbrev. OOB) change is an edit of a collaborative file not done through JupyterLab's UI. For example, suppose user A runs
git reset --hard <other-branch>
while another user B is editing a collaborative document within the same repository. A'sgit reset
deletes all edits made by B and sets the contents of that document to some fixed value. Then the YStore table containing B's edits needs to be dropped, and this event must be somehow communicated to all Yjs clients connected to the document. Here, we say that an out-of-band change occurred on B's document.Description
The current implementation of OOB change detection results in more data loss than necessary. Fixing this issue will likely reduce instances of user data loss such as the one reported in #219. I've written this issue in two parts below.
Context: how OOB changes are detected currently
OOB change detection is implemented via polling in
FileLoader
. Here are the relevant portions of its source:Here,
self.last_modified
is a value that is set byFileLoader
's' filesystem methods,FileLoader.load_content()
andFileLoader.maybe_save_content()
.FileLoader.maybe_notify()
is repeatedly called to check the latestlast_modified
time (mtime) of the file against its previously recorded mtime. If the latest mtime is greater, then the file was probably modified directly by the server, and an OOB change probably occurred. Each parent is then notified of this OOB change by their callback, previously appended by callingFileLoader.observe()
.In
jupyter_collaboration
,DocumentRoom
is the parent initializing theFileLoader
.DocumentRoom
subscribes toFileLoader
with a callback after it is initialized:Issue part 1: Using mtime alone is unreliable
Using mtime alone to detect OOB changes results in more false positives and false negatives than necessary.
When using mtime alone, false negatives (i.e. OOB change going undetected) occur because:
A file's mtime can have an imprecision of up to a few seconds (depending on the filesystem). Within that interval, the server can modify a file without updating its mtime, resulting in a false negative.
The POSIX standard allows for some system calls like
mmap()
to update the content of a file without updating its mtime.When using mtime alone, false positives (i.e. an OOB change event is emitted without an OOB change) occur because:
mtime
but leaves the content unaffected. If an OOB change is signaled here, then a false positive occurs, and collaboration is halted for no reason.vim
and immediately exits by running:wq
, that would cause a OOB change to be wrongly signaled.Both false positives and false negatives result in data loss of at least one client's intent. If we consider the previous example with A and B, a false positive results in A's
git reset
going ignored, and a false negative results in halting collaboration on B's notebook no reason. In both cases, either A or B's intended content is ignored.False positives are especially dangerous because of how easily they can be triggered and how they completely halt collaboration on a document. I suspect that there are specific common user scenarios that trigger these OOB false positives, which contributes towards a higher frequency of data loss issues for users.
Issue part 2: Redundant way of signaling OOB changes
FileLoader
has a redundant way of signaling OOB changes to its parent object, but only when a file is saved throughFileLoader.maybe_save_content()
:Here, it signals an OOB change having occurred by raising an
OutOfBandChanges
exception. The redundant signaling of an OOB change causesDocumentRoom
to similarly provide a redundant handling of an OOB change:Proposed new implementation
os.stat()
to detect OOB changes more accurately. Changes in file size should be the deciding factor in whether a file was changed, not changes in mtime.FileLoader.observe()
signaling API.