eclipse-theia / theia

Eclipse Theia is a cloud & desktop IDE framework implemented in TypeScript.
http://theia-ide.org
Eclipse Public License 2.0
19.87k stars 2.48k forks source link

Optimize file event handling #1269

Open hexa00 opened 6 years ago

hexa00 commented 6 years ago

So that:

See also discussions in #950 and #1188

hexa00 commented 6 years ago

Current design

Currently the FileSystemWatcherServer provides this api:

   export interface FileSystemWatcherServer extends JsonRpcServer<FileSystemWatcherClient> {
    /**
     * Start file watching for the given param.
     * Resolve when watching is started.
     * Return a watcher id.
     */
    watchFileChanges(uri: string, options?: WatchOptions): Promise<number>;

    /**
     * Stop file watching for the given id.
     * Resolve when watching is stopped.
     */
    unwatchFileChanges(watcher: number): Promise<void>;
}

And has a client to receive notification with this api:

export interface FileSystemWatcherClient {
    /**
     * Notify when files under watched uris are changed.
     */
    onDidFilesChanged(event: DidFilesChangedParams): void;
}

Implementations details:

Number Of Processes:

And it is bound like so: bind(FileSystemWatcherServer).to(ReconnectingFileSystemWatcherServer);

This makes it so that anyone injecting a FileSystemWatcherServer creates a new Nsfw watcher process with its own set of watches and its own client.

Aggregation of events:

To avoid too many events to be generated events are aggregated over a 50ms period.

Watched files notifications:

These are shared across the same server, meaning that if a client on the same server does 2 different calls to watchFileChanges with a different set of files the client will be notified of all the changes. Forcing the client to check if events are relevant.

Ignore rules:

The watcher id is only used to apply ignore rules so that files are ignored by watcher. So if 2 watchers watch the same directory if one ignores .git and the other doesn't the watcher client will still get notified since the notifications are shared.

Problems with this approach:

New design proposal:

The Watcher interface:

 interface Watcher implements Disposable {
   /* Notify of file changes */
   onChanged: Event<FileChange>
   /* Pause event generation, events will be aggregated meanwhile.  */
   pause();
   /* Resume event generation.  */
   resume();
   /* Stop watches the files */
   dispose();
}

pause(): Pause will stop the watcher from firing events, and instead aggregate the events.

Pause is meant to be used as semaphore such that multiple pause calls can be nested.

resume(): Resumes the events if the semaphore reaches 0 count.

Example usage:

Simple case:

const watcher = new Watcher(uri);
watcher.onChanged(changes => {});
watcher.dispose();

Pausing case:

function saveFile() {
this.watcher.pause()
await fs.save();
this.watcher.resume();
}

Nested pausing case:

function saveMultiFiles() {
this.watcher.pause()
saveFile();
saveFile();
saveFile();
this.watcher.resume();
}

The WatcherImpl:

The implementation of a watcher controls calls to the FileSystemWatcherServer.

class WatcherImpl implements Watcher, Disposable() {

protected readonly id;
@inject(FileSystemWatcherServer) protected readonly server;

constructor(uri: string, options? WatchOptions) {
  this.id = this.server.watchFileChanges(uri,options);
}

dispose() {
  this.server.unwatchFileChanges(this.id);
}
}

The FileSystemWatcherServer:

This is to be refactored such that there is only 1 FileSystemWatcherServer.

And that the interface returns a watcher id to be used by a Watcher interface implementation.

So while the interfaces stays the same:

    watchFileChanges(uri: string, options?: WatchOptions): Promise<number>;
    unwatchFileChanges(watcher: number): Promise<void>;

The implementation would change such that events are sent by watcher so one way would be like:


protected fireDidFilesChanged(watcherId): void {
  if (this.watchers[id].pauseCount === 0) {
   this.watchers[id].onChanged(event);
 } else
 {
  this.watchers[id].events.push(...this.changes);
  }
}

protected pause(watcherId) {
  this.watchers[id].pauseCount++
}

protected resume(watcherId) {
  this.watchers[id].pauseCount--
  if (this.watchers[id].pauseCount === 0 && this.watchers[id].events.length > 0) {
   this.fireDidFileChanges(watcherId);
}   
}

Refactoring considerations

At the moment assumptions are made in the code that the FileSystemWatcher watches the workspace, these need to be refactored such that the WorkspaceService exposes a watcher and that clients register their callbacks on that.

Conclusion

I think with a design like this we will be able to have 1 nsfw process, manage large event generations while doing a file system operation efficiently and provide a clear API.

Comments are welcome.

akosyakov commented 6 years ago

Is there guarantee that calling save will cause fs events immediately? I don't think so. The actual case is that events come later after save already was completed and they won't be buffered between pause and resume.

marcdumais-work commented 6 years ago

@akosyakov Do you mean that even when node seems to be done with fs.writeFile() it's possible that the physical write has not occurred yet (e.g. because of filesystem buffering)?

akosyakov commented 6 years ago

Node is single threaded and there is a single task queue processing requests from different clients as well as notifications from nfsw watching process. So processing of nfsw notifications will be delayed by processing other requests.

You can try to make a change and then look at ws frames when setContent was finished in filesystem socket and when onDidChangeContent arrived for the same file in fs-watcher. For me results are following:

The first event arrives 100ms after save was finished.

Probably there could be other sources of delays depending on the implementation of nfsw and fs module. I will need to read up on it.

akosyakov commented 6 years ago

btw there are so much spam in fs-watcher socket because of .git/index.lock

hexa00 commented 6 years ago

Node is single threaded and there is a single task queue processing requests from different clients as well as notifications from nfsw watching process. So processing of nfsw notifications will be delayed by processing other requests.

In this case it's a bit different nice nsfw runs in an external process, thus there are 2 event loops.

Here's how I see it:

BackendEvent Loop

Nsfw Event Loop

We do incur a penalty since we need to do await pause() and await resume() however. But events should be aggregated properly.

We would need to dig a bit deeper and test to be sure however.

I think the rest of the design stands on it's own and thus by implementing it it should be easy to test the pause/resume feature. Or keep a deboucer... or maybe use both.