microsoft / vscode

Visual Studio Code
https://code.visualstudio.com
MIT License
163.05k stars 28.8k forks source link

Explore cell level commenting in notebooks #144850

Open rebornix opened 2 years ago

rebornix commented 2 years ago

We will explore how to bring the commenting experience from the monaco editor to the notebook land.


rebornix commented 2 years ago

Current prototype:

https://user-images.githubusercontent.com/876920/159367172-9c1fd5fe-cf30-4e33-8056-9942c92e7210.mov

rehmsen commented 4 months ago

Hi,

thanks for building basic comment support into the notebook editor with #145090.

I found it a bit tricky to work with that it expect comments to be attached to the cell URI, not the file URI. The issue with that is that the cell URI is dynamically created during deserializing (from the file URI changing the scheme and adding a cell fragment which is basically an encoded increasing number), but VSCode also has a comments panel, where the comment should ideally be shown even before you open the file. At that point the cell URI is unknown.

I was wondering how you deal with that, and tried to get Github Pull Request comments to show up in the NotebookEditor for ipynb files, which does not seem to work: They are only shown when I reopen the ipynb file in a TextEditor.

I have a workaround, which starts creating the comments with the file URI, and the line number in the raw JSON ipynb file, and then upon opening the notebook editor, recreates the CommentThread with the cell URI (and cell-relative line number) as it becomes known. I need to recreate because the URI of a CommentThread is actually readonly - this might cause all kinds of issues with keeping view state, like whether comments are collapsed and any non-saved replies.

Instead, I wonder if we could not leave CommentThreads on the file URI, and when we fetch the comments for a notebook in getNotebookComments, derive the appropriate cell id and cell-line-number from the raw file URI and line number. This would require some sort of source map between the raw file numbers and the structure of the notebook, and as VSCode hands off deserialization to extensible NotebookSerializers, that API would also need to allow providing a source map.

As an example, NotebookData could get a new optional field

interface NotebookData {
  /* Maps a line number in a raw notebook file to a (cell, cell line number) pair in the deserialized notebook model. */
  lineMap?: Map<number, {cellId: string, cellLine: number}>;
}

If a NotebookSerializer fills that, then it is used to map any file URI comments for notebook files to the corresponding cell locations in the notebook editor. The inverse transformation may be needed when creating a new comment on a cell.

One issue is that I believe right now cells do have a stable ID (beyond this increasing integer which is ephemeral) - maybe that could be added for this purpose (or we would have to make the assumption that we can reference the deserialized cells by their index in the serialized source). Another issue might be comments on cells that were not in the serialized source. We can probably work through these issues.

What do you think about this proposal? Is there some simpler approach that I am overlooking? I would be happy to contribute the changes, but would like to align on the design direction first.

Cheers Ole