microsoft / vscode

Visual Studio Code
https://code.visualstudio.com
MIT License

[Request] [Notebooks] Allow arbitrary values in a cell's language selector. #142133

Open brettfo opened 2 years ago

brettfo commented 2 years ago

.NET Interactive notebooks allow different cell languages via the language selector in the bottom right of the cell. Currently the only allowable values are languages that have been pre-registered with VS Code via the "contributes"."languages" field of the manifest.

A common scenario we're seeing is the user creating multiple connections to different SQL databases, e.g., AdventureWorks and Northwind. Currently we handle this awkwardly by pre-registering a language dotnet-interactive.sql with an alias of "SQL (.NET Interactive)", but that doesn't actually let the user execute a query against a specific database connection. They have to put a magic command at the top of the cell that we use to route the request.

For example, one cell might have the language set to "SQL (.NET Interactive)", but the contents of the cell look like this:

#!sql-AdventureWorks
SELECT * FROM ...

and the next cell also has its language set to "SQL (.NET Interactive)", but its contents are:

#!sql-Northwind
SELECT * FROM ...

It would be much more instructive to the user if they could simply select from the dropdown "C# (.NET Interactive)", "F# (.NET Interactive)", "AdventureWorks", "Northwind", etc., where the last two items are added dynamically at runtime.
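For readers unfamiliar with the magic-command workaround described above, here is a minimal sketch of how a first-line `#!name` directive could be split off and used for routing. The helper name `routeByMagicCommand` and the `RoutedCell` shape are hypothetical; the thread does not show how .NET Interactive actually parses these lines.

```typescript
// Hypothetical helper: splits a cell's text into a routing target and the code to run.
interface RoutedCell {
    kernelName: string | undefined; // e.g. "sql-AdventureWorks"; undefined when no magic command is present
    code: string;                   // cell text with the magic-command line removed
}

function routeByMagicCommand(cellText: string): RoutedCell {
    const lines = cellText.split(/\r?\n/);
    // Assumption: the magic command is the entire first line, in the form "#!<name>".
    const match = /^#!(\S+)\s*$/.exec(lines[0] ?? '');
    if (!match) {
        return { kernelName: undefined, code: cellText };
    }
    return { kernelName: match[1], code: lines.slice(1).join('\n') };
}

const routed = routeByMagicCommand('#!sql-AdventureWorks\nSELECT * FROM ...');
console.log(routed.kernelName); // sql-AdventureWorks
console.log(routed.code);       // SELECT * FROM ...
```

The cost of this approach is visible in the sketch: the routing information lives in the cell text rather than in the cell's metadata, so the user has to type it correctly in every cell.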

brettfo commented 2 years ago

After discussing this idea with my team, we've narrowed the ask a bit.

A notebook controller has the supportedLanguages property of type string[] | undefined. Those string values are then matched against languages registered with VS Code and at runtime the dropdown displays the "friendly" names. E.g., setting that property to ['dotnet-interactive.csharp', 'dotnet-interactive.fsharp'] will result in the values "C# (.NET Interactive)" and "F# (.NET Interactive)" being visible in the dropdown.

The Ask

We propose that the supportedLanguages property type be expanded to string[] | { languageId: string, displayName?: string }[] | undefined. If the displayName? property is present, it will be used at runtime; if it is absent, the registered language's "friendly" name will be used as the default.
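To make the fallback rule concrete, here is a rough sketch of how the dropdown labels might be resolved from the widened type. The `registeredLanguageNames` map is a hypothetical stand-in for VS Code's language registry, and falling back to the raw id for an unregistered language is an assumption of this sketch, not something the proposal specifies.

```typescript
// Proposed entry shape from this issue; not part of the current VS Code API.
interface LanguageEntry {
    languageId: string;
    displayName?: string;
}
type SupportedLanguages = string[] | LanguageEntry[] | undefined;

// Hypothetical stand-in for the registry of language ids to "friendly" names.
const registeredLanguageNames = new Map<string, string>([
    ['dotnet-interactive.csharp', 'C# (.NET Interactive)'],
    ['dotnet-interactive.fsharp', 'F# (.NET Interactive)'],
    ['dotnet-interactive.sql', 'SQL (.NET Interactive)'],
]);

function dropdownLabels(supported: SupportedLanguages): string[] {
    const entries: Array<string | LanguageEntry> = supported ?? [];
    return entries.map((entry) => {
        const languageId = typeof entry === 'string' ? entry : entry.languageId;
        const displayName = typeof entry === 'string' ? undefined : entry.displayName;
        // Prefer the explicit display name, then the registered friendly name,
        // then (assumption) the raw language id.
        return displayName ?? registeredLanguageNames.get(languageId) ?? languageId;
    });
}

const labels = dropdownLabels([
    { languageId: 'dotnet-interactive.csharp' },
    { languageId: 'dotnet-interactive.sql', displayName: 'AdventureWorks' },
]);
console.log(labels.join(' | ')); // C# (.NET Interactive) | AdventureWorks
```

Plain string arrays keep working unchanged under this rule, so existing controllers would not need to change.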

The Result

In a .NET Interactive Notebook with multiple SQL connections, the user dropdown will display the name of the database connection.

Example

notebookController.supportedLanguages = [
    { languageId: 'dotnet-interactive.csharp' },
    { languageId: 'dotnet-interactive.fsharp' },
    // ...all the other values normally present
    { languageId: 'dotnet-interactive.sql', displayName: 'AdventureWorks (SQL Server 2018 - localhost)' },
    { languageId: 'dotnet-interactive.sql', displayName: 'Northwind (SQL Server 2008 - Azure)' },
];

N.b., the last 2 entries both list dotnet-interactive.sql as the language id, and that is so the proper TextMate grammar can be selected, but the displayName? property is different so the user can differentiate between them.

image

At this point the .NET Interactive extension can get the cell's selected language identifier, e.g., something like this:

// this API should come from VS Code (proposed)
declare function getCellLanguageValue(cell: NotebookCell): string | { languageId: string, displayName?: string };

// this code is in the .NET Interactive extension
const cellLanguage = getCellLanguageValue(theCell);
if (typeof cellLanguage === 'string') {
    // we do our own mapping from `dotnet-interactive.csharp` to `C#`, etc.
} else if (typeof cellLanguage.languageId === 'string' && typeof cellLanguage.displayName === 'string') {
    // now we map against the cell's reported language as well as its display name
} else {
    // all other scenarios, e.g. an object with a languageId but no displayName
}
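
The dispatch step outlined above could be fleshed out roughly as follows. This is only a sketch: the two lookup tables and the kernel names in them (`csharp`, `sql-AdventureWorks`, and so on) are hypothetical values an extension might maintain, and `resolveKernelName` is not an existing API.

```typescript
// Value shape the proposed API would report for a cell.
type CellLanguageValue = string | { languageId: string; displayName?: string };

// Hypothetical table owned by the extension: display name -> internal connection/kernel name.
const connectionsByDisplayName = new Map<string, string>([
    ['AdventureWorks (SQL Server 2018 - localhost)', 'sql-AdventureWorks'],
    ['Northwind (SQL Server 2008 - Azure)', 'sql-Northwind'],
]);

// Hypothetical default kernel for each registered language id.
const defaultKernelByLanguageId = new Map<string, string>([
    ['dotnet-interactive.csharp', 'csharp'],
    ['dotnet-interactive.fsharp', 'fsharp'],
    ['dotnet-interactive.sql', 'sql'],
]);

function resolveKernelName(value: CellLanguageValue): string | undefined {
    if (typeof value === 'string') {
        // Plain language id: use the extension's own id-to-kernel mapping.
        return defaultKernelByLanguageId.get(value);
    }
    if (value.displayName !== undefined) {
        // Display name present: it identifies a specific connection.
        const byName = connectionsByDisplayName.get(value.displayName);
        if (byName !== undefined) {
            return byName;
        }
    }
    // Object without a known display name: fall back to the language's default kernel.
    return defaultKernelByLanguageId.get(value.languageId);
}

console.log(resolveKernelName({
    languageId: 'dotnet-interactive.sql',
    displayName: 'Northwind (SQL Server 2008 - Azure)',
})); // sql-Northwind
```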

jrieken commented 2 years ago

> Currently we handle this awkwardly by pre-registering a language dotnet-interactive.sql with an alias of "SQL (.NET Interactive)", but that doesn't actually let the user execute a query against a specific database connection.

Can you explain what is telling these languages apart? I am asking from a language definition POV: do they have a different grammar, different bracket rules, or different comment tokens? From looking at your mockup/sample it seems more like some kind of backup/server connection info, and that would make me wonder if language is the right thing to extend.

brettfo commented 2 years ago

> Can you explain what is telling these languages apart?

From a language perspective, the thing we call "SQL - AdventureWorks" is the same language as "SQL - Northwind", meaning the same grammar, comment tokens, etc., but we need to know which internal connection object to execute the query against, and the dropdown in the bottom corner of the cell seems like the perfect place to let the user make that selection.

In a similar situation, we currently have one C# language/kernel, but we'd like to give the user the option to start an experimental C# 11 kernel running against .NET 7. From the language perspective, it's still C#, but our notebook controller needs to be able to know where to dispatch that request, so giving the user both "C# (.NET Interactive)" (current) and "C# Preview - .NET 7 (.NET Interactive)" is how they would make that distinction.

jonsequitur commented 2 years ago

Another couple of examples:

roblourens commented 2 years ago

I think we will be discussing this tomorrow morning, but I'll give you my initial reaction: it seems like you are mixing two concepts, language and where the code will execute. Any reason you can't leave the language picker alone but use a NotebookCellStatusBarItemProvider to provide another item right next to it that shows the target options available based on the cell's language? It seems like these two concepts are independent and we shouldn't be making fake languages for the matrix of combinations.
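As a rough illustration of that suggestion, the sketch below computes a second status bar entry from the cell's language alone. `StatusBarItemLike` is a local stand-in that only loosely mirrors `vscode.NotebookCellStatusBarItem`, so the code runs outside an extension host; the `targetsByLanguageId` table and `targetItemsFor` are hypothetical. In a real extension, something like this would presumably be returned from a provider's `provideCellStatusBarItems` method, using `cell.document.languageId` as the input.

```typescript
// Local stand-in for the text/alignment/tooltip parts of a notebook cell status bar item.
interface StatusBarItemLike {
    text: string;
    alignment: 'left' | 'right';
    tooltip?: string;
}

// Hypothetical execution targets the extension knows about, keyed by language id.
const targetsByLanguageId = new Map<string, string[]>([
    ['dotnet-interactive.sql', ['AdventureWorks', 'Northwind']],
    ['dotnet-interactive.csharp', ['.NET Interactive', 'Preview - .NET 7']],
]);

function targetItemsFor(languageId: string, selectedTarget?: string): StatusBarItemLike[] {
    const targets = targetsByLanguageId.get(languageId) ?? [];
    if (targets.length === 0) {
        // No targets for this language: contribute nothing and leave the language picker alone.
        return [];
    }
    // Show the current selection if it is valid for this language, else the first target.
    const current =
        selectedTarget !== undefined && targets.includes(selectedTarget) ? selectedTarget : targets[0];
    return [{
        text: `Target: ${current}`,
        alignment: 'right',
        tooltip: `Available: ${targets.join(', ')}`,
    }];
}

console.log(targetItemsFor('dotnet-interactive.sql', 'Northwind')[0].text); // Target: Northwind
```

The language stays a plain language id throughout; the execution target is tracked as a separate piece of state.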

jonsequitur commented 2 years ago

The concepts are separate but we haven't encountered a case where making them two separate user gestures adds value. This is because in every example we've seen (several of which are described above), the language is inseparable from where the code will execute.

Let's take SQL as an example. If I choose SQL first, I still need to choose a database to connect to before I can see schema-specific completions or run a query. Since the database speaks a specific language (maybe PostgreSQL, maybe T-SQL), choosing the language is implicit in choosing the connection. Choosing SQL in isolation is not useful. It's just an extra, redundant click for the user.

Looked at differently, what value does choosing the language in isolation provide? If I choose C#, I can't send it to my JavaScript, Python, SQL, Mermaid, or HTML kernels and get any result other than a syntax error.

roblourens commented 2 years ago

Language is the most fundamental characteristic of a code cell, and in my view, it needs to stand alone in a VS Code notebook. A cell with Python code is simply a piece of Python code, regardless of how it will be executed. A .js file has to be a .js file, even though we could give a better experience by forcing the user to tell us whether it will be run in Node or a browser. Sure, you have to connect to a database to execute your SQL query, but I should be allowed to type SQL code without connecting to a database first. I should be able to create cells even when the wifi is broken 😁

But if you want a faster experience when creating cells, it sounds like you are trying to create a cell with a sort of template. What if you could have a third button here that brings up a custom picker with the matrix of language * runtime?

image

You can even add a button to the cell statusbar to bring up such a picker, and populate the language and runtime fields.
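
One way to picture such a picker: build one entry per valid language/runtime pair, and have the chosen entry populate both fields of the new cell. The sketch below is hypothetical throughout (`CellTemplate`, `buildCellTemplates`, and the sample runtimes are illustrative names), and it lists only the pairs the extension declares valid rather than a full cross product.

```typescript
// One picker entry: selecting it would set both the cell's language and its runtime.
interface CellTemplate {
    label: string;      // text shown in the picker
    languageId: string; // language to set on the cell
    runtime: string;    // where the cell would execute
}

// Hypothetical input: the runtimes available for each language id.
function buildCellTemplates(runtimesByLanguage: Map<string, string[]>): CellTemplate[] {
    const templates: CellTemplate[] = [];
    runtimesByLanguage.forEach((runtimes, languageId) => {
        for (const runtime of runtimes) {
            templates.push({ label: `${languageId} on ${runtime}`, languageId, runtime });
        }
    });
    return templates;
}

const templates = buildCellTemplates(new Map([
    ['dotnet-interactive.sql', ['AdventureWorks', 'Northwind']],
    ['dotnet-interactive.csharp', ['.NET Interactive']],
]));
console.log(templates.length); // 3
```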

I heard we were going to meet in person, but I don't have anything on my calendar.