dotnet / interactive

.NET Interactive combines the power of .NET with many other languages to create notebooks, REPLs, and embedded coding experiences. Share code, explore data, write, and learn across your apps in ways you couldn't before.
MIT License

Auto Language Detection #2872

Open sharoncxu opened 1 year ago

sharoncxu commented 1 year ago

Is your feature request related to a problem? Please describe.

Each time a new cell is created, the language for that cell has to be explicitly stated. By default, it is set either to the global language for the notebook or to the language of the previous cell. However, every time users want to change languages, they have to go to the language picker to make the selection. Sometimes they forget to change the language and get an error when running the cell before realizing they need to update the language selection.

Describe the solution you'd like

In an ideal experience, polyglot notebooks would detect from syntax which language the user is coding in and automatically update the language dropdown accordingly. The user would always have the option to override, but autodetection would lessen the burden of switching languages.
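To make the "detect, but let the user override" behavior concrete, here is a minimal sketch of how a cell could remember a manual choice. The `CellLanguageState` type and both function names are hypothetical and not part of the Polyglot Notebooks extension; the sketch only assumes that a manual selection should stop autodetection from changing the dropdown again.

```typescript
// Hypothetical per-cell state; none of these names come from the extension.
interface CellLanguageState {
  kernelName: string;    // what the language dropdown currently shows
  pinnedByUser: boolean; // true once the user has picked from the dropdown
}

// The user picked an entry explicitly: record it and stop auto-updating.
function selectManually(kernelName: string): CellLanguageState {
  return { kernelName, pinnedByUser: true };
}

// Autodetection proposes a kernel: apply it only if the user has not
// overridden the selection and the detector actually produced a result.
function applyDetection(
  state: CellLanguageState,
  detected: string | undefined
): CellLanguageState {
  if (state.pinnedByUser || detected === undefined) {
    return state;
  }
  return { ...state, kernelName: detected };
}
```

With this shape, a wrong guess costs the user one manual selection, after which the cell is left alone.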

claudiaregio commented 1 year ago

@TylerLeonhardt you did some work similar to this at the file editor level, right? Could it be reapplied at the cell level in notebooks?

TylerLeonhardt commented 1 year ago

Do you see this Auto-Detect option? It should work for you

Image

claudiaregio commented 1 year ago

@TylerLeonhardt yes, that shows up there. However, the polyglot notebooks extension has two controls: the one you selected, which differentiates between markdown and code, and a second one that actually differentiates between language/environment/data connection.

In the screenshot below, you'll see the "code" option and, to its left, the option where the language/environment/data connection is selected. So while autodetection shows up in the list, it doesn't quite work yet for polyglot notebooks.

image

@sharoncxu one thing to keep in mind is that the "language picker" we created isn't always a language. It can be an alias for a SQL connection, an alias for a Kusto connection, or in the future a Python kernel, R kernel, Python env, etc. If a user has multiple SQL connections and they start typing SQL code in a new cell, which one would you expect it to automatically connect to? Should anything be added to provide more clarity here?

TylerLeonhardt commented 1 year ago

I see. There isn't an API to auto-detect the language, but the implementation does exist here: https://github.com/microsoft/vscode-languagedetection

It supports most of the languages you've listed except for kql & mermaid... and since the model is local, the package is a decent size.

All-in-all, I'm not sure it's going to do well for your scenario... esp since you mentioned that it won't always be just a language.

jonsequitur commented 1 year ago

> All-in-all, I'm not sure it's going to do well for your scenario... esp since you mentioned that it won't always be just a language.

In a significant majority of cases we'll be able to infer a kernel name from the detected language, since the notebook file metadata provides a mapping between the two, and having multiple subkernels for the same language will be less common.

Does the proposal raise other concerns or questions that we should be aware of?
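As a rough illustration of inferring a kernel name from a detected language, the sketch below resolves a language against a list of kernels and reports the multiple-connection case explicitly instead of guessing. The `KernelInfo` shape and the kernel names are assumptions made for the example; the real notebook metadata format may differ.

```typescript
// Hypothetical shape for what the notebook metadata might provide:
// one entry per (sub)kernel, with the language it accepts.
type KernelInfo = { name: string; languageName: string };

type Resolution =
  | { kind: "resolved"; kernelName: string }
  | { kind: "ambiguous"; candidates: string[] }
  | { kind: "unknown" };

// Map a detected language to a kernel name. Exactly one match resolves;
// several matches (e.g. two SQL connections) are returned as candidates
// so the caller can ask the user rather than pick one silently.
function resolveKernel(detectedLanguage: string, kernels: KernelInfo[]): Resolution {
  const wanted = detectedLanguage.toLowerCase();
  const matches = kernels
    .filter((k) => k.languageName.toLowerCase() === wanted)
    .map((k) => k.name);

  const first = matches[0];
  if (matches.length === 1 && first !== undefined) {
    return { kind: "resolved", kernelName: first };
  }
  if (matches.length > 1) {
    return { kind: "ambiguous", candidates: matches };
  }
  return { kind: "unknown" };
}

// Example data; the aliases are made up for illustration.
const exampleKernels: KernelInfo[] = [
  { name: "csharp", languageName: "C#" },
  { name: "sql-sales", languageName: "SQL" },
  { name: "sql-inventory", languageName: "SQL" },
];
```

In the ambiguous case, one option is to leave the dropdown unchanged and surface the candidates, which keeps the common single-kernel case automatic without connecting to the wrong database.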

TylerLeonhardt commented 1 year ago

The model that we used for that runs locally... which is great, but, it does take a pretty good amount of code to actually make an accurate guess.

With that said, you can run that project as a CLI so maybe put a few code cells of yours through that to see if the results are good enough.
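Because the model needs a fair amount of code before its guess is reliable, one way to evaluate it on notebook cells is to gate the result on cell length and confidence. The sketch below is an assumption-laden illustration: the `Guess` shape (a language id plus a confidence score), the `Detector` function type, and both thresholds are hypothetical and should be checked against the vscode-languagedetection docs before use. The detector is passed in as a parameter so the gating logic can be tried with a stub.

```typescript
// Assumed result shape: a language id plus a confidence score in [0, 1].
interface Guess {
  languageId: string;
  confidence: number;
}

// Hypothetical detector signature; a real one would wrap the model.
type Detector = (code: string) => Promise<Guess[]>;

// Return a language id only when the cell is long enough and the best
// guess is confident enough; otherwise return undefined so the caller
// keeps the current selection. Both thresholds are arbitrary defaults.
async function detectIfConfident(
  code: string,
  detect: Detector,
  minChars: number = 20,
  minConfidence: number = 0.5
): Promise<string | undefined> {
  if (code.trim().length < minChars) {
    return undefined; // too little text for a meaningful guess
  }
  const guesses = await detect(code);
  let best: Guess | undefined;
  for (const g of guesses) {
    if (best === undefined || g.confidence > best.confidence) {
      best = g;
    }
  }
  if (best !== undefined && best.confidence >= minConfidence) {
    return best.languageId;
  }
  return undefined;
}
```

Running a handful of real cells through a function like this, with different thresholds, would show how often short cells fall below the cutoff.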

jonsequitur commented 1 year ago

The CLI approach sounds a bit expensive. Is there an API to run it in-process?

TylerLeonhardt commented 1 year ago

Yeah, I suggested the CLI as a means of testing locally. The docs in the repo tell you how to use it.