microsoft / PTVS

Python Tools for Visual Studio
https://aka.ms/PTVS
Apache License 2.0

Variables Explorer #81

Closed zooba closed 3 years ago

zooba commented 9 years ago

Provide something like the "Variable explorer" in the Spyder IDE. It is essentially just a list, very similar to, say, the "Locals" debug window in VS, showing all the variables defined in the interactive window. In Matlab the equivalent window is called "Workspace". Having this would make the whole IDE very, very competitive for numerical work.

A simple first version could just have the same features as the "Locals" window. A more advanced version could have nicer features for e.g. numpy arrays, like table editors, plotting etc.
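For illustration only (not from the thread): a minimal Python sketch of what a "Locals"-like first version could collect from an interactive session. It assumes the session's variables live in a plain dict such as `globals()`; the function name `list_variables` and the 60-character cut-off are made up for this example.

```python
def list_variables(namespace, max_len=60):
    """Return (name, type name, truncated repr) rows for user-defined variables.

    `namespace` is assumed to be the dict backing the interactive session,
    e.g. globals() in a REPL. Names starting with "_" are skipped.
    """
    rows = []
    for name in sorted(namespace):
        if name.startswith("_"):
            continue
        value = namespace[name]
        text = repr(value)
        if len(text) > max_len:
            # Clamp long values to one short line, as a Locals window would.
            text = text[: max_len - 3] + "..."
        rows.append((name, type(value).__name__, text))
    return rows


# Example: a fake session namespace (hypothetical data).
session = {"x": 42, "names": ["a", "b"], "__builtins__": {}, "big": list(range(1000))}
for row in list_variables(session):
    print(row)
```

A richer version for numpy arrays would replace the `repr` column with shape and dtype, but that depends on the visualizer work discussed later in the thread.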

(Migrated from https://pytools.codeplex.com/workitem/1900)

den-run-ai commented 9 years ago

Is this for debugger or REPL?

zooba commented 9 years ago

Good question. Probably REPL initially, and then it's pretty easy to hook it up to the debug REPL which will cover debugging (could actually be totally trivial to hook it up if it's done right).

houz42 commented 9 years ago

Then PTVS would be the best choice of Python environment for scientific computing.

So... When could we see this?...

zooba commented 9 years ago

Some implementation notes I came up with while thinking about this (note: not real code):

Add Variable Window for "Watch"-like functionality when not debugging. Add Data Visualizers for pop-out value watch (modeless debug visualizers)

using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Windows; // UIElement, Size (WPF)

// a MEF contract
interface IVariableSessionProvider {
    IEnumerable<IVariableSession> GetSessions();
    event EventHandler OnNewSession;
}

interface IVariableSession {
    string SessionName { get; }
    Task<IVariableCollection> GetVariablesAsync();
    Task<IVariable> GetExpressionAsync(string expression);
    event EventHandler OnVariablesChanged; // (event args should include specific change details)
    event EventHandler OnEndSession;
}

interface IVariable {
    string DisplayType { get; }
    string Expression { get; set; } // only settable if obtained from GetExpressionAsync()
    bool IsReadOnly { get; }
    Task SetValueAsync(string valueExpression);
    Task<IVariableCollection> GetChildrenAsync();
    Task<string> ToPlainTextAsync(int maxLength = -1);
    event EventHandler OnValueChanged;
    event EventHandler OnVariableDestroyed;
}

interface IVariableCollection {
    Task<IVariable> GetAsync(int index);
    Task<ICollection<IVariable>> GetManyAsync(int firstIndex, int count);
    int Count { get; }
}

// a MEF contract
interface IVariableViewProvider {
    string DisplayName { get; }
    int Priority { get; }
    Task<bool> IsApplicableAsync(IVariable variable);
    Task<UIElement> GetUIElementAsync(IVariable variable, Size bounds);
}

A session provider may be something like:

A session may be something like:

IVariable instances are displayed in the variable window as small, non-interactive views, initially clamped to one line of text high, with an "expand" button for extra height.

The variable collection is expected to be virtualized, and will only be enumerated far enough to satisfy the list. However, its Count property needs to be valid. (Same applies for GetChildrenAsync.)
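For illustration only: a rough Python sketch of the virtualization idea described above, where the count is known up front but items are only produced for the window the list actually shows. The class `RangeBackedCollection` and its `make_item` callback are hypothetical; they loosely mirror `Count`, `GetAsync` and `GetManyAsync` from the notes and are not an existing API.

```python
import asyncio


class RangeBackedCollection:
    """Hypothetical virtualized collection: the count is known up front,
    items are only materialized when a window is requested."""

    def __init__(self, count, make_item):
        self.count = count            # must be valid without enumerating
        self._make_item = make_item   # callback producing item i on demand
        self.fetched = []             # indices actually materialized (demo only)

    async def get_async(self, index):
        if not 0 <= index < self.count:
            raise IndexError(index)
        self.fetched.append(index)
        return self._make_item(index)

    async def get_many_async(self, first_index, count):
        # Clamp the requested window to the valid index range.
        first = max(first_index, 0)
        last = min(first_index + count, self.count)
        return [await self.get_async(i) for i in range(first, last)]


async def demo():
    coll = RangeBackedCollection(1_000_000, lambda i: f"item{i}")
    visible = await coll.get_many_async(10, 5)  # only the rows the list shows
    return coll, visible


coll, visible = asyncio.run(demo())
print(coll.count, visible, len(coll.fetched))
```

The point of the sketch is that a million-element collection costs five item fetches when five rows are visible, while the scrollbar can still be sized from `count`.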

The Variable Window allows values to be popped out into Data Visualizers - probably always via GetExpression. IVariable instances are displayed in Data Visualizers as resizable, interactive views, and expressions can be edited. When the variable is no longer valid (because the session is closed, or whatever conditions the session determines), OnVariableDestroyed is raised and the visualizer is invalidated.

IVariable.SetValueAsync is used when the user updates the entire expression, for example, through the Variables Window. Changes made via an interactive aspect in a Data Visualizer should be handled by the visualizer implementation and not through this method.

When OnValueChanged is raised, it implies that view providers need to be re-evaluated and GetUIElementAsync called again. Interactions handled by the UIElement do not need to re-raise the event.
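For illustration only: a small Python sketch of how view-provider selection might be re-run when a value changes. It assumes that a larger `priority` wins, which the notes do not state, and every name here (`ViewProvider`, `pick_provider`, `Variable.set_value`) is hypothetical.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable


@dataclass
class ViewProvider:
    display_name: str
    priority: int  # assumption: higher number is preferred
    is_applicable: Callable[[Any], Awaitable[bool]]


async def pick_provider(providers, value):
    """Return the highest-priority provider applicable to `value`, or None."""
    for provider in sorted(providers, key=lambda p: p.priority, reverse=True):
        if await provider.is_applicable(value):
            return provider
    return None


async def is_sequence(value):
    return isinstance(value, (list, tuple))


async def always(value):
    return True


PROVIDERS = [
    ViewProvider("Plain text", 0, always),
    ViewProvider("Grid", 10, is_sequence),
]


class Variable:
    def __init__(self, value):
        self.value = value
        self.view = None

    async def set_value(self, value):
        self.value = value
        # Stand-in for raising OnValueChanged: re-evaluate the providers.
        self.view = await pick_provider(PROVIDERS, value)


async def demo():
    var = Variable(None)
    await var.set_value([1, 2, 3])
    first = var.view.display_name
    await var.set_value(3.14)
    return first, var.view.display_name


print(asyncio.run(demo()))
```

Re-running the selection on every change matters because a new value may no longer suit the current view: here a list gets the grid, and a float falls back to plain text.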

zooba commented 9 years ago

We may want to include a shared interface like this, so that views have some common infrastructure for types they don't already know about:

interface IVariableData : IVariable {
    string ContentType { get; }
    Encoding ContentEncoding { get; }
    Task<byte[]> GetDataAsync();
}

int19h commented 9 years ago

For GetDataAsync, I would prefer that it returned a Stream instead. Reasons:

int19h commented 9 years ago

Here's another interesting aspect. For plots, we've discussed the need to have a switch that would make the plot visualizer "detached" from the mutable state, such that it no longer reflects any changes - I'm going to refer to this as "snapshotting" from here on. Now, since we've also decided to use the same universal visualizer toolwindow for plots and grids (and other stuff like images), it makes sense that the snapshot functionality is a top-level feature in that visualizer, probably a toolbar button or some such.

Now the question becomes: how do we implement snapshotting for visualizers that lazily load data? In particular, grid is in that position. If the user decides to snapshot it, it somehow needs to acquire the snapshot of the data backing it; otherwise, when the original context goes away, scrolling stops working.

This can be done on a case-by-case basis with different types of views. E.g. if we have a view interface for 2D data, we could also have an implementation of it that takes another instance and snapshots it. But it feels like this would have to be manually repeated whenever we need it, even though the functionality really is common.

Alternatively, we can change the interface for variables to allow creation of snapshots. Basically, a snapshot of a variable is another variable that guarantees that the data backing it will not suddenly go away (the precise definition of "suddenly" here is something that's dialable - e.g. if we say that it can go away at the end of debug session or at REPL reset, the implementation can just make a copy inside of debuggee, instead of transmitting all that data to debugger over the wire). The tricky part here is specifying the semantics of snapshotting with respect to other objects, i.e. GetChildrenAsync. This would either be disabled, which would complicate visualizer code since it can't treat a snapshot the same way as a live variable; or else GetSnapshot would have to allow the caller to specify whether and which children are included.
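For illustration only: a Python sketch of the second option above, where a snapshot is just another variable with the same surface, and the caller chooses whether children are included. `LiveVariable`, `SnapshotVariable` and `get_snapshot` are hypothetical names, and a deep copy stands in for whatever copying strategy the session would really use.

```python
import copy


class LiveVariable:
    """Hypothetical live variable: reads through to mutable session state."""

    def __init__(self, namespace, name):
        self._namespace = namespace
        self._name = name

    def get_value(self):
        return self._namespace[self._name]

    def get_children(self):
        value = self.get_value()
        if isinstance(value, dict):
            return {key: LiveVariable(value, key) for key in value}
        return {}

    def get_snapshot(self, include_children=True):
        return SnapshotVariable(self, include_children)


class SnapshotVariable:
    """Frozen copy with the same methods, so a visualizer can treat it alike."""

    def __init__(self, source, include_children):
        self._value = copy.deepcopy(source.get_value())
        self._children = (
            {key: child.get_snapshot(True)
             for key, child in source.get_children().items()}
            if include_children else {}
        )

    def get_value(self):
        return self._value

    def get_children(self):
        return self._children

    def get_snapshot(self, include_children=True):
        return self  # already detached from the session


session = {"data": {"a": [1, 2], "b": [3]}}
live = LiveVariable(session, "data")
snap = live.get_snapshot()
session["data"]["a"].append(99)  # mutate after snapshotting
del session["data"]["b"]         # part of the original context goes away
print(live.get_value(), snap.get_value())
```

Copying eagerly is the simple end of the trade-off; it keeps scrolling working after the context is gone, at the cost of holding all the data.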

zooba commented 9 years ago

Agreed on GetDataAsync.

For snapshotting, I think that's going on behind these interfaces through some agreement between the visualizer and the variable session. Since that's where the lazy data acquisition is implemented, it really needs to be the place to make that eager and retained in memory. I don't think it's actually common to all visualizers (if the debuggee terminates, how do we use matplotlib to regenerate the plot from raw data? We can't - it needs to generate a full plot before terminating, whereas the table needs to acquire all the data before terminating).

What we may want to provide here though is some way to flow that setting into custom visualizers. Maybe this could also be an attached property that is available to the UIElement after it's put into the parent control? We'll probably also need to provide commands (export/reset/etc.) that bubble up via the custom UI so that they can be overridden by custom UI, so maybe this is a "get all state and detach" command. That way we can also prevent the VariableDestroyed event from disabling the UI (but I'm thinking we may just disable expression modification on that and let the UI disable itself if it can't remain interactive without the actual variable existing).

alexranaldi commented 9 years ago

I was just going to file an issue for this, but I found this one. Having the "Autos" window and "Call stack" window work outside of debugging (for example, with a given interactive console) is critical IMHO.

int19h commented 9 years ago

One thing of note. The equivalent in RStudio has grouping according to the (broadly defined) type of variable, separating them into values and functions:

[Image: screenshot of the RStudio view, with variables grouped into values and functions]

This might be handy to have for Python as well, especially in global/REPL scope, where it's common to have a lot of variables. Generally speaking, function-typed variables are not as interesting, nor are types and imported modules, and segregating them like that (and putting "Values" on top) would emphasize that. Right now a typical view of Locals in the global scope of a script after it has imported a bunch of modules and defined some functions is not exactly pretty; and REPL after $attach is even less so.

Speaking more generically, we can define the notion of category for variables (e.g. values, types, functions and modules for Python), and add UI to Variable Explorer to enable grouping by it. Other languages can then define it in ways more appropriate to them.
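For illustration only: a Python sketch of grouping a namespace into the four categories suggested above, with "Values" first. The helper names and the category order list are made up for this example; the classification uses the standard `inspect` module.

```python
import inspect
import math

CATEGORY_ORDER = ["Values", "Functions", "Types", "Modules"]


def categorize(value):
    """Map an object to one of the categories proposed for Python."""
    if inspect.ismodule(value):
        return "Modules"
    if inspect.isclass(value):
        return "Types"
    if inspect.isroutine(value):
        return "Functions"
    return "Values"


def group_by_category(namespace):
    """Group public names by category, keeping 'Values' on top."""
    groups = {category: [] for category in CATEGORY_ORDER}
    for name in sorted(namespace):
        if not name.startswith("_"):
            groups[categorize(namespace[name])].append(name)
    return groups


def helper():
    pass


class Point:
    pass


# Hypothetical global scope after a few imports and definitions.
demo_ns = {"math": math, "helper": helper, "Point": Point, "x": 1.5, "rows": [1, 2]}
groups = group_by_category(demo_ns)
for category in CATEGORY_ORDER:
    print(category, groups[category])
```

Another language would supply its own `categorize`, which is the generic hook the comment describes.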

kuzeylundvall commented 7 years ago

Hello PTVS Team, what is the current project status regarding a Variable Explorer in PTVS (for example for numpy and pandas objects)? Have you made a decision to include it in VS2017? It's currently the show-stopper that keeps me from switching from Spyder IDE (Python) or RTVS (R).

Thanks!

zooba commented 7 years ago

We haven't committed anything yet for VS 2017 - thanks to the more frequent update cycle, we're able to be much more flexible than in the past. But this is a very popular request, so it is always under consideration.

When you see someone assigned to this issue, you'll know we're actively working on it.

MichaelXt commented 7 years ago

That is a blocker for me as well; I'm using Spyder for now.

Starkiller4011 commented 7 years ago

A variable explorer with a data table viewer and plot viewer similar to Spyder would make PTVS the number one environment for data science. I've only been using it for a couple of days now, and the lack of a variable explorer keeps pushing me back to Spyder and RStudio, even though I love how clean it is, as well as its linting features. Not to mention that the number of languages supported with syntax highlighting and linting is amazing for a single IDE.

GBelzoni commented 7 years ago

I would also rate a variable explorer/data.frame viewer as a really important feature. I would love to use VS as my primary Python/data science tool, but having to use the console to view dataframes can be a real pain.

johndpope commented 7 years ago

I'm thinking that the fastest way to get this up and running is to apply Cunningham's Law (https://meta.wikimedia.org/wiki/Cunningham%27s_Law). That is, consider slapping some incorrect code together and committing it to master.

syagev commented 7 years ago

A numpy array visualizer is a must for every computer-vision / ML scientist (something like ImageWatch)... not necessarily in the context of the REPL but also during standard debugging.

Is this included in the scope of this issue? If not, I'm happy to open a separate issue and try implementing it if some starting pointers can be given.

zooba commented 7 years ago

@syagev We consider that covered by #747.

If you have any recommendations for third-party libraries we could trigger (e.g. like the pop-out matplotlib window) that is something we'd happily support sooner. "Send to Excel" (via CSV) is another possibility that wouldn't take as long to get together. Unfortunately, a fully featured data grid viewer is beyond our resources right now given the other work we've taken on.

(But please, take the conversation to the other bug - I shouldn't have started discussing it here :) )

Sinansi commented 6 years ago

Hello Zooba, can you please provide any update on this issue? Are you working on adding a Variable Explorer to PTVS? PTVS is my favourite Python IDE; PyCharm is not comfortable on my eyes, and Spyder doesn't have a real dark theme. Unfortunately, I am forced to use other IDEs because PTVS lacks this critical feature. What makes it so hard to add? Thank you!

zooba commented 6 years ago

Thanks for the vote! We've been working on other changes recently and haven't had enough time to work on this. If you (or if you know someone who) can help, feel free to email resumes to pythonjobs@microsoft.com and we can talk about getting more people onto our team :)

johndpope commented 6 years ago

So - @zooba - you should hire @agermanidis - https://github.com/agermanidis/livepython - he made this piece of art on top of vscode.

dstanner commented 5 years ago

+1 on this feature request. I would love to switch to VS Code for all of my data science work, but the lack of variable/data explorer capabilities keeps me from using it.

Sinansi commented 5 years ago

I don't think Microsoft cares about adding this feature. Focusing mainly on R, it seems they are still 10 years behind, thinking that R is still the major data analysis programming language. On the other hand, they might be afraid of losing their investment in R if they promoted Python. But that also means they will have to go against the market, since everybody is moving to Python. I couldn't wait any longer, so I just switched to PyCharm.

Sinansi commented 4 years ago

I quit Python and switched to Julia, and PTVS still doesn't have a variable explorer :s