Explore notebook trust + workspace trust

roblourens commented 3 years ago

Continuing the discussion from https://github.com/microsoft/vscode/issues/106747

DonJayamanne commented 3 years ago

Let us know if there's anything you need. Thanks

roblourens commented 3 years ago

First, what are we trying to do by observing workspace trust?

It sounds like in vscode's workspace trust model, if the user starts a debug session, that is an explicit action by the user, and we will not make them trust the workspace to do this. Is that right @sbatten? If so it seems like executing a notebook/cell is essentially the same, so do we have any reason to require the workspace to be trusted to do this?
Do we care about renderers executing code in outputs, when they are sandboxed in the webview?
- To be clear we are differentiating between "extension trust" and "workspace trust", only talking about "workspace trust" here. So I am not thinking about evil or compromised renderers. But if a renderer loads code from your workspace and executes it in order to render, then that extension is responsible for checking workspace trust, just as the TS extension is responsible for checking workspace trust before executing code out of node_modules/typescript/....
- The question is, outside of that case, is there any reason for vscode to prevent a renderer from running in an untrusted workspace where it is loading data (not code) from a cell output, processing it, and rendering it in the webview?
- It's possible that some carefully constructed output data could trigger a vulnerability in a renderer and make it do bad things. Is that in our threat model?

sbatten commented 3 years ago

There are currently 2 thought models on workspace trust and explicit actions.

Explicit actions are out of scope. We only talk about browsing a repository and any explicit action is known to the user, so we don't request trust. (i.e. we allow it and do nothing)
Explicit actions are signals of workspace trust and we take this as a signal to trust the workspace. In this model we still must confirm the user wants to mark the workspace as trusted and is only presented with a limited set of options to continue or cancel.

At present, one is not chosen over the other, but hopefully this scopes the discussion.

jrieken commented 3 years ago

The question is, outside of that case, is there any reason for vscode to prevent a renderer from running in an untrusted workspace where it is loading data (

💯 I share this. Event tho that data can be JavaScript/Html source code it is still running in a sandbox (webview or iframe) and that is isolated from the renderer and has no "dangerous", like NodeJS, apis. To me that makes it similar to the embedded browser that we are building and I don't believe we wrap that into a trust model

roblourens commented 3 years ago

Renderer code can send messages outside of the webview which gives it a potential real impact, but still it seems like the responsibility is on the renderer - if it's going to send a message which could result in a bad impact based on data in the output, it should check workspace trust.

jrieken commented 3 years ago

Renderer code can send messages outside of the webview which gives

Ideally those messages go straight to the extension host and aren't rerouted via the renderer. I believe there was some exploration to use message channels that enables that.

sbatten commented 3 years ago

@lszomoru for awareness

roblourens commented 3 years ago

Talked to @sbatten, the current plan is that for explicit user actions like debugging or running a command that requires trust, a dialog will show asking the user to "continue" and trust the workspace, or "cancel" and not trust it. For notebooks, we can hook into the same system to show a dialog when you first run a cell.

You could argue that the extension should be involved in this rather than vscode enforcing it, because trust doesn't matter for some notebooks. For example, I don't care about running a github issues query from an untrusted workspace. But most notebooks are not like that, and this is so similar to debugging, and it will be much simpler to have vscode handle it in the same way.

Then also, given the discussion above, I think that we should not automatically block outputs/renderers from loading in an untrusted workspace. The extension can require workspace trust if it is necessary for that extension (like if it is loading and executing code in the workspace) and it's the responsibility of the extension to determine that. cc @connor4312 and @DonJayamanne do you think that's acceptable?

DonJayamanne commented 3 years ago

@roblourens Thanks this makes a lot of sense, and from what I can tell, there's no need to implement a custom trust model in Jupyter extension.

If user has a trusted workspace, then we don't prompt users to trust the notebook
If user does not have a trusted workspace, and they open a notebook
- Jupyter extension to decide whether we render the output or not
- If we don't display the output, then we can optionally notify the user that they can trust the worksapce if they want to see the output
If user does not have a trusted workspace, and they attempt to run a notebook
- Jupyter extension can ask the user to trust the workspace, if they trust it, we let them run the notebook, else nothing happens (cell does not run)

I guess the only question for Jupyter extension is whether we display the output or not (when not trusted).

@claudiaregio @greazer /cc

roblourens commented 3 years ago

Jupyter extension can ask the user to trust the workspace, if they trust it, we let them run the notebook, else nothing happens (cell does not run)

VS Code will handle this part of it, the extension doesn't have to do it.

Jupyter extension to decide whether we render the output or not

Yes, and we can discuss this more. I'm not sure there is really any need to do anything with output renderers but it probably needs more thinking.

connor4312 commented 3 years ago

👍 for allowing renderers through. We just need to be aware that any of the FromWebview messages may be coming from a untrusted (or, malicious) context. Looking at what we have today, the worst an extension could do it mess up layout/focus and show a 'save' dialog. Arguably the latter could be dangerous if users blindly allow saving, but I don't think there's a way around that.

jrieken commented 3 years ago

You could argue that the extension should be involved in this rather than vscode enforcing it, because trust doesn't matter for some notebooks. For example, I don't care about running a github issues query from an untrusted workspace. But most notebooks are not like that, and this is so similar to debugging, and it will be much simpler to have vscode handle it in the same way.

Sounds like a kernel property to me, like a kernel can say if it requires trust

claudiaregio commented 3 years ago

@roblourens Hi Rob, right now the Jupyter extension prompts the user to trust a notebook if not created on the user's machine (ex. they download it). What happens in this scenario where a user opens a stand-alone notebook where it did not come from a trusted workspace? (ex. opened from downloads folder)

roblourens commented 3 years ago

@sbatten how does trust work in a window that doesn't have a folder open? Or for files that are not in the current workspace?

sbatten commented 3 years ago

Current implementation is that empty workspaces (no folder) are trusted by default. However, opening files in an empty workspace will be treated as the "loose file" scenario. If the loose file is not from a known trusted path, we prompt the user once to inform them that they are bringing untrusted content into a trusted workspace. From then on, any additional untrusted content will not receive a prompt for the remainder of the session.

roblourens commented 3 years ago

Command links in outputs and markdown cells could pose a threat in untrusted workspaces. We implemented this in https://github.com/microsoft/vscode/issues/98100. Is the Jupyter extension still relying on these in outputs and markdown cells?

We'd like to think about removing them since they are a vscode-only feature. Or, we have to either

disable command links in untrusted workspaces
or disable scripts in outputs in untrusted workspaces
or just disable all rich outputs in untrusted workspaces

DonJayamanne commented 3 years ago

Jupyter extension still relying on these in outputs and markdown cells?

Yes.

or just disable all rich outputs in untrusted workspaces

I'd think this is preferred.

or disable scripts in outputs in untrusted workspaces

If we can have this, that's even better, but can we guarantee that this will not be open to exploits in some form? At the end of the day the requirement is to protect the user from malicious content running without the users consent.

roblourens commented 3 years ago

We decided to hide rich outputs in an untrusted workspace, and show some placeholder output
There has been some conversation about asking a debug adapter whether it requires trust, maybe I would do the same for notebooks, although I don't really like it https://github.com/microsoft/vscode-internalbacklog/issues/2049
On Friday I said that there wouldn't be a popup prompt if you try to run a cell in an untrusted workspace. Actually that is not true, there is a prompt to trust the workspace at that moment, and I have implemented that. FYI @claudiaregio

RandomFractals commented 3 years ago

We decided to hide rich outputs in an untrusted workspace, and show some placeholder output

What would be shown in this scenario for a simple webpage GET in a REST Book? https://github.com/tanhakabir/rest-book/issues/90

I think it would be easier to explore this with concrete custom notebook extensions and renderers, perhaps kernels too, and not just Jupyter notebooks.

roblourens commented 3 years ago

They wouldn't be shown, but you wouldn't be able to run the cell anyway in an untrusted workspace.

kaimatrix19821 commented 3 years ago

@RandomFractals You probably don't know this, hence the reason for your response. The author or restbook is a PM in VS Code team . They are using concrete custom notebook extensions and renderers, perhaps kernels too,. they are working closely with a number of external teams extension authors. stop making assumptions and if downvoting someone make some suggestions instead, be constructive

RandomFractals commented 3 years ago

@kaimatrix19821 I know that the author of rest book extension is a PM who works with vscode team. Why I picked it to beta test custom notebooks api, renderers and UX. I think you are the one who is making a lot of assumptions there.

I downvoted that response because I don't think it's a good solution to just not show the notebook or disallow to run code cells, especially simple GET queries in that specific case for untrusted workspaces.

I am also aware that vscode dev team is working with other teams at Microsoft on that native notebooks api and UX.

The feedback I provide here and in other linked tickets for this feature is constructive and includes screens and description of disconnects I am seeing so far, or at least what I would expect to see there as extension developer who is planning to use that notebooks api to create other custom notebook extensions when it is released.

Perhaps, you should not comment with anon. github handle and look up my work and why I am interested in this.

roblourens commented 3 years ago

@RandomFractals Can you tell me why you would open a workspace, explicitly tell us that you don't trust the content/author of this workspace, but then want to make a request to a web endpoint (specified by code that you don't trust) and render the result?

I want to understand more about who would be working in an untrusted workspace and what they would expect to work, because in my mind, I would expect a very basic safe mode.

RandomFractals commented 3 years ago

I would rather see a notebook view with code cells, and have a warning after I review a cell code and decide to run it.

I also think your new trusted workspaces concept with the current implementation would give users a false sense of security. I don't trust any code but my own, sometimes not even that. I believe most seasoned IT professional don't trust any code by default. However, after a review they might decide to run it.

Security researchers might want to load notebooks in untrusted workspaces and pick and choose what to run, which I think is a bit better than not being able to see or run any code cell and being forced into marking a workspace or a notebook as trusted to evaluate how different parts of the notebook run and see the results they produce. Very real world scenario.

& if you are just looking to add on/off basic safe mode to vscode as you mention, then just call it that.

To be honest, I'd rather see you add permissions and extensions, notebooks, renderers, and projects request and get granted permissions to access different parts of the vscode apis, interact with workspace fs, cam, etc. Then from extension authoring standpoint it would be transparent what we need access to, and what would run in extensions or notebooks, even for untrusted workspaces, and let users decide what to run with some warnings. I believe that is a more granular approach and supported by many systems out there today.

I hope that helps. Thanks!

roblourens commented 3 years ago

I would rather see a notebook view with code cells, and have a warning after I review a cell code and decide to run it.

That's exactly what the plan is, and is already implemented. Maybe this thread doesn't have full context.

Security researchers might want to load notebooks in untrusted workspaces and pick and choose what to run

Cells do not run unless you click run. But really in this case of working with known-malicious code, I think you want to work in a VM or another real sandbox, rather than rely on this feature as your only line of defense.

To be honest, I'd rather see you add permissions and extensions, notebooks, renderers, and projects request and get granted permissions to access different parts of the vscode apis, interact with workspace fs, cam, etc.

Fair request, we could work in that direction, the workspace trust feature is only focused on this one aspect of secure development.

RandomFractals commented 3 years ago

@roblourens thank you for providing more context info on this. I'll check it out with this feature setting you shared in another ticket and rest book extension to see how it works. I think some screens or walk through gif as a quick demo of a feature like this would really help to visualize the workflow you are implementing next time.

rzhao271 commented 3 years ago

@roblourens looks like there's a lot going on here. Can you write some verification steps, please?

roblourens commented 3 years ago

Open an untrusted workspace
Open a python notebook
You should be able to see the notebook cells, but you should not be able to execute a cell
If you try to execute a cell, you should get the prompt to trust the workspace
Open a python notebook that has outputs saved
You should not be able to see rich html outputs, but it should fall back to simple outputs (markdown, plain text, etc) if they exist

rzhao271 commented 3 years ago

I've verified almost all of the points above except for the last one. I'm not familiar with how rich html output works, and how one writes a notebook to produce that output, though, so I'm unable to tell when the output switches from a rich html one to a simple one.

RandomFractals commented 3 years ago

@rzhao271 you can try it with a restbook ... https://github.com/tanhakabir/rest-book/issues/90#issuecomment-828863431

RandomFractals commented 3 years ago

@roblourens why should not I see previously generated rich html output in a loaded notebook? I think you are taking this 'untrusted' workspaces a bit too far.

If I had a renderer that generated a chart as a cell output and saved it. Are you saying I should not be able to see it with your notion of an untrusted workspace? I most certainly disagree with you and your teams decisions on this.

roblourens commented 3 years ago

Custom renderers should still work in an untrusted workspace. HTML outputs don't. Even though they are "sandboxed" in an iframe, the iframe has special privileges such that we believe it isn't safe to run explicitly untrusted scripts in it. In the html output case, this is untrusted content from the document, not from an extension.

rchiodo commented 3 years ago

I believe Google collab and jupyter run something that scrapes out javascript from any html outputs by default. Might we be able to do something similar?

roblourens commented 3 years ago

I don't see why we would show a possibly broken output. And I don't really trust any html parsing code I write for security purposes. There might be a fallback output mimetype.

rchiodo commented 3 years ago

Wouldn't custom renderers be the same thing? The data is json that drives the custom renderer. It could potentially be a security risk too?

rchiodo commented 3 years ago

I think the only truly safe output would be text/plain. Although I still think you could remove script tags pretty easy. Everything else is potentially running code on the output.

rzhao271 commented 3 years ago

I switched the workspace to untrusted, and then opened up a notebook that renders an HTML output. The HTML output seemed to still be there, but the only option when it came to switching renderers was plain text:

html-output

RandomFractals commented 3 years ago

close, but I think you should still show built-in html render and render it. not sure I like the stripping out of js idea either. do it if you must.

Those cell outputs run in an iframe in a notebook webview and is already sandboxed with CSP and rest that comes with it. what attack vectors are you trying to prevent there?

Sounds like premature optimization to me at this point until we have some more rich output renders to play with and finalize this.

roblourens commented 3 years ago

@connor4312 this regressed in https://github.com/microsoft/vscode/commit/d6d9200832af59b997adb9b3c6f0ba173786daf7#diff-54b1e3297f88bd6925ca5fe088edaf93b8aa1832936da6350d69fec68375504eL54 where we pick mimetype 0, even if it's not trusted. I don't quite follow, can you take a look?

roblourens commented 3 years ago

Thanks @connor4312.

@rchiodo it's the difference between html that comes from workspace content and html that comes from extensions.

@RandomFractals an example is that html outputs can have links that trigger vscode commands

RandomFractals commented 3 years ago

ok. I did likes and dislikes on this feature implementation, ideas, and proposed approaches with my thumbs up and down.

Please view them as agree or disagree. Thanks!

rchiodo commented 3 years ago

The json in the outputs that drives the html from the renderer can be anything though. And the renderer could theoretically be doing anything possible in a renderer (like say sending messages back to the extension to run a cell). That's why I think any json (read by a renderer, or causes code execution) in an untrusted notebook should be deemed malicious.

microsoft / vscode

Explore notebook trust + workspace trust #118584