Mac kext: Renaming/moving of virtualisation roots is not handled correctly

pmj commented 5 years ago

While reviewing the kext code for potential security issues in preparation for #528, I noticed that we store the vnode pointer and the path string for each virtualisation root, and that the path is used for relative path calculations.

This means that if the virtualisation root directory is renamed/moved, the path will no longer match the vnode, and the relative path calculation will be incorrect. I'm about to push a change that detects the issue at the string handling level to avoid the potential buffer overrun, but we should decide how we want to avoid it in the first place. (For example, we could disallow renaming virtualisation roots with active providers by catching the attempt in the vnode listener and denying it.)

pmj commented 5 years ago

After thinking about this some more, I've now realised that blocking renames of the virtualisation root isn't going to fly, as renaming parent directories of the root will still cause the same problem, and are going to be a lot harder to match up to the root - and this behaviour also has the potential to annoy users or break other software.

So my current plan is to detect relative path calculation failures and re-evaluate the path for that root when they happen. Thoughts?

wilbaker commented 5 years ago

So my current plan is to detect relative path calculation failures and re-evaluate the path for that root when they happen. Thoughts?

I think that sounds like the right long term approach. How big of a change do you think that will be?

I ask because root renames (or renames of their parents) are not currently a supported scenario, and user-mode changes will be required to fully support them. Right now at a minimum we should make sure that our kext doesn't trigger panics or negatively impact I/O outside of our roots.

If the cost of the minimal changes and the proper fix you're suggesting are about the same I think it makes sense to go with the proper fix. If the proper fix will be more expensive then I think it can be deferred until later.

cc: @jrbriggs and @sanoursa for their thoughts as well.

pmj commented 5 years ago

The minimal changes to detect the problem and defend against it by erroring out are already implemented in #545 and appear to work in my tests. They're essentially a requirement for the proposed more proactive fix.

I think on the kext side this more advanced fix is pretty straightforward actually: if VirtualizationRoot_GetRootRelativePath() returns a nullptr we need to call a new function which

Acquires the lock for the root
Grabs a reference to the root vnode
Drops the lock
Gets the new path for the root vnode
Re-acquires the root lock with exclusive access
Stores the new root vnode path in the virtualisation root data structure
Drops the lock again.

Once that's complete, we can re-run VirtualizationRoot_GetRootRelativePath()

There are 2 more potential issues here:

If the root or a parent of the root is renamed twice in short succession, and the vnodePath we're passing to VirtualizationRoot_GetRootRelativePath() is based on the second name, whereas the path update we perform in the new function ends up using the third name. The subsequent call to VirtualizationRoot_GetRootRelativePath() will then fail again but not because the root path is wrong, but the vnode path. So I think we need to try regenerating the vnodePath as well in that case.
If the vnode is moved out of the root we decided it was in while we were executing our handler, no amount of root path updates will fix the problem. In this case, we should probably retry the entire handler logic from scratch.

Finally, when we detect a root path change, we should probably send a message to the provider notifying it of the change.

wilbaker commented 5 years ago

@pmj thanks for the detailed write up and for calling out the potential issues we'll need to address.

From a priorities standpoint I think we should hold off on the advanced fix until we're ready to invest in fully supporting a "rename root" feature. It looks like getting there requires a non-trivial amount of work (outside of the items you've called out we'd need additional changes to the provider and several significant additions to our test suite).

I'm inclined to stick with the minimal changes you've made in #545 and defer the more advanced fix until later.

CC: @jrbriggs and @sanoursa for their thoughts as well.

pmj commented 5 years ago

I actually put together an implementation of the described method yesterday. I've only today got around to testing it though, and it seems to work fine, although of course user space doesn't have support for it.

I've pushed that branch here: https://github.com/pmj/VFSForGit/commits/refresh-root-path

The crucial commit is this one

microsoft / VFSForGit

Mac kext: Renaming/moving of virtualisation roots is not handled correctly #544