goodboy / tractor

A distributed, structured concurrent runtime for Python (and friends)
GNU Affero General Public License v3.0

Change `rpc_module_paths` -> `enable_modules`? #182

Open goodboy opened 3 years ago

goodboy commented 3 years ago

I'm starting to realize that the way we currently allow certain Python modules to have their defined functions invoked over RPC is really a form of capability-based security. Specifically:

Capabilities achieve their objective of improving system security by being used in place of forgeable references. A forgeable reference (for example, a path name) identifies an object, but does not specify which access rights are appropriate for that object and the user program which holds that reference. Consequently, any attempt to access the referenced object must be validated by the operating system, based on the ambient authority of the requesting program, typically via the use of an access control list (ACL). Instead, in a system with capabilities, the mere fact that a user program possesses that capability entitles it to use the referenced object in accordance with the rights that are specified by that capability. In theory, a system with capabilities removes the need for any access control list or similar mechanism by giving all entities all and only the capabilities they will actually need.

This is exactly how things work currently, and I'm starting to think it is very much the correct way to design distributed systems that aren't trustless. I'm also beginning to believe that structured concurrency (SC) is highly compatible with capability-based security (CBS): each branch in the process tree is spawned from some parent who would in theory have the most caps, and as you go closer to the bottom of the tree you arrive at actors with more specialized caps (which are also the services most directly facing the "exterior world", aka the services that do IPC outside the host). This also means more specialized actors are supervised by more capable actors by design.
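To picture the tree described above, here is a minimal sketch in which a child can only be spawned with capabilities its parent already holds. It is purely illustrative: `Node`, `spawn`, and the capability names are hypothetical and are not part of tractor.

```python
from __future__ import annotations


class Node:
    """Hypothetical process-tree node holding a set of capability names."""

    def __init__(self, name: str, caps: set[str], parent: Node | None = None) -> None:
        self.name = name
        self.caps = frozenset(caps)
        self.parent = parent

    def spawn(self, name: str, caps: set[str]) -> Node:
        # A child may only receive capabilities that its parent already holds.
        extra = set(caps) - self.caps
        if extra:
            raise PermissionError(f"{self.name} cannot grant {sorted(extra)}")
        return Node(name, caps, parent=self)


# Capabilities narrow as we move from the root toward the leaves.
root = Node("root", {"db", "net", "fs"})
service = root.spawn("service", {"db", "net"})
edge = service.spawn("edge", {"net"})
```

In this toy model the most specialized node (`edge`) is always supervised by a node holding a superset of its capabilities, which is the property the comment argues falls out of SC.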

Currently, which Python modules can be loaded and subsequently executed via RPC requests is determined by the rpc_module_paths kwarg to ActorNursery.start_actor().

I think enable_modules is a better name and fits better with the etymology and semantics of able in capability.
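For readers new to the idea, the following is a rough, self-contained sketch of what a module allowlist amounts to. It deliberately does not use tractor's API: `ToyActor` and `handle_rpc` are made-up names, and only the `enable_modules` keyword mirrors the name proposed here.

```python
from __future__ import annotations

import importlib
from typing import Any


class ToyActor:
    """Hypothetical stand-in for an actor; not tractor's implementation."""

    def __init__(self, name: str, enable_modules: list[str]) -> None:
        self.name = name
        # The allowlist is the actor's entire set of capabilities.
        self.enabled = frozenset(enable_modules)

    def handle_rpc(self, module: str, func: str, *args: Any) -> Any:
        # Refuse any request naming a module that was not enabled at spawn time.
        if module not in self.enabled:
            raise PermissionError(f"{self.name}: module {module!r} is not enabled")
        target = getattr(importlib.import_module(module), func)
        return target(*args)


worker = ToyActor("worker", enable_modules=["math"])
root_value = worker.handle_rpc("math", "sqrt", 16.0)  # allowed: "math" is enabled
```

A request naming any other module, for example `worker.handle_rpc("os", "getcwd")`, raises `PermissionError` in this sketch before anything is imported or called.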

If anyone has any better ideas please chime in.

parity3 commented 3 years ago

If you could tuck the CBS concept into a plugin or middleware-style module and have an alternative or bare-implementation which allows everything to be called by everyone, that would be a nice-to-have for me. For my uses I envision no need for security and/or it will be handled at a different level, ie in the functions themselves and/or via firewalls or third party RBAC like hashicorp vault/boundary, etc.

It would just be one more thing to think about / manage / get in the way for many use cases IMO.

goodboy commented 3 years ago

If you could tuck the CBS concept into a plugin or middleware-style module and have an alternative or bare-implementation which allows everything to be called by everyone, that would be a nice-to-have for me.

I think this somewhat defeats the point then 😸 If you want every actor to be able to call everything, simply pass enable_modules=sys.modules.keys(). Currently, however, this will result in each actor importing all of those modules at startup.
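As a small sketch of that "allow everything" list: `sys.modules.keys()` is a live view of a dict that changes as imports happen, so it seems safer to snapshot it into a list first. The variable name `everything` is just for illustration, and how the list gets passed to an actor is left out here.

```python
import sys

# Snapshot the names of every module imported so far in this process.
# list() copies the keys, so later imports cannot change the result.
everything = list(sys.modules.keys())

# This list is what would be handed over as `enable_modules=everything`.
# Its length hints at how many modules each actor would import at startup.
print(f"{len(everything)} modules would be enabled")
```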

It would just be one more thing to think about / manage / get in the way for many use cases IMO.

Imo it should be this way, much like SC forces you to reason about task lifetimes up front.

goodboy commented 3 years ago

You'll also find that pretty much every RPC system (especially in Python) started out in "bare mode" and eventually changed to an explicit capabilities mode with the "bare mode" as an opt-in; this is a more robust and safe design.

E.g. with rpyc's "new style".

parity3 commented 3 years ago

I'm impartial to whether CBS is the default, but I do recognize that is the way the industry/community leans. I just think sometimes putting security first tends to tax both the documentation and library adoption. I tend to roll my eyes when looking at documentation about a new tool where the first 2 pages are about creating ACLs, generating certs and adding oauth headers, nothing to do with the task at hand (at least in the beginning). This isn't quite so bad in this case because of the decorators and, well, one can work around it with limited overhead with def remote_eval()... or better yet, just monkeypatch out the CBS code.

But having a bare mode could both serve as a debug mechanism to see if there are performance issues with the CBS implementation, and also allow someone to insert their own custom CBS model implementation if they needed to, or simply choose to live dangerously (without one).

goodboy commented 3 years ago

I tend to roll my eyes when looking at documentation about a new tool where the first 2 pages are about creating ACLs, generating certs and adding oauth headers, nothing to do with the task at hand (at least in the beginning).

Lol yeah fair enough 🦖

generating certs and adding oauth headers, nothing to do with the task at hand (at least in the beginning).

Indeed, I do think our hope is that someone doesn't start up a service wrongly thinking it's protected from random remote code exec. I would rather see a new user be like "ohh i didn't enable this module" than "ohh whatttt, random internet traffic is crashing my system".

But having a bare mode could both serve as a debug mechanism to see if there are performance issues with the CBS implementation,

In this case there's no difference, since it's just a matter of whether the subactor can invoke code from a module - nothing changes about how the code is invoked; that's actually partially the beauty of it. ACLs are basically the opposite way to deal with the same issues, but with them you do have to worry about overhead.

goodboy commented 3 years ago

@parity3 btw we also have a small chat room now if interested:

goodboy commented 3 years ago

I think this is mostly done once #197 lands?