Scrumplex commented 6 months ago

Add a flake-parts module for configuring deploy-rs profiles.

This makes it a lot easier for flake-parts users to use deploy-rs, as there are type checks analogous to NixOS modules.

Feel free to check the flake-parts example.

Future work

If desired, I would be happy to switch the primary flake.nix in this repo to flake-parts, to greatly simplify code, like it was proposed in https://github.com/serokell/deploy-rs/pull/229 by @drupol In that case I could additionally add descriptions as @Skarlett seems to have done in https://github.com/the-computer-club/lynx/blob/main/flake-modules/deploy-rs/default.nix and add code for generating module documentation. Of course deploy-rs would still work just as expected without flake-parts.

Scrumplex commented 6 months ago

Submitting this as a draft to gather feedback first

Scrumplex commented 6 months ago

It looks like the schema check fails with this, as unset value are exported as null or [] in the json. I am not sure if we can let the module system omit unset values (it wouldn't work if the options didn't have default values)

Skarlett commented 5 months ago

there is some niche pain points in this strategy I'd like to address.

Some pain points

Lists are collected into a single collection by the module system. So given this case where we have a default value, but wish to not use it; lib.mkForce must be used.

EDIT:

The list collection problem maybe a bit more untamed than first thought of. lib.mkForce may not be suitable, as deploy-rs does the merging of values. Perhaps an additional argument to the binary (deploy-rs/cli.rs) can override this behavior --no-merge
```
deploy = {
sshOpts = [ "-t" ];
nodes.test-vm.sshOpts = lib.mkForce [ "-T" ];
};
```
- enforcing specific ordering that contradicts the deploy.sshOpts over deploy.nodes.*.sshOpts may also need to utilize lib.mkForce, lib.mkAfter, lib.mkBefore.

deploy.nodes.*.profilePath == null set will cause errors such as "profilePath used but not declared."
Possible malicious usage. deploy.sshOpts is forced onto the flake from another flake-parts module. most likely man-in-the-middle attack. High impact. (perhaps an extra security measure, such as checksum could suffice)
There is also some module "hula-hoops" you should also consider. To avoid infinite recursion, nixos modules use a descendant set of options under config.build. This is used for automatic code-generation, afaik flake-parts doesn't provide an equivalent. If you'd like to reference the module-options, and then build new values based off of those options, its expected you do a similar thing under config.deploy.build (which shouldn't be carried into flake.deploy).

Some upsides

Having it defined in flake-parts is fairly ergonomic, the ability to spread it across different files is a big plus.
It is now possible to build utilities off of deploy options, and then share them with others.
- One idea might be generating DNS names for each host based off deploy.nodes.*.hostname then dropping that data into a nixos-module that can then be consumed by a nixosConfiguration.

My conclusion to the idea is that -- I generally like it, and think the benefits outweigh the problems.

The primary problem in the implementation of this is how we choose to communicate with our users. It is not entirely clear to beginners what a flake-module is, and the possible reasons for their usage, above the pre-existing. In some sense, we have a chicken-egg problem. If there were a large ecosystem of flake-modules, then it'd be self-explanatory, but the current state of affairs is that they're quite rare.

I would encourage the author to consider this, and help build the bridge of reason to use the ecosystem by outlining clear benefits.

I hope this helps, I'll be watching the thread as its getting worked on. I might be able to invest sometime into it, but I have a full time job, so its only about 0.01% time that I can donate.

Skarlett commented 5 months ago

The list collection problem maybe a bit more untamed than first thought of. lib.mkForce may not be suitable, as deploy-rs does the merging of values. Perhaps an additional argument to the binary (deploy-rs/cli.rs) can override this behavior --no-merge

Some after thoughts on this portion. This problem can be solved in the module system, but with some constraints.

Instead of directly passing deploy top's level (deploy.*), each node would be declared individually in deploy.nodes, which is only shown to deploy-rs.

The remaining top level options should be used in conjunction with deploy.nodes.*

In respect to deploy-rs/cli.rs, it should only see { deploy.nodes = { .. }; }. This way we do not need to make any modifications to the client to conform to this.

Possible malicious usage. deploy.sshOpts is forced onto the flake from another flake-parts module. most likely man-in-the-middle attack. High impact. (perhaps an extra security measure, such as checksum could suffice)

There is likely some unsaid primitive in nix for checksums. Imagine a case like a derivation.

{
  sshOpts = ["-t"];
  # sha2(stringConcatSep " " sshOpts)
  sshOptsHash = "b64f681b76d9dc31203b50b41af60f449ed8c657";
}

There maybe some use in not using lists, and rather strings directly, so that the type system can be used to say this can only be defined once, but we still have cases of lib.mkForce.

These ideas are more mitigations for a bigger problem, but it is good to have a focus on high impact causalities from possible supply-chain attacks. In this case, a malicious flake-input.

Some other ideas might include wrapping deploy-rs to ensure that it can only read a subset of public keys, and any nefarious writes to known_hosts/ssh-key exchange is behind bubblewrap, and won't be applied to the running session. This may be the best option, so far I've thought of.

EDIT:

Too understand the possible risk of command injection better, I've gone ahead and built a POC against the cwe-78.

Fish: fish, version 3.7.0 Nixpkgs: f480f9d09e4b4cf87ee6151eba068197125714de

sshOpts = ["eval (touch '/tmp/foo')"]

Scrumplex commented 5 months ago

I think this isn't limited to deploy-rs. If you import a malicious flake module, it could modify your devShell to add some malicious commands to the shell hook, or just replace the deploy binary with a malicious version. I am not sure if there is a perfect way to fix this. I think this also applies just as well to any other module-based configuration system, like NixOS

serokell / deploy-rs

Add flake.parts module #269

Future work

Some pain points

Some upsides