`lib.pushDownProperties` visibility change breaks external users

uninsane commented 1 year ago

Describe the bug

lib.modules.pushDownProperties (and adjacent functions) were made private in #197547.
this change was made under the assumption that any users outside nixpkgs could switch to more explicit APIs without loss.
at least one known use-case seems to show that assumption doesn't strictly hold.

Expected behavior

at the request of @roberth, the intent of this report is to make known any external users that are dependent on the private functions: lib.modules.{applyModuleArgsIfFunction,dischargeProperties,evalOptionValue,mergeModules,pushDownProperties,unifyModuleSyntax}. from there we can decide how to balance serving those use cases against the desire to lift internal restrictions around the modules system.

if you have a use-case which depends on these private functions, please make it known in this thread.

uninsane commented 1 year ago

copying this from the PR above:

the code i have in deployment which depends on the now-deprecated pushDownProperties involves overcoming some well-known limitations of mkMerge: config = mkMerge <list> at module scope often causes infinite recursion, e.g. when the items in <list> are computed attrsets. code like this barfs:

{ config, ... }:
{
  options.my.option = mkOption {
    type = types.listOf types.string;
  };

  config = mkMerge (map generateMoreConfig config.my.option);
  # ^ ERROR: `config.*` depends on `config.my.option` => infinite recursion
}

assume generateMoreConfig produces an attrset with assertions and users. the simple fix is:

  config = let
    genAssertions = conf: (generateMoreConfig conf).assertions;
    genUsers = conf: (generateMoreConfig conf).users;
  in {
    assertions = mkMerge (map genAssertions config.my.option);
    users = mkMerge (map genUsers config.my.option);
  };

however, this fails if generateMoreConfig includes any lib.mkIf at the toplevel, e.g.

generateMoreConfig = value: lib.mkIf value != "root" {
  assertions = ...; users = ...;
};

hence, any more robust equivalent to module-scope mkMerge has to "push down" that outer lib.mkIf:

generateMoreConfig = value: {
  assertions = lib.mkIf value != "root" ...;
  users = lib.mkIf value != "root" ...;
};

but you don't want to manually push these down. ideally one could take the original, clean-but-unsupported code:

config = mkMerge (map generateMoreConfig config.my.option);

and defeat the infinite recursion by constraining the attrnames at the outer layer -- without contorting the logic everywhere below that:

config = let
  mergeEachAttr (map generateMoreConfig config.my.option);
in {
  assertions = mergeEachAttr.assertions;
  users = mergeEachAttr.users;
};

hence, this mergeEachAttr function has to use lib.pushDownProperties. my actual deployment is a more generalized form of the above, viewable here:

helper definition: mkTypedMerge
where it's used

roberth commented 1 year ago

config = mkMerge (map generateMoreConfig config.my.option);

This has a particularly strong requirement that isn't sufficiently solved by any solution that merely derives the returned attrset structure from the values, because the number of items returned from the map can only be determined after config.my.option is known. Put differently, the list of top level config definitions must be evaluated before config.my.option, which is contradictory.

I'm confident that whenever the number of mkMerge items is variable, the module author has to specify which option trees may or may not be affected. Specifically which may, because a may not solution would not scale. These restrictions seem to be fundamental for any such abstraction that is based on lazy fixpoints like the module system. (To get away from that, you'd need a more complicated iterative process that isn't just lazy evaluation / call by need, but a constraint solver or something. This does not appear feasible for the module system. This reminds me though, that it's good to read call by need in reverse; from return value to parameters, as that's the evaluation order, although order is a bit of a misnomer for such a ~chaotic~ dynamic process. But I digress.)

The need to specify which options a merge can write to has also come up in "supermodules", which is a slightly different perspective on the same kind of data flow:

https://github.com/NixOS/nixpkgs/pull/152785

This is similar to what you describe as

defeat the infinite recursion by constraining the attrnames at the outer layer -- without contorting the logic everywhere below that

I think it's important for the module system to steer towards solutions that are general and robust. When it comes to robustness, one of the goals has to be the avoidance of unnecessary dependencies between options. Both "supermodules" and your solution suffer a bit from an incentive to be too general when it comes to specifying evaluation dependencies, as that's essentially what you're doing when you say

{
  assertions = mergeEachAttr.assertions;
  users = mergeEachAttr.users;
};

For instance, it would be more robust to specify users.users = if you're not generating any groups. That way a user configuration that uses a module like yours could set your options based on groups information without causing an infinite recursion. Arguably this is a construed example, but it illustrates the point that deeper mkMerge-es are better. (when it comes to dynamic lists of definitions; lib.mkMerge [ { users = ...; } { users = ...; } ] would turn out fine iiuc, but lib.mkMerge (builtins.seq config.anything [ ]) can already be a problem)

So I'm ambivalent about whether we should make this pattern easier. Fundamentally what the module system needs, to work well, is some sort of mechanical specification of which options may be written to, to the deepest point possible. Ideally we could infer this from the function that generates the definitions, but such an analysis is not really possible; not without breaking (out of) the function abstraction like a LISP would. So instead, we need you to specify it for the module system. While we could think of alternate representations of these attribute sets, mkTypedMerge being one of them, I think specifying multiple mkMerge definitions at the deepest points is perhaps a more obvious solution.

To resolve this issue I think we should have at least one of these outcomes

guidance in the docs on how to specify a variable number of definitions based on an option in the same option tree (and hence the recursion leads to an implementation problem if implemented naively): specify mkMerge as deeply as possible
perhaps a new function to take care of this use case if this is more ergonomic or if this is needed in some cases not covered by deep mkMerge

I suppose what's annoying about multiple mkMerge-es is that you tend to need an extra let binding and mapAttrs (x: x.<attrpath>) for each of them. Perhaps a function like mkTypedMerge could help out with that. It seems that the right hand side could be generated from the attribute path. That suggest that a syntax like this could be made to work:

mkDynamicMerge
  { # deeper is better, to avoid infinite recursion
    assertions = null;
    systemd.services = null;
    users.users = null;
  }
  myList

... where null gets translated to the corresponding retrieval from all list items.

It wouldn't hurt to keep that comment around as a reminder; very few contributors would read up on mkMerge semantics.

I think mkDynamicMerge would be nice to specify without using the attribute path operations, but inductively instead. That way we can avoid unnecessary list allocations. For this though, I think we'd need a non-recursive version of pushDownProperties, as the recursion is handled by mkDynamicMerge instead. This is a bit speculative. Maybe using attribute path list operations is appropriate, or at least a good initial implementation.

roberth commented 1 year ago

We have another issue with a highly similar use case now (#238592), which can be covered by a perhaps more user friendly function such as mkDynamicMerge above.

Also I've recently had to use it for a different use case as well

mergeAttrDefinitionsWithPrio in https://github.com/NixOS/nixpkgs/pull/238136

So although we can cover a manageable set of just 2 use cases for now, I don't feel confident that those two functions cover everything, so I'm open to undeprecating the pushDownProperties function as well.

NixOS / nixpkgs

`lib.pushDownProperties` visibility change breaks external users #231427

Describe the bug

Expected behavior