Open fgaz opened 8 months ago
Discussed this in the Architecture Team meeting:
@roberth: Potential problem with rev
being set to finalAttrs.version
: Overriding it reuses the previous source, no error
finalAttrs
fixedsrc
needs to be overriddenpkgs.hello.overrideAttrs (pkgs.changeSrc {
# ...
tag = "v1.2"; # potentially automatically strip v prefix
hash = "...";
})
pkgs.haskell.lib
has something like that
Fetchers could have a standardised attribute to pass thru the rev/tag version, so the package can reuse.
Change stdenv.mkDerivation
to use src.version
as a default.
Should I perform the migration in steps or all at once?
pkgs/by-name
), and then the whole tree if there's no problemsOne big commit or one commit per package?
Do we review all changes or only a sample?
@infinisil you are working on the by-name migration (RFC 140 migration plan #258650). Do you think this could interfere? Should I wait until your migration is complete?
- @roberth: Potential problem with
rev
being set tofinalAttrs.version
: Overriding it reuses the previous source, no error
This isn't caused nor exacerbated by finalAttrs
at all. If you override version
and not src
, even without finalAttrs
, you'll also silently use the existing sources. That's just how FODs work.
The finalAttrs
pattern in fact makes it easier to override src
, as you may only need to set outputHash
:
packageName.overrideAttrs (oldAttrs: {
version = "0.1.1";
src = oldAttrs.src.overrideAttrs {
outputHash = "...";
};
})
Without finalAttrs
, you'd still need to set src
, and more verbosely at that:
packageName.overrideAttrs (oldAttrs: {
version = "0.1.1";
src = fetchFromGitHub {
repo = "...";
owner = "...";
rev = "...";
hash = "...";
};
})
This isn't caused nor exacerbated by
finalAttrs
at all.
The problem is that it looks more correct.
Something like "oh, I see the author is aware of the rec
problem. Great, now I can override the version easily."
It's a flawed thought when you think about the hash, but if someone's in a hurry or they're distracted, and the build still works, they'll be confused:
"The build worked, so it can't be the hash."
Yes, it is scope creep, but it's quite a serious issue that is exacerbated by this, even if slightly.
Deriving the package version from the fetcher.
I think this is the most promising pattern; something like:
version = finalAttrs.src.version;
src = fetchFromGitHub {
rev = "v2.1";
# ...
};
if someone's in a hurry or they're distracted, and the build still works, they'll be confused
I doubt most users are aware of the implementation details of everything they override. If they were, surely after the first couple of "FOD moments" they'd know better?
Either way this seems more like an opportunity to document/educate. FODs are already discussed in the nix manual, so the nixpkgs manual's section on overriding might direct them there with some ancillary notes.
Deriving the package version from the fetcher.
I think this is the most promising pattern; something like:
This forces users to not use their intuition to override version
; this behaviour would be extremely surprising to encounter.
Moreover, extracting version
from src.rev
would be more tortured atm, as you can't use string interpolation; are you suggesting to change the fetcher API as well?
Yes, the idea was that each fetcher sets version
.
It could be something like
rev
is a git shastripPrefix "v" rev
if it's v[0-9].*
[0-9].*
version
attribute otherwiseI don't think fetchers currently set a version
attribute, so this should be fine.
This forces users to not use their intuition to override
version
;
At least the package won't have any code that could make them believe that overriding just version
could work.
I don't think this is an intuition, but rather an idea that occurs because they read the code wrong.
IMO: this should be done after https://github.com/NixOS/nixpkgs/pull/234651
seems like it has potential to silently break packages that don't have enough unit tests due to subtle differences in semantics.
How would he ensured that there won't be an unmanageable amount of silent breakages especially in people's overlays?
Also how would packages be filtered which would be broken by this change because they previously relied on the rec behavior?
Also it would probably be a good idea to first write a tool to verify the correctness of FOD hashes to easily spot breakages in any urls which might change because of wrong usages of pname as part of URLs/paths/binary names in combination with overlays which append suffixes like -headless
or -minimal
.
- A short rev if
rev
is a git sha
This would conflict with our current contributing guide and the -unstable-XXXX-XX-XX suffix. Also short refs are not really recommended for long term use.
stripPrefix "v" rev if it's v[0-9].*
Which would probably also need to consider refs/tags prefix and we probably collect more of those over time and could easily make it complicated.
i think packages should be updated to use finalAttrs as they're touched honestly, even if you made the commits automatically, you would still need to review them manually, and that would probably be about the same amount of work, but because github only allows you to approve a PR and not individual files of a PR, all that review work would probably have to be done by one person (actually several, a change this big should require multiple approvals)
there's lots of patterns that are inoccuous with rec
that are a disaster waiting to happen with finalAttrs
there's lots of patterns that are inoccuous with
rec
that are a disaster waiting to happen with finalAttrs
I'd agree, if this tool didn't account for the main point of rec
misuse: name
/pname
references. I was wary myself before testing the tool - and I think fgaz
undersold the considerations put in to design the tool - but I currently think it is robust enough to at least initiate a PR.
Give it a try on some individual packages :)
I'd agree, if this tool didn't account for the main point of rec misuse
but we also need to look at the less often used misuses, otherwise we could run into a situation where a tiny fraction of the changes, still takes a significant amount of the time to fix things afterwards. I don't have good idea, how we could tell apart packages which are very safe to change and which will likely cause issues. Maybe we would need to analyze the AST and can then categorize the packages based on that and start with the very safe packages and do the complicated ones by hand.
I skimmed through the source code and I think it currently only checks pname which is in my opinion not enough and the to do list should get an entry to develop the tool and write good tests for it.
Also the following comment from the readme makes me think that this is not rock solid and deeply tested:
Warning: Even though the library checks that derivations did not change, skimming the output for errors is a good idea. The program could have incorrectly changed something in meta, passthru, or an expression that is only evaulated under certain conditions.
Without
finalAttrs
, you'd still need to setsrc
, and more verbosely at that:packageName.overrideAttrs (oldAttrs: { version = "0.1.1"; src = fetchFromGitHub { repo = "..."; owner = "..."; rev = "..."; hash = "..."; }; })
This can already be shortened with override https://github.com/NixOS/nixpkgs/blob/01048ff3150ac438b9ffac42fb1ab8054fcae9d8/pkgs/applications/finance/odoo/default.nix#L14-L20
Indeed I forgot to mention the measures I took to ensure a robust migration:
meta
and passthru
).pname
misuse is detected, the package is skipped.
version
, buildInputs
, ...) insteadAlso the following comment from the readme makes me think that this is not rock solid and deeply tested:
Warning: Even though the library checks that derivations did not change, skimming the output for errors is a good idea. The program could have incorrectly changed something in meta, passthru, or an expression that is only evaulated under certain conditions.
That text is outdated, I'll remove it. meta
and passthru
are now checked. Regarding conditionals, as far as I can see they don't really hide undefined references from the same file (I will check).
write good tests for it
I should definitely add some. I've been using nixpkgs itself to test the tool, and I can extract some golden tests from it.
but we also need to look at the less often used misuses
Maybe I can have the tool output a list of all the used recursive variable names?
I think this is the most promising pattern; something like:
version = finalAttrs.src.version; src = fetchFromGitHub { rev = "v2.1"; # ... };
I thought about this, and I agree with @SuperSandro2000 that it isn't the right solution.
src
depends on version
, not the oppositesrc
very often isn't as simple as fetchgit { rev = "v${version}", ... }
, and extracting version
from src
can be a lot more complex than the opposite.
fetchurl { url = "https://example.org/foo-1.2.3-1.tar.gz", ... }
requires a regexsrcs
?git grep 'replaceStrings \[ *"." *\] \[ *"" *\]'
in nixpkgsrev
is "v${version}"
, but I think this is too much of a POLA violationThe problem of the irrelevant FOD hashes can be solved by embedding version
or rev
into the name
of the FOD. The store path will change every time the version changes, forcing a re-fetch.
We could make fetchers support pname
and version
, and provide something like tagPrefix
to enable auto-construction of rev
if not otherwise specified. The down side is that the file extension would be lost, which might break some tryUnpack
hooks. To fix this, nameSuffix
could optionally be specified.
The problem of the irrelevant FOD hashes can be solved by embedding
version
orrev
into thename
of the FOD. The store path will change every time the version changes, forcing a re-fetch.We could make fetchers support
pname
andversion
, and provide something liketagPrefix
to enable auto-construction ofrev
if not otherwise specified. The down side is that the file extension would be lost, which might break sometryUnpack
hooks. To fix this,nameSuffix
could optionally be specified.
I made a PR #294068 to demonstrate this solution.
The problem of the irrelevant FOD hashes can be solved by embedding version or rev into the name of the FOD. The store path will change every time the version changes, forcing a re-fetch.
just want to let you know this has been tried several times before.
The problem of the irrelevant FOD hashes can be solved by embedding version or rev into the name of the FOD. The store path will change every time the version changes, forcing a re-fetch.
just want to let you know this has been tried several times before.
Are there some PR about these attempts? What obstacles did they encounter?
here's a few attempts at solving the fixed-output derivation problem
NixOS/nix#7999 NixOS/nix#969
Thank you!
here's a few attempts at solving the fixed-output derivation problem
NixOS/nix#7999 NixOS/nix#969
Those proposal targets Nix itself, while mine works on the fetchers inside Nixpkgs and doesn't affect the behavior of Nix.
see also NixOS/rfcs#171: Default name of fetchFromGithub FOD to include revision
@ShamrockLee also note that nix flakes seem to have implemented some kind of workaround for this, as i haven't seen this problem occur there.
@ShamrockLee also note that nix flakes seem to have implemented some kind of workaround for this, as i haven't seen this problem occur there.
@lolbinarycat I haven't dug deep into the C++ implementation of flake inside Nix, but since flake.lock
includes the input specifications (including revision specification) as nodes.<input>.original
, it's quite easy for compare such specifications between flake.nix
and flake.lock
, and re-fetch upon mismatch.
At Nix level, they can see the original store derivation (/nix/store/*.drv
). and have space to decide if they would like to re-realize the outPath
s or just reuse them, after the ovewriting the store derivation. The latter is the current implementation. The former serves as a universal solution to the FOD re-fetching issues, but making the "original store path" a factor to determine the realization approach inevitably complicates the realization behavior and make Nix realization more "stateful".
At Nixpkgs level, all we have is the current specification, nothing else. All we could use to invalidate previous store paths are new store paths. As the drvPath
of a FOD is determined by both name
and outputHash
and that the outputHash
is specified by the user, our best bet is to code the revision into the name
attribute. That's what #294068 and NixOS/rfcs#171 is about. (The review process of #294068 is paused in favour of NixOS/rfcs#171.)
The
finalAttrs
pattern in fact makes it easier to overridesrc
, as you may only need to setoutputHash
:packageName.overrideAttrs (oldAttrs: { version = "0.1.1"; src = oldAttrs.src.overrideAttrs { outputHash = "..."; }; })
:thinking: So here oldAttrs
is actually finalAttrs
which picks the new overridden version
? It seems weird to me, if it's not oldAttrs
, why does src = finalAttrs.src.overrideAttrs ...
not causing an immediate infinite recursion? If it's actually not oldAttrs
, we also need to update all relevant documentations.
@oxalica oldAttrs
here is prevAttrs
not finalAttrs
https://nixos.org/manual/nixpkgs/stable/#sec-pkg-overrideAttrs
@oxalica no, oldAttrs
in that example is prevAttrs
, not finalAttrs
. Here's a full example that should hopefully demonstrate, why this works:
# hello_2_10/package.nix
{ stdenv, fetchurl }:
stdenv.mkDerivation (finalAttrs: {
pname = "hello";
version = "2.10";
src = fetchurl {
url = "mirror://gnu/hello/hello-${finalAttrs.version}.tar.gz";
hash = "sha256-MeBmE3qWJnbon2nRtlOC3pWn732RS4y5VvQepy4PUWs=";
};
})
# some_overlay.nix
hello_2_12 = hello_2_10.overrideAttrs (
finalAttrs: prevAttrs: {
version = "2.12";
src = prevAttrs.src.overrideAttrs {
outputHash = "sha256-zwSvhtwIUmjF9EcPuuSbGK+8Iht4CWqrhC2TSna60Ks=";
};
}
);
The reason, why you don't have to manually propagate version
to src
is because the original derivation used finalAttrs.version
when specifying the url
in fetchurl
(or something like rev = "v${finalAttrs.version}"
in the case of fetchFromGithub
).
Edit: sniped, lol
yeah the arguments are opposite in overrideAttrs
and mkDerivation
mkDerivation
final is first
overrideAttrs
prev is first
yeah the arguments are opposite in
overrideAttrs
andmkDerivation
mkDerivation
final is firstoverrideAttrs
prev is first
I wouldn't say "opposite". Here are the valid "signatures":
{
# Old-style signatures
# You can still use them, but they don't cooperate well with multiple overrideAttrs
# rec { } was commonly used with these signatures, but now it's considered an antipattern
_ = mkDerivation { };
_ = overrideAttrs { };
_ = overrideAttrs (prevAttrs: { });
# New-style signatures
# (added in PR 119942)
_ = mkDerivation (finalAttrs: { });
_ = overrideAttrs (finatlAttrs: prevAttrs: { });
}
mkDerivation
takes zero or one argument (it never takes prevAttrs
, because there is no such thing as "previous" attributes, when creating the derivation from scratch) and overrideAttrs
takes zero, one or two arguments.
I recently wrote a tool that automatically performs a migration to the
finalAttrs
pattern (#119942), and I believe it is now robust enough to be let loose on nixpkgs. However this is the first large treewide change I'm performing, so I'm not 100% sure how to proceed. Some questions:cc @drupol who was interested in this
Example run of the tool on pkgs/applications and pkgs/by-name: https://github.com/fgaz/nixpkgs/commits/finalattrs-test-8/