Open takeda opened 5 years ago
I don't think Nix keeps track of creation date for derivations. ctime should always be 0. Perhaps with something like FUSE we could keep track of that or even things like last access time: https://groups.google.com/forum/#!searchin/nix-devel/nix-collect-garbage%7Csort:date/nix-devel/ej-21kuvCyw/0huYqAosBQAJ
There is an SQLite database, maybe that keeps track of this?
Yes, the SQLite database keeps track of the "registration time", i.e. the time the path was added to the DB:
# sqlite3 /nix/var/nix/db/db.sqlite 'select path, registrationTime from ValidPaths order by registrationTime limit 10'
/nix/store/vn6fkjnfps37wa82ri4mwszwvnnan6sk-glibc-2.25|1490003668
/nix/store/gij6mgj1vixf7qcyb13h5aa5y15r2xxd-attr-2.4.47|1490003668
/nix/store/v0wcqsb6vpljx13vw8q60dvldf5pffma-acl-2.2.52|1490003669
...
So the garbage collector could use this. However it's not clear whether this is a good idea because an old path may still be recently used.
(Once upon a time the garbage collector used atime, but most systems don't maintain atime anymore.)
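As a rough illustration of how the registration time could feed an age check, here is a minimal Python sketch. It assumes only the ValidPaths table and the path and registrationTime columns shown in the query above; the function names paths_registered_before and cutoff_for_age are hypothetical and not part of Nix.

```python
import sqlite3
import time


def cutoff_for_age(days, now=None):
    """Return the Unix timestamp that lies `days` days before `now`."""
    now = time.time() if now is None else now
    return int(now - days * 86400)


def paths_registered_before(conn, cutoff):
    """Return (path, registrationTime) rows registered strictly before
    `cutoff` (Unix seconds), oldest first.

    Assumes a ValidPaths table with `path` and `registrationTime`
    columns, as in the sqlite3 query quoted in this thread.
    """
    cur = conn.execute(
        "SELECT path, registrationTime FROM ValidPaths "
        "WHERE registrationTime < ? ORDER BY registrationTime",
        (cutoff,),
    )
    return cur.fetchall()
```

Against a real store this would presumably be called with a read-only connection, for example sqlite3.connect("file:/nix/var/nix/db/db.sqlite?mode=ro", uri=True), which may require root. It only lists candidates by age; whether a path is still reachable from a GC root is a separate check that this sketch does not attempt.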
Well, it would still be an improvement over the current behavior, which purges these paths anyway. Note I'm not asking to replace the current generation check, just to add an extra requirement before removing (perhaps with an option to enable/disable it if this ruins someone's workflow).
The atime would also be nice to use if available. (I don't have a way to check right now, but what does the system report as atime when it's not available: 0, the current time, or some other value?) If it is 0, this would just work without extra code; otherwise we would ignore it.
I marked this as stale due to inactivity.
Still important to me. Even if slightly flawed, discarding only what's been added >n days ago would still be very helpful and a nice medium between --gc and --delete.
Given that the world is rebuilt every month or so anyways, the chance of discarding recently used output paths is rather slim even with this flawed method.
Maybe gc could also differentiate between fixed-output (, CA) and regular drvs.
Another useful property to filter for when gc'ing would be drvs that are in the build-time closure of drvs that are supposed to be kept (gcroot'd or added too recently). This could be incredibly useful for robotnix users, who have to manage gcroots for build envs if they want the dozens-of-GiB-large Android sources to persist through a gc, but also for some use cases in NixOS.
This may risk going offtopic, but I'd quite like an access-time-basis for prioritizing deletions. Doing quite a lot of nix development, many things I build may not currently be in a gcroot, but it's likely that a new build will want to reference a recently accessed package again.
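A minimal Python sketch of what access-time-based prioritization could look like: given a list of candidate paths, order them so the least recently accessed come first. The function name oldest_access_first is hypothetical, and the sketch only looks at the atime of each top-level path, which is an assumption; a real tool might want to inspect the files inside each store path as well.

```python
import os


def oldest_access_first(paths):
    """Sort paths by last access time, least recently accessed first.

    Uses lstat so that a symlink is judged by its own atime rather
    than the atime of its target.
    """
    return sorted(paths, key=lambda p: os.lstat(p).st_atime)
```

Deleting from the front of this list until enough space is freed would keep recently touched paths around for longer, provided the filesystem actually maintains atime.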
@risicle atimes aren't recorded since the Nix store you access as a user is readonly to you.
Mmmmm atime is certainly updated for me.
$ stat /nix/store/zxmp0hm86g25inbllb8610c9mwxglik8-libelf-0.8.13
File: /nix/store/zxmp0hm86g25inbllb8610c9mwxglik8-libelf-0.8.13
Size: 4096 Blocks: 8 IO Block: 4096 directory
Device: fe05h/65029d Inode: 1584906 Links: 5
Access: (0555/dr-xr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2022-10-09 12:09:44.741765352 +0100
Modify: 1970-01-01 01:00:01.000000000 +0100
Change: 2022-10-09 12:09:44.741765352 +0100
Birth: 2022-10-09 12:09:44.729765022 +0100
$ ls /nix/store/zxmp0hm86g25inbllb8610c9mwxglik8-libelf-0.8.13
include lib share
$ stat /nix/store/zxmp0hm86g25inbllb8610c9mwxglik8-libelf-0.8.13
File: /nix/store/zxmp0hm86g25inbllb8610c9mwxglik8-libelf-0.8.13
Size: 4096 Blocks: 8 IO Block: 4096 directory
Device: fe05h/65029d Inode: 1584906 Links: 5
Access: (0555/dr-xr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2022-11-06 14:09:51.755988282 +0000
Modify: 1970-01-01 01:00:01.000000000 +0100
Change: 2022-10-09 12:09:44.741765352 +0100
Birth: 2022-10-09 12:09:44.729765022 +0100
Ah, might be because of relatime. It only updates atime if mtime has been modified since the last access which it obviously wouldn't have. Forgot it had that property too in addition to the once-per-day update limit.
An advantage I see of atime is it'll get updated even if your machine is sharing packages over e.g. nix serve and a package is requested.
I've been playing with some ideas around this over at https://github.com/risicle/nix-heuristic-gc
+1 for nix-serve and nix-serve-ng. Doing remote copies through nix copy is also a problem: with nix min-free and max-free, it automatically gc's derivations that have just been copied to the store, which makes no sense.
Ah, might be because of relatime. It only updates atime if mtime has been modified since the last access which it obviously wouldn't have. Forgot it had that property too in addition to the once-per-day update limit.
This is boolean "or", not boolean "and":
relatime maintains atime data, but not for each time that a file is accessed. With this option enabled, atime data is written to the disk only if the file has been modified since the atime data was last updated (mtime), or if the file was last accessed more than a certain amount of time ago (by default, one day). (via)
So relatime (which is the default on many systems IIRC?) should be well-suited for gc removal?
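To make the "or" explicit, here is a small Python sketch of the rule as described in the quoted text. It models only the two conditions mentioned there (modification since the last atime update, or atime older than one day); the function name and parameters are hypothetical, and real kernels may apply additional checks not covered here.

```python
DAY = 24 * 60 * 60  # default relatime threshold from the quoted text, in seconds


def relatime_updates_atime(atime, mtime, now, max_age=DAY):
    """Return True if an access at time `now` would update atime under
    the relatime rule as quoted: either condition alone is sufficient.
    All arguments are Unix timestamps in seconds.
    """
    modified_since_last_access = mtime >= atime
    last_access_too_old = now - atime >= max_age
    return modified_since_last_access or last_access_too_old
```

For a store path whose mtime is pinned near the epoch, the first condition is never true, so atime advances at most about once per day of actual use. That granularity is probably fine for a garbage collector that thinks in days.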
I was wondering why things were being deleted!
Currently nix-collect-garbage has an option to delete only entries older than a given amount of time. It looks like that setting only applies to generations. If someone uses nix for development and rarely uses nix-env, calling garbage collection will still wipe most derivations. If there was an option to also check the time when a given store path was created and not delete anything recent, it would reduce the number of packages that need to be fetched again, and it would also reduce the amount of data fetched from the caching server.