zhaofengli / attic

Multi-tenant Nix Binary Cache
https://docs.attic.rs

warning: 'https://cache.domain.com/prod' does not appear to be a binary cache #140

Closed mannp closed 5 months ago

mannp commented 5 months ago

warning: 'https://cache.domain.com/prod' does not appear to be a binary cache

Really not sure what I am missing :-/

The cache is private and each machine has a token to pull only.

Using a push token, I can push to the cache fine, as per the examples.

cole-h commented 5 months ago

How are the "pull only" tokens setup? For Nix to be able to pull from the cache, you'll need to configure a netrc-file that has that token configured for your domain (i.e. machine cache.domain.com login asdf password [the token]). I believe attic use [cache] should do this for you.
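For reference, a netrc entry is a single line of the form `machine <host> login <name> password <secret>`. The sketch below writes such an entry and reads it back with Python's standard `netrc` module to confirm it is well-formed. The helper names `write_netrc` and `read_password` and the login value `attic` are placeholders chosen for illustration, not something attic or Nix prescribes.

```python
import netrc
import os


def write_netrc(path, host, token, login="attic"):
    """Write a single-entry netrc file that only the owner can read.

    `login` is an arbitrary placeholder; the token goes in the password field.
    """
    line = f"machine {host} login {login} password {token}\n"
    # Create the file with mode 0600 from the start so the token is never
    # world-readable, even briefly.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        fh.write(line)


def read_password(path, host):
    """Return the password stored for `host`, or None if there is no entry."""
    entry = netrc.netrc(path).authenticators(host)
    return None if entry is None else entry[2]
```

Calling `read_password(path, "cache.domain.com")` on the generated file should give back the token; `None` would mean the host name in the file does not match the substituter's host.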

mannp commented 5 months ago

I used `attic use prod` and `attic login prod https://cache.domain.com` manually on each machine that should use the cache, for now.

This created the file below:

~~> bat ~/.config/attic/config.toml

File: /home/user/.config/attic/config.toml

default-server = "prod"

[servers.prod]
endpoint = "https://cache.domain.com"
token = "redacted"

... and I added this to my nix config across all machines

nix.settings = {
  substituters = ["https://cache.domain.com/prod"];
  trusted-public-keys = ["prod:<redacted>"];
};

netrc-file

I don't appear to have this file, is it created in the user home or root home?

Mmh, attic use prod gives me the following error right at the end after printing the cache details;

->> attic use prod
Configuring Nix to use "prod" on "prod":

Is this error related to the netrc-file? is the file /home/user/.config/attic/config.toml not enough?

Or is this trying to change configuration.nix?

Edit: going to try this -> https://github.com/zhaofengli/attic/issues/57#issuecomment-1596470368

Thanks

mannp commented 5 months ago

All confirmed in place;

nix.settings.extra-substituters = [ "https://cache.domain.com/prod" ];
nix.settings.extra-trusted-public-keys = [ "prod:xxxxxxxx" ];
nix.settings.netrc-file = config.sops.secrets.netrc-file.path;

building the system configuration...
warning: 'https://cache.domain.com/prod' does not appear to be a binary cache
copying 4 paths...

Watch-store copied a ton of derivations to the cache, so it's definitely populated.

The only thing I can think of is the scope of the token. Is this correct for pull-only use?

atticd-atticadm make-token \
  --validity "1y" \
  --sub "prod*" \
  --pull "prod*"

Would appreciate any ideas to debug this...is that a nix warning? Thanks.

cole-h commented 5 months ago

Yes, that's a Nix warning. To see if the token works, you could try running curl https:/..../prod/nix-cache-info -H 'Authorization: Bearer [the token]' (that's basically what Nix does before showing the warning of "is not a cache")

If that responds without error, then something is wrong with your Nix configuration; otherwise, if it does error, something is wrong with the token.

The sub for the token is just a description of sorts, so the asterisk may be doing weird things.
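The same check can be scripted. This is a minimal sketch of the curl command above using Python's standard library; the function names `build_cache_info_request` and `fetch_cache_info` are made up for illustration, and the cache URL and token are whatever you pass in.

```python
import urllib.request


def build_cache_info_request(cache_url, token):
    """Build a GET request for <cache_url>/nix-cache-info with a Bearer token."""
    url = cache_url.rstrip("/") + "/nix-cache-info"
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )


def fetch_cache_info(cache_url, token, timeout=10):
    """Send the request and return (status, body).

    urllib raises urllib.error.HTTPError for 4xx/5xx responses, so an
    exception here points at the token or the server rather than at Nix.
    """
    request = build_cache_info_request(cache_url, token)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", "replace")
```

Running `fetch_cache_info("https://cache.domain.com/prod", token)` once per token makes it easy to compare the pull-only and push/pull tokens side by side.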

mannp commented 5 months ago

Both the pull-only and push/pull tokens return:

WantMassQuery: 1
StoreDir: /nix/store
Priority: 41

Mmh, so what could be wrong with the config, though? Adding the substituters seems too benign a change to be causing problems.

I don't get any warnings using harmonia as the cache.
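The three-line response quoted above is the plain-text `nix-cache-info` format: one `Key: value` pair per line. A small sketch of parsing it follows. The `looks_like_binary_cache` check on `StoreDir` is an assumption about what a plausible response contains, not a reproduction of Nix's own logic, and both function names are hypothetical.

```python
def parse_cache_info(text):
    """Parse `Key: value` lines into a dict, ignoring lines without a colon."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def looks_like_binary_cache(text, store_dir="/nix/store"):
    """Heuristic: the body names the expected store directory."""
    return parse_cache_info(text).get("StoreDir") == store_dir
```

An HTML login page or an error body returned with status 200 would fail this check, which is one way a curl test can appear to succeed while the response is not actually cache metadata.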

cole-h commented 5 months ago

One last test to make sure your netrc is also well-formed is to do the same curl as above, but replace -H ... with --netrc-file /path/to/the/file and see if that also works.

mannp commented 5 months ago

--netrc-file /path/to/the/file

So this?

curl https://cache.domain.com/prod/nix-cache-info --netrc-file ~/.config/nix/netrc

returns;

WantMassQuery: 1
StoreDir: /nix/store
Priority: 41

Still getting it ... wondering if it's my nix version.

warning: 'https://cache.domain.com/prod' does not appear to be a binary cache

The s3 store is populating and attic-watch is definitely pushing to the cache fine.

cole-h commented 5 months ago

Is that (~/.config/nix/netrc) the same path as what config.sops.secrets.netrc-file.path refers to?

If you run the command that returns "does not appear to be a binary cache" with --netrc-file ~/.config/nix/netrc, does it work? What if you set nix.settings.netrc-file = "/home/user/.config/nix/netrc"; directly instead of going through sops?

If both of those still work, then there's definitely something strange going on. I can't help much more without seeing logs (by adding -vvvvvvvv to the problematic command). If you don't want to publicize them, you might look for Server auth using Basic in the output of that and see what the problem is.
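The "Server auth using Basic" line is worth looking for because credentials taken from a netrc file are sent as HTTP Basic auth, unlike the Bearer header used in the earlier curl test. As a rough illustration, this is how a login/password pair turns into that header; `basic_auth_header` is a hypothetical helper, and the login value is whatever the netrc entry contains.

```python
import base64


def basic_auth_header(login, password):
    """Return the Authorization header value for HTTP Basic auth."""
    raw = f"{login}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")
```

Decoding the value seen in the verbose log and comparing it with the expected `login:token` pair shows whether Nix picked up the intended netrc entry.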

mannp commented 5 months ago

Thanks for your time and patience @cole-h I will do some more debugging tomorrow with your inputs above and see if I can understand what is going on.

mannp commented 5 months ago

Made a little progress: two machines now see the cache and pull from it, yay. That said, the other few still report the "does not appear to be a binary cache" warning.

As they are using the same base config, I have not found a definitive reason for the issue on some machines.

mannp commented 5 months ago

@cole-h I created a limited pull/push token and applied it to the clients using the cache, and I no longer get the warning from Nix.

I don't know how to debug the command that Nix is using that causes the error, but perhaps it is trying to push some status back to the cache and gives the message it does?

cole-h commented 5 months ago

I think you may have run into https://github.com/zhaofengli/attic/issues/133 (which was "fixed" by https://github.com/zhaofengli/attic/pull/135), or something similar.

Basically, if --pull 'prod*' wasn't the only permission you gave the token that could match the pattern of where you were pushing/pulling, it's possible that attic would sometimes read the "correct" permission from the token, and other times read the "incorrect" permission (due to random iteration order).
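To make the iteration-order point concrete, here is a toy model. It is not attic's actual code: the rule list, the glob matching via `fnmatchcase`, and both function names are assumptions for illustration. `first_match` returns the permissions of whichever matching pattern comes first, so its result depends on ordering; `union_match` is a different, order-independent strategy shown only for contrast.

```python
from fnmatch import fnmatchcase


def first_match(rules, cache):
    """Return the permissions of the first pattern that matches `cache`.

    `rules` is a list of (pattern, permissions) pairs; the outcome depends
    on the order in which the pairs are visited.
    """
    for pattern, perms in rules:
        if fnmatchcase(cache, pattern):
            return frozenset(perms)
    return frozenset()


def union_match(rules, cache):
    """Return the union of permissions from every matching pattern."""
    granted = set()
    for pattern, perms in rules:
        if fnmatchcase(cache, pattern):
            granted |= set(perms)
    return frozenset(granted)
```

With rules `[("prod*", {"pull"}), ("*", {"pull", "push"})]`, `first_match` grants only `pull` on `prod`, while the reversed list grants both. If the order is effectively random, the same token can behave differently from one request to the next.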

mannp commented 5 months ago

I see, thanks.

Wasn't that PR merged and I was still having issues though? Anyway, will keep monitoring and report anything further.

cole-h commented 5 months ago

Oh, I was assuming you hadn't updated recently (an unfair assumption on my part, sorry). It's also possible that the fix kinda "caused" the issue because of the now "constant" iteration order: if you had another wildcard that would match before prod*, that could also explain it.