suderman / nixos

system configurations & dotfiles

Custom/Made up TLD? #13

Closed buxel closed 4 months ago

buxel commented 8 months ago

Hi @suderman! This is more or less a follow-up to #11.

Having seamless 'service.hostname' domains between VPS and homelab is pretty decent!

Reading your implementation of the traefik helper, I wondered if it is possible to have a more stable custom TLD. As things are always shifting (for me), it may come in handy to have a "stable" address for services that is not bound to machines' hostnames. For example, anything my wife and family use should be available at ".zz".

I played with some variations of this

  # wiki is available at "wiki.bender"

  # attempt 1
  modules.silverbullet = { enable = true; name = "wiki"; };
  modules.traefik.routers.wiki = "http://wiki.zz:80"; # ".zz" is not a hostname of any machine. It is a made up TLD
  # attempt 2
  modules.silverbullet = { enable = true; name = "wiki.zz"; };

I think this is getting pretty close. Both seem to generate the required config for traefik, but blocky is missing a CNAME to properly route the request. At least that's what I assume.

I started reading the blocky code but i could not follow it all the way through. Any pointers? 😇

suderman commented 8 months ago

You can just extend Blocky's mapping in your configuration. If you have multiple Blocky servers, you'll need to repeat this extension in each configuration:

modules.blocky.enable = true;
services.blocky.settings.customDNS.mapping = { zz = "10.1.0.5"; };

...or, using this.networks, refer to the system by name:

modules.blocky.enable = true;
services.blocky.settings.customDNS.mapping = { zz = this.networks.home.lux; };

With the above, so long as you have no other zz mappings in your config, Blocky will resolve zz, wiki.zz, whatever.wiki.zz to 10.1.0.5. Later, you can swap out zz's value with another IP and your host name can stay the same.
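
Since the mapping has to be repeated on every Blocky server, one way to keep the copies from drifting apart is to put it in a small shared module and import that file on each host. This is only a sketch: the file name zz-dns.nix is made up, the IP is the one from the example above, and it assumes plain `imports` are used to pull the file in.

```nix
# zz-dns.nix (hypothetical file name)
# Add this file to `imports` on every host that runs Blocky.
{ ... }:
{
  modules.blocky.enable = true;

  # One entry covers zz, wiki.zz, whatever.wiki.zz, and so on,
  # as described above.
  services.blocky.settings.customDNS.mapping = {
    zz = "10.1.0.5"; # swap this value later; the names stay the same
  };
}
```

Each Blocky host would then carry something like `imports = [ ./zz-dns.nix ];`, with the path adjusted to wherever the file actually lives.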

buxel commented 8 months ago

It may be my incomplete setup or my method of testing (dig/curl 'wiki.zz'), but I could not get it to work. I need to come back to this once I have sorted out the "home" network. Currently, my testing is limited to a VPS and some manual messing with my router's static rules... In short: I assume some fault at my end.

But my point was a little different: I'd like to statically map specific service + domain combinations, i.e.:

# "The usual" setup
wiki.bender -> Tailnet (VPS) 
jelly.nas -> Home

# Specific mappings 
wiki.zz -> wiki.bender
jelly.zz -> jelly.nas

That way, with more servers involved, moving services between hosts, my "users" would not be affected by this. Does that make sense? Is that possible with the custom DNS mappings in blocky?

suderman commented 8 months ago

The issue stems from the fact my Traefik module doesn't recognize anything.zz as an internal host name, and won't generate a certificate or add an IP mapping for Blocky. I added a new option extraInternalHostNames to deal with this scenario, and it can be used like so:

modules.traefik = {
  enable = true;
  routers."wiki.zz" = "https://wiki.bender";
  extraInternalHostNames = [ "wiki.zz" ];
};

This creates an additional reverse proxy on top of the original wiki.bender route, but now paying attention to wiki.zz requests as well. I tested this on my own rig and it all checks out. No need to modify services.blocky.settings.customDNS.mapping with this, as the Traefik module takes care of that now.
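
Applied to both mappings from the question, the same pattern might look like the sketch below. It is untested and assumes that jelly.nas (the name used earlier in the thread) is already a working route, that several routers can be declared side by side in the same way, and that the machine running this Traefik instance can reach both targets.

```nix
{
  modules.traefik = {
    enable = true;

    # Stable, user-facing name -> wherever the service lives today
    routers."wiki.zz" = "https://wiki.bender";
    routers."jelly.zz" = "https://jelly.nas";

    # Made-up TLD names must be declared explicitly so they get a
    # certificate and an IP mapping in Blocky
    extraInternalHostNames = [ "wiki.zz" "jelly.zz" ];
  };
}
```

When a service moves to another host, only the right-hand side of its router line should need to change; the .zz names handed out to users stay the same.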

suderman commented 8 months ago

For completeness, I'll also mention the option of using a real domain (added to CloudFlare) and allowing Let's Encrypt to handle the certificates. For a service only available via LAN or VPN, this would work:

{
  modules.traefik = {
    enable = true;
    routers."wiki.mydomain.com" = {
      url = "https://wiki.bender";
      public = false;
    };
  };
}

The public = false flag is necessary because any hostName containing dots (in this case wiki.mydomain.com) is interpreted as using an external CA with public DNS. An external CA (Let's Encrypt) for a real domain is good, but private DNS is desirable for private routes.

I had to do a setup like this recently for Immich, because of a bug in the Android mobile application regarding custom CAs. Although I could log in and use the app, background uploads were failing on account of my custom CA not being trusted. I assume this will be fixed down the road, but for now, using a certificate from Let's Encrypt solved the problem.

Alternatively, if I did want the above configuration available to the Internet, I could deploy the following on a VPS or any machine with a public IP address:

{
  modules.traefik = {
    enable = true;
    routers."wiki.mydomain.com" = {
      url = "https://wiki.bender";
      public = true;
    };
  };
}

Or another way of writing the same thing above:

{
  modules.traefik = {
    enable = true;
    routers."wiki.mydomain.com" = "https://wiki.bender";
  };
}

Since this is now public, it will also trigger an update through CloudFlare's API to create a DNS record for wiki.mydomain.com and point it to this server's IP address. This way, the above route is using both CloudFlare DNS and Let's Encrypt CA instead of Blocky DNS and custom CA.

buxel commented 8 months ago

Wow, you are almost making this way too easy for me.... almost! 😉 I appreciate it.

Off topic: I'm capable of writing my own simple modules by now, but I wonder... how on earth do you iterate/debug more complex stuff? E.g. the helper functions for traefik are quite elaborate. Do you just run and test them in a server's configuration, or are there other ways to test and validate? Another challenge for me is to understand where all the inputs for a module are coming from. I have used { config, pkgs, lib, this, ... } without questioning it, but I feel like I'm missing an essential piece of understanding here.

Have a nice weekend!

suderman commented 8 months ago

I wish there were more tools to debug this stuff... maybe there are and I just haven't found them?

I lean heavily on nix repl (or the nixos repl wrapper in my config). Especially when focussing on building nix modules, I'll explore via typing and tab completion.

For example, here I use tab to list everything available under postgresql systemd service, and then complete it with postStart and view that by hitting enter:

nix-repl> hub.config.systemd.services.postgresql.<TAB>         
hub.config.systemd.services.postgresql.after                  hub.config.systemd.services.postgresql.reload
hub.config.systemd.services.postgresql.aliases                hub.config.systemd.services.postgresql.reloadIfChanged
hub.config.systemd.services.postgresql.before                 hub.config.systemd.services.postgresql.reloadTriggers
hub.config.systemd.services.postgresql.bindsTo                hub.config.systemd.services.postgresql.requiredBy
hub.config.systemd.services.postgresql.confinement            hub.config.systemd.services.postgresql.requires
hub.config.systemd.services.postgresql.conflicts              hub.config.systemd.services.postgresql.requisite
hub.config.systemd.services.postgresql.description            hub.config.systemd.services.postgresql.restartIfChanged
hub.config.systemd.services.postgresql.documentation          hub.config.systemd.services.postgresql.restartTriggers
hub.config.systemd.services.postgresql.enable                 hub.config.systemd.services.postgresql.runner
hub.config.systemd.services.postgresql.environment            hub.config.systemd.services.postgresql.script
hub.config.systemd.services.postgresql.jobScripts             hub.config.systemd.services.postgresql.scriptArgs
hub.config.systemd.services.postgresql.onFailure              hub.config.systemd.services.postgresql.serviceConfig
hub.config.systemd.services.postgresql.onSuccess              hub.config.systemd.services.postgresql.startAt
hub.config.systemd.services.postgresql.overrideStrategy       hub.config.systemd.services.postgresql.startLimitBurst
hub.config.systemd.services.postgresql.partOf                 hub.config.systemd.services.postgresql.startLimitIntervalSec
hub.config.systemd.services.postgresql.path                   hub.config.systemd.services.postgresql.stopIfChanged
hub.config.systemd.services.postgresql.postStart              hub.config.systemd.services.postgresql.unitConfig
hub.config.systemd.services.postgresql.postStop               hub.config.systemd.services.postgresql.wantedBy
hub.config.systemd.services.postgresql.preStart               hub.config.systemd.services.postgresql.wants
hub.config.systemd.services.postgresql.preStop

nix-repl> hub.config.systemd.services.postgresql.postStart <ENTER>
"PSQL=\"psql --port=5432\"\n\nwhile ! $PSQL -d postgres -c \"\" 2> /dev/null; do\n    if ! kill -0 \"$MAINPID\"; then exit 1; fi\n    sleep 0.1\ndone\n\nif test -e \"/var/lib/postgresql/14/.first_startup\"; then\n  \n  rm -f \"/var/lib/postgresql/14/.first_startup\"\nfi\n$PSQL -tAc \"SELECT 1 FROM pg_database WHERE datname = 'me'\" | grep -q 1 || $PSQL -tAc 'CREATE DATABASE \"me\"'\n$PSQL -tAc \"SELECT 1 FROM pg_database WHERE datname = 'root'\" | grep -q 1 || $PSQL -tAc 'CREATE DATABASE \"root\"'\n$PSQL -tAc \"SELECT 1 FROM pg_database WHERE datname = 'hass'\" | grep -q 1 || $PSQL -tAc 'CREATE DATABASE \"hass\"'\n\n$PSQL -tAc \"SELECT 1 FROM pg_roles WHERE rolname='me'\" | grep -q 1 || $PSQL -tAc 'CREATE USER \"me\"'\n\n$PSQL -tAc 'ALTER ROLE \"me\" ' \n\n$PSQL -tAc 'ALTER DATABASE \"me\" OWNER TO \"me\";' \n$PSQL -tAc \"SELECT 1 FROM pg_roles WHERE rolname='root'\" | grep -q 1 || $PSQL -tAc 'CREATE USER \"root\"'\n\n$PSQL -tAc 'ALTER ROLE \"root\" ' \n\n$PSQL -tAc 'ALTER DATABASE \"root\" OWNER TO \"root\";' \n$PSQL -tAc \"SELECT 1 FROM pg_roles WHERE rolname='hass'\" | grep -q 1 || $PSQL -tAc 'CREATE USER \"hass\"'\n\n$PSQL -tAc 'ALTER ROLE \"hass\" ' \n\n$PSQL -tAc 'ALTER DATABASE \"hass\" OWNER TO \"hass\";' \n\n\n$PSQL -d \"hass\" -tAc 'GRANT ALL PRIVILEGES ON SCHEMA public TO \"hass\";'\n$PSQL -d \"hass\" -tAc 'GRANT ALL PRIVILEGES ON SCHEMA public TO \"me\";'\n$PSQL -d \"hass\" -tAc 'GRANT ALL PRIVILEGES ON SCHEMA public TO \"root\";'\n$PSQL -d \"me\" -tAc 'GRANT ALL PRIVILEGES ON SCHEMA public TO \"me\";'\n$PSQL -d \"me\" -tAc 'GRANT ALL PRIVILEGES ON SCHEMA public TO \"root\";'\n$PSQL -d \"root\" -tAc 'GRANT ALL PRIVILEGES ON SCHEMA public TO \"root\";'\n$PSQL -d \"root\" -tAc 'GRANT ALL PRIVILEGES ON SCHEMA public TO \"me\";'\n"

This way I can verify the configuration is what it's supposed to look like before actually deploying. You end up becoming very familiar with the structure of a nixos configuration and where everything is supposed to belong.

The same is true for nixpkgs lib and builtins. I use the repl to practice these functions to ensure I'm using them right in my config. I've found this website to be a great reference: https://teu5us.github.io/nix-lib.html
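
As an illustration of that kind of practice, here is a small expression that turns a service-to-host table into the router mapping discussed earlier in this thread. It is a sketch only: practice.nix is a made-up file name, the `services` table is invented for the example, and `<nixpkgs/lib>` assumes a nixpkgs channel is available on NIX_PATH.

```nix
# practice.nix (hypothetical file name). Evaluate with:
#   nix-instantiate --eval --strict practice.nix
# or paste the expression into nix repl.
let
  lib = import <nixpkgs/lib>;

  # service name -> host currently running it (names from this thread)
  services = { wiki = "bender"; jelly = "nas"; };
in
# mapAttrs' rewrites both the attribute name and its value
lib.mapAttrs'
  (name: host: lib.nameValuePair "${name}.zz" "https://${name}.${host}")
  services
# => { "jelly.zz" = "https://jelly.nas"; "wiki.zz" = "https://wiki.bender"; }
```

Trying a function like this in isolation makes it much easier to see what shape of value it returns before wiring it into a real module.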

I also have become more familiar with systemd given how much NixOS uses it. You can see what failed on the last deploy by running sudo systemctl --failed on that system. You may get something like:

sudo systemctl --failed
  UNIT                            LOAD   ACTIVE SUB    DESCRIPTION                                            
● btrbk-snapshots.service         loaded failed failed Takes BTRFS snapshots and maintains retention policies.
● flatpak-managed-install.service loaded failed failed flatpak-managed-install.service

I can use this to check out the logs or verify my config was built correctly. If I wanted the first one, the logs can be retrieved with sudo journalctl -u btrbk-snapshots. To get the config, I'll want to examine the service file using sudo systemctl cat btrbk-snapshots:

sudo systemctl cat btrbk-snapshots 
# /etc/systemd/system/btrbk-snapshots.service
[Unit]
Description=Takes BTRFS snapshots and maintains retention policies.
Documentation=man:btrbk(1)

[Service]
Environment="LOCALE_ARCHIVE=/nix/store/6xkb4i81jbchphvcdhjb6yy6xxa4sjqs-glibc-locales-2.38-27/lib/locale/locale-archive"
Environment="NODE_EXTRA_CA_CERTS=/nix/store/c4bn0zqwvmgwmqnms6fd97nivvgq2bcm-ca.crt"
Environment="PATH=/run/wrappers/bin:/nix/store/38dw8klhx6vl6di2riq3dhg9gwzd5wfw-lz4-1.9.4-bin/bin:/nix/store/d7bgwi3i9yiaf7ivsswcrzp28rz7pbwy-mbuffer-20230301/bin:/ni>
Environment="PIP_CERT=/nix/store/c4bn0zqwvmgwmqnms6fd97nivvgq2bcm-ca.crt"
Environment="REQUESTS_CA_BUNDLE=/nix/store/c4bn0zqwvmgwmqnms6fd97nivvgq2bcm-ca.crt"
Environment="TZDIR=/nix/store/4dbixfbbm2vl5jsl7xr7pbp71amf4x9r-tzdata-2023c/share/zoneinfo"
ExecStart=/nix/store/6f36zlr5ww14cdk1h0b2819h2q25gii6-btrbk-0.32.6/bin/btrbk -c /etc/btrbk/snapshots.conf run
Group=btrbk
IOSchedulingClass=best-effort
Nice=10
StateDirectory=btrbk
Type=oneshot
User=btrbk

I look for ExecStart and can see btrbk -c /etc/btrbk/snapshots.conf run. That conf file is generated by my nix config, so I can inspect it to verify and debug.

Also nix flake check is a good move to catch problems in every system of your configuration. Hope that helps.

suderman commented 8 months ago

Most module arguments like config, options, pkgs, lib, etc. are added by the nixpkgs module system itself. Any module included in another module's imports = [ ... ]; gets all these arguments for free.

However, when creating my own NixOS configuration, I can add additional arguments using specialArgs in the flake.nix as seen below. This is how I add inputs, outputs, and my own this argument, which I use to encapsulate my custom functions and settings for each config. The same is true for Home Manager configurations, but in this case it's called extraSpecialArgs:

    # Make a NixOS system configuration 
    mkConfiguration = this: inputs.nixpkgs.lib.nixosSystem rec {

      # Make nixpkgs for this system (with overlays)
      pkgs = mkPkgs this;
      system = pkgs.this.system;
      specialArgs = { inherit inputs outputs; this = pkgs.this; };

      # Include NixOS configurations, modules, secrets and caches
      modules = this.modules.root ++ (if (length this.users < 1) then [] else [

        # Include Home Manager module (if there are any users besides root)
        inputs.home-manager.nixosModules.home-manager { 
          home-manager = {

            # Inherit NixOS packages
            useGlobalPkgs = true;
            useUserPackages = true;
            extraSpecialArgs = { inherit inputs outputs; this = pkgs.this; };

            # Include Home Manager configuration, modules, secrets and caches
            users = mkAttrs this.users ( 
              user: ( ({ imports }: { inherit imports; }) { 
                imports = this.modules."${user}";
              } )
            ); 

          }; 
        } 

      ]);
    };
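
On the receiving end, a module simply names the arguments it wants in its function header. The sketch below is a made-up module (example-module.nix is not a file in the repo) and only assumes that `this` carries a hostName attribute, as used elsewhere in this thread.

```nix
# example-module.nix (hypothetical)
# config, lib and pkgs are supplied by the module system itself;
# `this` is only available because specialArgs passes it in.
{ config, lib, pkgs, this, ... }:
{
  # mkDefault keeps this from clashing with a hostName set elsewhere
  networking.hostName = lib.mkDefault this.hostName;
}
```

The trailing `...` in the header is what lets a module ignore every argument it does not list, so asking for `this` does not require listing inputs or outputs as well.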

buxel commented 8 months ago

Thanks for the writeup. The REPL helps a lot! The general concept makes a lot of sense, but then again, I stub my toes on random things when I ask myself: do I really understand this? For example, extraSpecialArgs = { inherit inputs outputs; this = pkgs.this; };.

My take:

How could i verify my guesswork? I tried reading nix pills but it was a little too slow for me and my hobby time is quite limited. I'd still like to broaden my knowledge, though (you are already helping here 😉)

buxel commented 8 months ago

Oh and on the topic of this "issue": OCIS does not like the custom domain, as it only works with a single domain set via: OCIS_URL = "https://${cfg.name}.${this.hostName}"; It gets confused when redirecting the login. I tried overriding it:

virtualisation.oci-containers.containers.ocis.environment.OCIS_URL = lib.mkForce "https://cloud.zz";

Which (according to the repl) should have done the trick - but it did not :/

suderman commented 8 months ago

Ah, yes, depending on the project, some self-hosted services are more picky about the hostname they're accessed by. In this case, ownCloud really only wants one route, so we'll have to adjust the original instead of adding an additional one. I've fixed my module to be a bit more flexible with the hostname. Now this should work:

  modules.ocis = {
    enable = true;
    name = "cloud.zz";
  };
  modules.traefik.extraInternalHostNames = [ "cloud.zz" ];

Now the Traefik labels will use cloud.zz when building the router. It's important to include the last line so that cloud.zz gets a certificate generated and gets added to the Blocky mapping.

suderman commented 8 months ago

extraSpecialArgs = { inherit inputs outputs; this = pkgs.this; };.

So much of nix boils down to creating attribute sets and then deep-merging these attribute sets into one.

As for inherit, the above is really just short-form for this:

extraSpecialArgs = { 
  inputs = inputs;
  outputs = outputs; 
  this = pkgs.this; 
};

Sometimes you'll see inherit used with parentheses, which is just another short form for assignment, but pulling the names from the set named inside the parentheses. For example, if I also wanted to add pkgs.stdenv to the attribute set, I could have written the above like so:

extraSpecialArgs = { 
  inherit inputs outputs; 
  inherit (pkgs) this stdenv;
};
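
The equivalence of the two spellings can be checked directly in the repl. The values below are placeholders invented for the example; only the shape of the attribute sets matters.

```nix
let
  inputs = "inputs-placeholder";
  outputs = "outputs-placeholder";
  pkgs = { this = "this-placeholder"; stdenv = "stdenv-placeholder"; };

  # Short form: inherit from the surrounding scope and from pkgs
  short = {
    inherit inputs outputs;
    inherit (pkgs) this stdenv;
  };

  # Long form: every assignment written out
  long = {
    inputs = inputs;
    outputs = outputs;
    this = pkgs.this;
    stdenv = pkgs.stdenv;
  };
in
short == long # => true
```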

The only way I know to verify the value of these arguments is to actually assign them to a dummy option in my config and browse it in the repl. Maybe there's a better way?
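
A minimal sketch of such a dummy option is shown below. The module file and the option name debugArgs are made up for illustration, and it assumes inputs, outputs and this are passed in through specialArgs as in the flake.nix excerpt earlier in the thread.

```nix
# debug-args.nix (hypothetical module)
{ lib, inputs, outputs, this, ... }:
{
  # types.raw does no checking or merging, so arbitrary values
  # can be stored without the module system recursing into them
  options.debugArgs = lib.mkOption {
    type = lib.types.raw;
    default = { };
    description = "Module arguments exposed for inspection in the repl.";
  };

  config.debugArgs = { inherit inputs outputs this; };
}
```

With this module included in a system named hub, typing `hub.config.debugArgs.this.` in the repl and pressing tab should list whatever `this` contains.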