NixOS / nixops

NixOps is a tool for deploying to NixOS machines in a network or cloud.
https://nixos.org/nixops
GNU Lesser General Public License v3.0

sqlite3 "attempt to write a readonly database" with legacy and memory database backend #1490

Closed Scrumplex closed 2 years ago

Scrumplex commented 2 years ago

I am currently trying to get into nixops for my NixOS installations. My host machine is Arch Linux, but I was able to reproduce this issue in the official Docker image too, so it shouldn't matter.

Tested versions

nixpkgs-unstable  nixopsUnstable (on container and Arch Linux)
nixos-21.11       nixopsUnstable (on NixOS 21.11)

Running any nixops operation that touches the state database produces this error:

Traceback (most recent call last):
  File "/nix/store/d4havjdz1b5i3mq7bz65jg8d151d4nzh-python3.9-nixops-2.0.0/bin/.nixops-wrapped", line 9, in <module>
    sys.exit(main())
  File "/nix/store/qy61wg1sk8vwvylkc60vq7rkv3n4644q-python3-3.9.6-env/lib/python3.9/site-packages/nixops/__main__.py", line 56, in main
    args.op(args)
  File "/nix/store/qy61wg1sk8vwvylkc60vq7rkv3n4644q-python3-3.9.6-env/lib/python3.9/site-packages/nixops/script_defs.py", line 188, in op_list_deployments
    with network_state(args, False, "nixops list") as sf:
  File "/nix/store/5bh6rpya1ar6l49vrhx1rg58dsa42906-python3-3.9.6/lib/python3.9/contextlib.py", line 117, in __enter__
    return next(self.gen)
  File "/nix/store/qy61wg1sk8vwvylkc60vq7rkv3n4644q-python3-3.9.6-env/lib/python3.9/site-packages/nixops/script_defs.py", line 131, in network_state
    state = nixops.statefile.StateFile(statefile, writable, lock=lock)
  File "/nix/store/qy61wg1sk8vwvylkc60vq7rkv3n4644q-python3-3.9.6-env/lib/python3.9/site-packages/nixops/statefile.py", line 121, in __init__
    db.execute("pragma journal_mode = wal")
sqlite3.OperationalError: attempt to write a readonly database

I can reproduce this in a container (docker.io/nixos/nix:latest).

Dockerfile

```
FROM nixos/nix:latest
RUN nix-channel --update
RUN nix-env -iA nixpkgs.nixopsUnstable
WORKDIR /test
COPY nixops.nix /test/
CMD nixops list
```

nixops.nix

```
{
  network.description = "nixos";
  network.enableRollback = true;
  network.storage.legacy.databasefile = "./deployment.nixops";
  spacehub = {
    deployment.targetHost = "10.255.255.1";
  };
}
```

Both the container and my host machine have sandboxing disabled. Might this be related?

EDIT: I installed nixopsUnstable on one of my NixOS machines and it has the exact same issue
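The first traceback ends in a statement being rejected on a database that SQLite considers read-only. A minimal sketch, using only Python's standard `sqlite3` module, can show which statements a read-only connection accepts. The table name `Deployments` and the helper names are hypothetical and are not taken from the nixops source; the `mode=ro` URI is just one way a connection can end up read-only.

```python
import os
import sqlite3
import tempfile


def make_state_file() -> str:
    """Create a throwaway database with one hypothetical table."""
    path = os.path.join(tempfile.mkdtemp(), "state.sqlite")
    conn = sqlite3.connect(path)
    conn.execute("create table Deployments (uuid text primary key)")
    conn.commit()
    conn.close()
    return path


def try_statement(db_path: str, sql: str) -> str:
    """Run `sql` on a read-only connection and report what happened."""
    # mode=ro opens the file read-only; uri=True enables the file: syntax.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.OperationalError as exc:
        return f"error: {exc}"
    finally:
        conn.close()


if __name__ == "__main__":
    state = make_state_file()
    for sql in (
        "select count(*) from Deployments",
        "insert into Deployments values ('abc')",
        "pragma journal_mode = wal",  # the statement from the traceback
    ):
        print(sql, "->", try_statement(state, sql))
```

Reads succeed on such a connection, while anything that has to modify the file is refused, which is consistent with the `pragma journal_mode = wal` call failing in the traceback above.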

wykurz commented 2 years ago

I'm also seeing this with current master (0c989d79), also on nixos 21.11.

roberth commented 2 years ago

@wykurz Is it the same stack trace? I would expect the original stack trace in the issue to be solved, but perhaps there's still a command that needs write access but doesn't request it. Which command did you run?

Tchekda commented 2 years ago

I even tried deleting and then re-creating the deployment, but I still get the same error. When I run nixops info -d NAME I get:

Traceback (most recent call last):
  File "/nix/store/d4havjdz1b5i3mq7bz65jg8d151d4nzh-python3.9-nixops-2.0.0/bin/.nixops-wrapped", line 9, in <module>
    sys.exit(main())
  File "/nix/store/qy61wg1sk8vwvylkc60vq7rkv3n4644q-python3-3.9.6-env/lib/python3.9/site-packages/nixops/__main__.py", line 56, in main
    args.op(args)
  File "/nix/store/qy61wg1sk8vwvylkc60vq7rkv3n4644q-python3-3.9.6-env/lib/python3.9/site-packages/nixops/script_defs.py", line 427, in op_info
    do_eval(depl)
  File "/nix/store/qy61wg1sk8vwvylkc60vq7rkv3n4644q-python3-3.9.6-env/lib/python3.9/site-packages/nixops/script_defs.py", line 335, in do_eval
    depl.evaluate()
  File "/nix/store/qy61wg1sk8vwvylkc60vq7rkv3n4644q-python3-3.9.6-env/lib/python3.9/site-packages/nixops/deployment.py", line 431, in evaluate
    self.evaluate_network()
  File "/nix/store/qy61wg1sk8vwvylkc60vq7rkv3n4644q-python3-3.9.6-env/lib/python3.9/site-packages/nixops/deployment.py", line 423, in evaluate_network
    self.description = config.get("description", self.default_description)
  File "/nix/store/qy61wg1sk8vwvylkc60vq7rkv3n4644q-python3-3.9.6-env/lib/python3.9/site-packages/nixops/util.py", line 489, in set
    self._del_attr(name)
  File "/nix/store/qy61wg1sk8vwvylkc60vq7rkv3n4644q-python3-3.9.6-env/lib/python3.9/site-packages/nixops/deployment.py", line 255, in _del_attr
    self._db.execute(
sqlite3.OperationalError: attempt to write a readonly database

roberth commented 2 years ago

  File "/nix/store/qy61wg1sk8vwvylkc60vq7rkv3n4644q-python3-3.9.6-env/lib/python3.9/site-packages/nixops/deployment.py", line 423, in evaluate_network
    self.description = config.get("description", self.default_description)

Here it's trying to write a description to the database, which is not something it can do unless it locks the remote state for writing, which doesn't seem appropriate for commands like nixops info. It seems that the Deployment object should take a parameter so it can behave read-only when necessary.
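The idea of a deployment object that can behave read-only might look roughly like the sketch below. Everything in it is hypothetical: the class `DeploymentSketch`, the `writable` parameter, and the `DeploymentAttrs` table are illustrative names, not the nixops API. In read-only mode, evaluated values are kept in memory and never reach the database.

```python
import sqlite3
from typing import Dict, Optional


def make_state_db() -> sqlite3.Connection:
    """Build an in-memory database with a hypothetical attribute table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "create table DeploymentAttrs ("
        "deployment text, name text, value text, "
        "primary key (deployment, name))"
    )
    db.execute("insert into DeploymentAttrs values ('d1', 'description', 'prod')")
    db.commit()
    return db


class DeploymentSketch:
    """Hypothetical stand-in for a deployment; not the nixops class."""

    def __init__(self, db: sqlite3.Connection, uuid: str, writable: bool) -> None:
        self._db = db
        self._uuid = uuid
        self._writable = writable
        # Values produced by evaluation, held in memory only.
        self._evaluated: Dict[str, Optional[str]] = {}

    def stored(self, name: str) -> Optional[str]:
        """Return the value currently persisted in the database, if any."""
        row = self._db.execute(
            "select value from DeploymentAttrs where deployment = ? and name = ?",
            (self._uuid, name),
        ).fetchone()
        return row[0] if row else None

    def get(self, name: str) -> Optional[str]:
        """Prefer the freshly evaluated value, fall back to the stored one."""
        if name in self._evaluated:
            return self._evaluated[name]
        return self.stored(name)

    def set(self, name: str, value: Optional[str]) -> None:
        """Remember the value; persist it only when opened writable."""
        self._evaluated[name] = value
        if not self._writable:
            return  # read-only: the database is left untouched
        with self._db:  # commits on success, rolls back on error
            if value is None:
                self._db.execute(
                    "delete from DeploymentAttrs where deployment = ? and name = ?",
                    (self._uuid, name),
                )
            else:
                self._db.execute(
                    "insert or replace into DeploymentAttrs values (?, ?, ?)",
                    (self._uuid, name, value),
                )
```

With this shape, a command such as nixops info could construct the object with `writable=False` and still display evaluated values, while deploy-style commands would pass `writable=True` after taking the lock.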

Tchekda commented 2 years ago

Same error when I run nixops list. Another problem (I don't know whether it's linked to this): when I change my configuration and try to deploy, nothing happens:

building all machine configurations...
warning: you did not specify '--add-root'; the result might be removed by the garbage collector
HOSTNAME> closures copied successfully
HOSTNAME> deployment finished successfully

Should I open a new bug?

roberth commented 2 years ago

> Should I open a new bug?

I'm not aware of a similar issue, so yes please. Please share at least the deployment.* values.

Tchekda commented 2 years ago

> I'm not aware of a similar issue, so yes please. Please share at least the deployment.* values.

Here is the config I'm trying to deploy (I know it's not perfect Nix, but it works): https://github.com/Tchekda/nixos-configuration/tree/master/kbennett It is no different from the previous version:

{
  # Describe your "deployment"
  network.description = "LGP VM";

  # A single machine description.
  lgpserver = {
    deployment = {
      targetEnv = "none";
      targetHost = "IPV6_HOST";
    };

    imports = [ ./configuration.nix ];
  };
}

And my nixops.nix in the current directory:

{ ... }:
{
  network.storage.legacy = {
    databasefile = "~/.nixops/deployments.nixops";
  };
}

Tchekda commented 2 years ago

Temporary fix: reverting to 35ac0208 fixes this issue. It is the commit just before @roberth's PR #1470, which introduced the issue.

wykurz commented 2 years ago

> @wykurz Is it the same stack trace? I would expect the original stack trace in the issue to be solved, but perhaps there's still a command that needs write access but doesn't request it. Which command did you run?

Yes, exactly the same. But I think I misunderstood how nix-shell -p ... works - that's what I used:

nix-shell -p nixopsUnstable

I guess that does not include master but rather uses a version from nixpkgs... Sorry for the confusion!

To actually test with master I will need to use an AWS plugin, I'm not yet sure how to do that. Or if you could suggest when the nixopsUnstable would be updated in nixpkgs - I'd be happy to give that a go as well.

Pascal-Vtx commented 2 years ago

Same issue here, with master ( 0c989d7 ) or nixopsUnstable on nixos 21.11 ( 7ebdd8a ).

% nixops info --all                          
Traceback (most recent call last):
  File "/nix/store/rj6wah3pgf0467iifyiac5nkns6mbppm-python3.9-nixops-2.0.0/bin/.nixops-wrapped", line 9, in <module>
    sys.exit(main())
  File "/nix/store/997f79angsak9h8sk6dj8kqvdzkzzzay-python3-3.9.6-env/lib/python3.9/site-packages/nixops/__main__.py", line 56, in main
    args.op(args)
  File "/nix/store/997f79angsak9h8sk6dj8kqvdzkzzzay-python3-3.9.6-env/lib/python3.9/site-packages/nixops/script_defs.py", line 420, in op_info
    do_eval(depl)
  File "/nix/store/997f79angsak9h8sk6dj8kqvdzkzzzay-python3-3.9.6-env/lib/python3.9/site-packages/nixops/script_defs.py", line 335, in do_eval
    depl.evaluate()
  File "/nix/store/997f79angsak9h8sk6dj8kqvdzkzzzay-python3-3.9.6-env/lib/python3.9/site-packages/nixops/deployment.py", line 431, in evaluate
    self.evaluate_network()
  File "/nix/store/997f79angsak9h8sk6dj8kqvdzkzzzay-python3-3.9.6-env/lib/python3.9/site-packages/nixops/deployment.py", line 423, in evaluate_network
    self.description = config.get("description", self.default_description)
  File "/nix/store/997f79angsak9h8sk6dj8kqvdzkzzzay-python3-3.9.6-env/lib/python3.9/site-packages/nixops/util.py", line 489, in set
    self._del_attr(name)
  File "/nix/store/997f79angsak9h8sk6dj8kqvdzkzzzay-python3-3.9.6-env/lib/python3.9/site-packages/nixops/deployment.py", line 255, in _del_attr
    self._db.execute(
sqlite3.OperationalError: attempt to write a readonly database

% nixops list                                
Traceback (most recent call last):
  File "/nix/store/rj6wah3pgf0467iifyiac5nkns6mbppm-python3.9-nixops-2.0.0/bin/.nixops-wrapped", line 9, in <module>
    sys.exit(main())
  File "/nix/store/997f79angsak9h8sk6dj8kqvdzkzzzay-python3-3.9.6-env/lib/python3.9/site-packages/nixops/__main__.py", line 56, in main
    args.op(args)
  File "/nix/store/997f79angsak9h8sk6dj8kqvdzkzzzay-python3-3.9.6-env/lib/python3.9/site-packages/nixops/script_defs.py", line 200, in op_list_deployments
    depl.evaluate()
  File "/nix/store/997f79angsak9h8sk6dj8kqvdzkzzzay-python3-3.9.6-env/lib/python3.9/site-packages/nixops/deployment.py", line 431, in evaluate
    self.evaluate_network()
  File "/nix/store/997f79angsak9h8sk6dj8kqvdzkzzzay-python3-3.9.6-env/lib/python3.9/site-packages/nixops/deployment.py", line 423, in evaluate_network
    self.description = config.get("description", self.default_description)
  File "/nix/store/997f79angsak9h8sk6dj8kqvdzkzzzay-python3-3.9.6-env/lib/python3.9/site-packages/nixops/util.py", line 489, in set
    self._del_attr(name)
  File "/nix/store/997f79angsak9h8sk6dj8kqvdzkzzzay-python3-3.9.6-env/lib/python3.9/site-packages/nixops/deployment.py", line 255, in _del_attr
    self._db.execute(
sqlite3.OperationalError: attempt to write a readonly database

I have some deployments in the database, but the nixops.nix in the current directory contains only:

{
  network = {
    storage.legacy = { };
  };
}

Pascal-Vtx commented 2 years ago

My understanding is that when running nixops list or nixops info --all, attributes from the evaluated config are written to the state database. In my case, since the nixops.nix is empty, it tries to remove the description attribute from all the deployments in the state database. Judging by the evaluate_network function, the same should happen for the enableRollback attribute.

That reveals another problem. If these writes are allowed (by reverting to 35ac020 as suggested by @Tchekda), then the description and enableRollback attributes of all deployments will be overwritten with those in the current nixops.nix.
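The overwrite described here can be modelled with plain dictionaries. This is only a sketch of the behaviour as described in this comment, not the nixops code: the function name, the default values, and the dictionary layout are assumptions. It also assumes, as the tracebacks suggest, that assigning a value equal to the default deletes the stored attribute.

```python
from typing import Any, Dict

# Hypothetical defaults; the real ones live in nixops and may differ.
DEFAULTS: Dict[str, Any] = {
    "description": "(default description)",
    "enableRollback": False,
}


def evaluate_into_state(
    state: Dict[str, Dict[str, Any]], config: Dict[str, Any]
) -> None:
    """Apply the evaluated network config to every deployment in `state`."""
    for attrs in state.values():
        for name, default in DEFAULTS.items():
            value = config.get(name, default)
            if value == default:
                attrs.pop(name, None)  # plays the role of _del_attr
            else:
                attrs[name] = value  # plays the role of a stored write


if __name__ == "__main__":
    state = {
        "prod": {"description": "production", "enableRollback": True},
        "test": {"description": "staging"},
    }
    evaluate_into_state(state, {})  # an empty nixops.nix network config
    print(state)
```

In this model, a single listing command run against an empty config strips the stored description and enableRollback from every deployment, which is the data-loss risk of simply re-enabling the writes.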

wykurz commented 2 years ago

@roberth is there at least a workaround for this? As I understand it - this makes using nixops on nixos 21.11 impossible?

Tchekda commented 2 years ago

> @roberth is there at least a workaround for this? As I understand it - this makes using nixops on nixos 21.11 impossible?

My temporary solution : https://github.com/NixOS/nixops/issues/1490#issuecomment-987831889

wykurz commented 2 years ago

Ugh, turns out I don't know how to use the nixops from source as I need to have the aws plugin - how do I add it?

The documentation linked in the README is for nixops 1.8 which didn't have the plugins. Is this described somewhere else?

jakubgs commented 2 years ago

I guess I picked a bad day to try to use NixOps.

Tchekda commented 2 years ago

> Ugh, turns out I don't know how to use the nixops from source as I need to have the aws plugin - how do I add it?
>
> The documentation linked in the README is for nixops 1.8 which didn't have the plugins. Is this described somewhere else?

@wykurz have a look at the nixpkgs build process; at the bottom of the file you can see all the plugins being added: Nix build file

wykurz commented 2 years ago

Cool, I was able to modify nixpkgs to grab the earlier nixops like this:

diff --git a/pkgs/applications/networking/cluster/nixops/poetry-git-overlay.nix b/pkgs/applications/networking/cluster/nixops/poetry-git-overlay.nix
index 5a121cbd3ec..6766417d53f 100644
--- a/pkgs/applications/networking/cluster/nixops/poetry-git-overlay.nix
+++ b/pkgs/applications/networking/cluster/nixops/poetry-git-overlay.nix
@@ -5,8 +5,8 @@ self: super: {
     _: {
       src = pkgs.fetchgit {
         url = "https://github.com/NixOS/nixops.git";
-        rev = "0c989d79c9052ebf52f12964131f4fc31ac20a18";
-        sha256 = "07jz9grq3hjn1g9xybln5phbjhn2zsldcnan3lal6syzjggja6v1";
+        rev = "35ac02085169bc2372834d6be6cf4c1bdf820d09";
+        sha256 = "1jh0jrxyywjqhac2dvpj7r7isjv68ynbg7g6f6rj55raxcqc7r3j";
       };
     }
   );

And then install it: nix-env -f ~/projects/nixpkgs -iA nixopsUnstable

c0decafe commented 2 years ago

If you need the above as a Nix expression for nix shell, a flake, etc.:

  packages = [
    (pkgs.nixopsUnstable.override {
      ## https://github.com/NixOS/nixops/issues/1490
      overrides = (self: super: {
        nixops = super.nixops.overridePythonAttrs (
          _: {
            src = pkgs.fetchgit {
              url = "https://github.com/nixos/nixops";
              rev = "35ac02085169bc2372834d6be6cf4c1bdf820d09";
              sha256 = "1jh0jrxyywjqhac2dvpj7r7isjv68ynbg7g6f6rj55raxcqc7r3j";
            };
          }
        );
      });
    })
  ];

lostnet commented 2 years ago

Looking at the code, it seems like everything treats the sqlite database as write-through, which would make fixing every command that shouldn't change state kind of painful. An alternative approach might be to let them work on a cloned in-memory database: https://www.sqlite.org/c3ref/deserialize.html

It looks like support for it in the sqlite3 python module is nearing completion: https://github.com/python/cpython/pull/26728

roberth commented 2 years ago

@lostnet Maybe I've been too hopeful about the state of NixOps' state management.

> nearing completion

Our db is tiny, so it wouldn't really hurt to copy the file and use that instead.
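A scratch copy along these lines could be sketched with `sqlite3.Connection.backup` (in the standard library since Python 3.7), which clones a database into another connection without needing the deserialize support mentioned above. The function and table names are hypothetical, and this is not how nixops opens its state file; it only illustrates that writes to the copy never reach the original.

```python
import os
import sqlite3
import tempfile


def open_scratch_copy(db_path: str) -> sqlite3.Connection:
    """Return an in-memory clone of the database at `db_path`."""
    # Open the source read-only so the original cannot be modified here.
    source = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        scratch = sqlite3.connect(":memory:")
        source.backup(scratch)  # copies every page into the memory database
    finally:
        source.close()
    return scratch


def make_demo_db() -> str:
    """Create a small on-disk database with one row for the demo."""
    path = os.path.join(tempfile.mkdtemp(), "state.sqlite")
    conn = sqlite3.connect(path)
    conn.execute("create table Deployments (uuid text primary key)")
    conn.execute("insert into Deployments values ('d1')")
    conn.commit()
    conn.close()
    return path


if __name__ == "__main__":
    path = make_demo_db()
    scratch = open_scratch_copy(path)
    scratch.execute("delete from Deployments")  # affects only the clone
    scratch.commit()
    original = sqlite3.connect(path)
    print(original.execute("select count(*) from Deployments").fetchone()[0])
    original.close()
```

Code that treats the database as write-through could then run unchanged against the clone, with its writes discarded when the connection is closed.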