atuinsh / atuin

✨ Magical shell history
https://atuin.sh

Pool timed out while waiting for an open connection, ZFS #952

Closed happenslol closed 1 year ago

happenslol commented 1 year ago

Edit:

This has become the canonical issue for Atuin/ZFS problems.

If you're using ZFS with Atuin, you have likely noticed an error such as the following:

Error: pool timed out while waiting for an open connection

Location:
    /home/runner/work/atuin/atuin/crates/atuin-client/src/record/sqlite_store.rs:48:20

This is due to an issue with ZFS and SQLite. See: https://github.com/openzfs/zfs/issues/14290

There are two workarounds:

  1. Use the Atuin daemon

This has not yet been released as stable, but it is mostly without issue. The daemon takes all SQLite writes off the hot path, thereby avoiding the problem.

Follow the steps here: https://github.com/atuinsh/atuin/issues/952#issuecomment-2121671620

  2. Create an ext4 zvol for Atuin

Follow the steps here: https://github.com/atuinsh/atuin/issues/952#issuecomment-1902164562


I've just begun using atuin, and I absolutely love it so far. However, there's been a recurring issue for me, which I've found hard to diagnose:

My prompt regularly blocks for between 500ms and 5s whenever I run a command. I've narrowed this down to the _atuin_preexec function by manually importing the shell hook generated from atuin init zsh and annotating it with logging and time calls. Here's a sample time call from a run where it hung:

Running pre-exec for cd ~

0.00user 0.00system 0:04.93elapsed 0%CPU (0avgtext+0avgdata 8192maxresident)k
52036inputs+1064outputs (15major+512minor)pagefaults 0swaps

Pre-exec done for cd ~

Here's how I modified the hook to get the result:

_atuin_preexec() {
    log "Running pre-exec for $1\n" >> /tmp/atuin.log
    local id
    id=$(/usr/bin/time -a -o /tmp/atuin.log atuin history start -- "$1")
    export ATUIN_HISTORY_ID="$id"
    echo "\nPre-exec done for $1" >> /tmp/atuin.log
}

I've tried to replicate the behavior on the CLI, outside of the hook, using hyperfine, and was successful:

» hyperfine -r 1000 "atuin search --limit 5"
Benchmark 1: atuin search --limit 5
  Time (mean ± σ):      18.3 ms ± 114.8 ms    [User: 4.9 ms, System: 8.2 ms]
  Range (min … max):    12.5 ms … 2587.9 ms    1000 runs

This does not happen on every benchmark, even with 1000 runs. My initial thought was that this has to be contention on the database file, but I saw that you're already using WAL, so concurrent writes/reads should not be a problem. I can also trigger the delay by repeatedly opening the search widget, which should not even be doing writes to the database, which confuses me even more.

Do you have any idea on how I could gather further data on this?

conradludgate commented 1 year ago

Do you use the sync service? It's possible it's getting stuck and timing out (we run it occasionally in the pre-exec). Try ATUIN_LOG=debug atuin sync and see if there's anything interesting

happenslol commented 1 year ago

Yeah, I'm syncing with my own server. What exactly am I looking for in the debug output? It seems to be running fine for me, does a whole lot of queries and seems to consistently complete in around 2.5 seconds.

If the sync process is ever run in history start or history end, shouldn't that happen in a background process? Feels weird to have a process that can potentially block running in preexec. Maybe I'm just not understanding correctly, though.

ellie commented 1 year ago

It absolutely should not be blocking in start, and only ever runs a sync in "end" while forked to the background

What sort of disk are you using for your XDG_DATA_HOME/home directory?

The only thing that can block pre exec is writing a small amount to disk, which shouldn't be an issue unless your disk is slow or high latency

happenslol commented 1 year ago

That's weird then, the delay only ever happens in preexec. I'm on an NVME m.2 disk (Samsung 980 Pro), so that really shouldn't be the bottleneck. I'm on ZFS, that's the only thing that's sort of nonstandard. I don't think that should have any impact though.

Additionally, the delay can also occur in something like atuin search, that should not write anything at all afaik?

ellie commented 1 year ago

Ahhh I see, ZFS.

We've had a few people have issues there

https://github.com/NixOS/nixpkgs/issues/169457 is relevant, I believe

happenslol commented 1 year ago

Interesting, I would not have thought that to even be an option; I haven't had that kind of issue yet. I'll dig more into this, thanks for the reference.

Is there anything we can do here, still? Otherwise, feel free to close this issue.

happenslol commented 1 year ago

Just tried out the kernel patches suggested in the issue you linked, and they have not made any difference. The delays still occur in pretty much exactly the same way, both for the preexec command, and the outliers happening in my benchmark are also still there. Not sure if it's ZFS or something else.

Could I enable more logging to find out where exactly the hangs occur? I'm not sure how much debug logs you provide, and I suppose I would have to find some way to redirect the logs somewhere.

amarshall commented 1 year ago

I’m seeing this as well. I did an strace on the shell and there’s an ftruncate call on history.db-shm that sometimes takes several seconds, and so seems to be the culprit.

I am using ZFS as well, and moving the history db to a tmpfs resolves it. Switching to a kernel with PREEMPT=y did not solve it. There is a ZFS bug report already about SQLite and ftruncate (https://github.com/openzfs/zfs/issues/14290).

I think it’s reasonable to conclude that this isn’t Atuin’s doing.
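
For anyone who wants to reproduce that observation, a rough sketch (attaching to the shell's PID and this particular syscall filter are just one way to do it; strace must be installed):

# Run from a second terminal, attaching to the interactive shell's PID.
# -f follows the atuin child processes, -y prints file paths for fds,
# -T prints time spent in each syscall; a multi-second ftruncate on
# history.db-shm is the tell-tale sign.
strace -f -y -T -e trace=ftruncate,fsync,fdatasync -p <shell-pid> 2>&1 | grep history.db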

happenslol commented 1 year ago

I see, good to know where it comes from. I'll close this then. Thanks to all you guys for your quick responses!

dbaynard commented 1 year ago

I'm on an NVME m.2 disk (Samsung 980 Pro), so that really shouldn't be the bottleneck. I'm on ZFS, that's the only thing that's sort of nonstandard.

Thanks for taking the time to debug this — I’ve made similarly questionable life choices, and I feared it was a (the) firmware issue.

Did you resolve the issue, @happenslol ?

happenslol commented 1 year ago

Hey @dbaynard, I resolved the issue by creating a tmpfs on which the database is saved, so I'm just bypassing ZFS by saving it in RAM. Then, I have a litestream service running in the background, which syncs the database back to the filesystem continuously and restores it on first startup. Probably a little overkill, but it was really easy to set up. Hit me up if you want to see how I did it.

However, recently, the issue came back and I haven't had the time to debug it yet. No idea what's causing it, since my database is still in RAM - either it's not actually writing to that database, or litestream is causing some delays, which I don't think is the case... I can post another response here when I've had the time to properly debug it, though.

dbaynard commented 1 year ago

That makes sense. I'll check strace, then, at some point, and confirm it's the same issue as above (truncate on history table file). If that is the case, I might see about using systemd to call truncate on a timer, and move that call off the hot path — I don't know how atuin works to know whether that is appropriate, though.

happenslol commented 1 year ago

I actually just got around to it - the problem is that another database was added (records.db), which cannot be relocated as of this moment, so the problem exists again. I solved it by transferring the entire folder to/from a tmpfs on startup and shutdown using systemd and rsync, and I'm basically accepting the risk of losing some data on a sudden shutdown.
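
A rough sketch of that copy-in/copy-out approach, with an assumed ~/.local/share/atuin-persist directory standing in for the on-disk copy (in practice the two halves are wired into systemd start/stop units):

# On startup: mount a tmpfs over the atuin data dir, then restore the last
# persisted copy into it (atuin-persist is seeded from a one-off copy of the
# original directory).
sudo mount -t tmpfs -o size=64m,uid=$(id -u),gid=$(id -g) tmpfs ~/.local/share/atuin
rsync -a ~/.local/share/atuin-persist/ ~/.local/share/atuin/

# On shutdown (and ideally on a timer too): copy the in-memory state back out.
rsync -a ~/.local/share/atuin/ ~/.local/share/atuin-persist/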

I'm 99% sure that if you're on ZFS, you're hitting the same problem - maybe you can describe your issues some more so we can be sure?

mattico commented 1 year ago

I resolved the issue by creating a tmpfs on which the database is saved, so I'm just bypassing ZFS by saving it in RAM. Then, I have a litestream service running in the background, which syncs the database back to the filesystem continuously and restores it on first startup. Probably a little overkill, but it was really easy to set up. Hit me up if you want to see how I did it.

That's pretty clever! I might try that.

The solution I came up with was to create an empty file in /tmp and bind mount that file over just the .db-shm file. This works, but is tricky to initialize automatically, because both the source and destination need to exist for the bind mount to work, and you need root to mount stuff. I wrote a systemd service which works, but only manually, not on startup; I think that's because the ZFS target doesn't wait for all datasets to be mounted, so the file created in home gets covered up by subsequent dataset mounts. Or something like that. Also, a lot of utilities don't understand that you can bind mount individual files, and the error messages are not helpful.
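
A minimal sketch of that bind mount, assuming the default database location (both files must exist before mounting):

# Put an empty file in /tmp (tmpfs on many systems) and bind it over just the
# SQLite shared-memory file that the slow ftruncate hits.
touch /tmp/atuin-history-db-shm
touch ~/.local/share/atuin/history.db-shm
sudo mount --bind /tmp/atuin-history-db-shm ~/.local/share/atuin/history.db-shm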

Nemo157 commented 10 months ago

This somehow seems to be related to zfs_txg_timeout (https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/ZIO%20Scheduler.html): when changing that to 20s, I observed the delays become random between 0 and 20s. (But watching various zpool iostats I never saw any queued transaction groups.)
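
For reference, that module parameter can be read and changed at runtime (the value is in seconds; this is only useful for experimenting, not as a fix):

cat /sys/module/zfs/parameters/zfs_txg_timeout          # 5 by default
echo 20 | sudo tee /sys/module/zfs/parameters/zfs_txg_timeout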

arcuru commented 10 months ago

I've been experimenting with fixing this by just forking the DB write into a background process, but it causes other issues that I haven't resolved yet. -> https://github.com/arcuru/atuin/tree/fork-dbwrite

arcuru commented 10 months ago

Oh yeah, I should also point out that since this is caused by ZFS sync behaviour, the easy solution is to just set sync=disabled on the dataset that the local atuin DB resides on. It's not something Atuin can do itself, but if you're willing to fix it on your end then it should be fairly easy.

I'm not sure exactly what's going on (since my assumption was that ZFS would prioritize the synchronous write, but maybe that's wrong) but disabling sync does seem to resolve the problem for me. Here's an article if you want to get a better understanding of what that actually is: https://jrs-s.net/2019/05/02/zfs-sync-async-zil-slog/

Because it does cost some safety, you'll probably want a separate dataset just for the atuin DB to limit exposure in case of a failure.
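
A sketch of that setup, using a hypothetical rpool/atuin dataset name (adjust the pool/dataset and paths to your layout):

# Dedicated dataset just for the atuin data dir, with synchronous writes disabled.
mv ~/.local/share/atuin ~/.local/share/atuin-backup
sudo zfs create -o mountpoint=$HOME/.local/share/atuin -o sync=disabled rpool/atuin
sudo chown $USER: ~/.local/share/atuin
cp -r ~/.local/share/atuin-backup/. ~/.local/share/atuin/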

Mic92 commented 9 months ago

No kink-shaming, but this is what I am testing now: https://github.com/Mic92/dotfiles/blob/a7004dd28161bf9bda37d0e7b11a114e90e69155/home-manager/pkgs/atuin/0001-make-atuin-on-zfs-fast-again.patch

mattico commented 9 months ago

@Mic92 I tried that a while back

https://github.com/mattico/atuin/commit/1c6514aac11963a5b1a9b0a1bc7fa5f3c397f18a

It seemed to be working for a while and then I started experiencing the issue again. I may have been making some other mistake like not running the correct (patched) atuin binary due to changing .bashrc or something, but I concluded that it wasn't sufficient.

Let us know if it works!

Mic92 commented 9 months ago

After two days of testing, I no longer see hangs with https://github.com/Mic92/dotfiles/blob/a7004dd28161bf9bda37d0e7b11a114e90e69155/home-manager/pkgs/atuin/0001-make-atuin-on-zfs-fast-again.patch

arcuru commented 9 months ago

Ah that's the setting I wasn't able to find. That should be doing the exact same thing as sync=disabled, just on the atuin side.

@ellie @conradludgate - would you be willing to take this change after some more testing? As an explanation from the sqlite docs

With synchronous OFF (0), SQLite continues without syncing as soon as it has handed data off to the operating system. If the application running SQLite crashes, the data will be safe, but the database might become corrupted if the operating system crashes or the computer loses power before that data has been written to the disk surface. On the other hand, commits can be orders of magnitude faster with synchronous OFF.

In my view there is limited risk in the database becoming corrupt, because the client can always resync from the server.

conradludgate commented 9 months ago

I think that sounds ok

ellie commented 9 months ago

Ah that's the setting I wasn't able to find. That should be doing the exact same thing as sync=disabled, just on the atuin side.

@ellie @conradludgate - would you be willing to take this change after some more testing? As an explanation from the sqlite docs

With synchronous OFF (0), SQLite continues without syncing as soon as it has handed data off to the operating system. If the application running SQLite crashes, the data will be safe, but the database might become corrupted if the operating system crashes or the computer loses power before that data has been written to the disk surface. On the other hand, commits can be orders of magnitude faster with synchronous OFF.

In my view there is limited risk in the database becoming corrupt, because the client can always resync from the server.

I'd rather not enable this for everyone - reducing the reliability of Atuin even in edge cases isn't something I'd be that happy with. Particularly as ZFS users are in the minority, and honestly my experience with ZFS is that you do need to configure things for various applications and things don't always work out of the box. I feel extra strongly about this given that this is due to a ZFS bug.

There are two options as I see it; doing both would work too:

  1. We document this clearly, and state the workaround @arcuru listed before of sync=disabled
  2. We add a nosync_writes option somewhere in the config to enable this change. Make it opt-in.

But otherwise, I am quite against compromising on safety unless there is no other choice. With the usage numbers we have, this will almost certainly affect at least a few people who have not synced, and I'd like to avoid losing history wherever possible. Also bear in mind not all systems running Atuin have reliable power, internet, etc.

dbaynard commented 9 months ago

Reluctantly, I'd recommend (1): changing the zfs side.

As annoying as it is to have to set up a new dataset, it's pretty cheap, and the consequences of losing the last 5s of writes with sync=disabled (IIUC) seem negligible for this use case.

In addition to documentation, it would be great to detect when atuin is configured to store the database on a zfs system and warn the user, somehow, too — I suppose we all identified that atuin was the source of the problem, and found this issue, but there may be users who haven't made it this far (my first fear was an ssd issue).

I am quite against compromising on safety unless there is no other choice.

💯 I am a zfs user who deliberately hasn't set up sync and I'd quite like to be able to rely on the sqlite guarantees around corruption. I've experienced recent OS crashes on both linux and macos (though I don't use zfs for the primary partition on the latter).

On the other hand, I do have auto-snapshotting, so there's an argument for disabling sqlite sync and relying on the zfs snapshots for recovery.

I've been experimenting with fixing this by just forking the DB write into a background process, but it causes other issues that I haven't resolved yet. -> https://github.com/arcuru/atuin/tree/fork-dbwrite

This seems like the best approach, but the url 404s, and there's nontrivial engineering time, so it would be good to get the ergonomics of the workaround right.

happenslol commented 9 months ago

For what it's worth, I'm also against setting this as the default. However, what's keeping us from adding an option that lets users enable this and documenting it? That seems like the best of all worlds - no extra datasets, no potentially harmful defaults for the general userbase, no workarounds or patches required for the ZFS users.

ellie commented 9 months ago

For what it's worth, I'm also against setting this as the default. However, what's keeping us from adding an option that lets users enable this and documenting it? That seems like the best of all worlds - no extra datasets, no potentially harmful defaults for the general userbase, no workarounds or patches required for the ZFS users.

Yep, this is my preferred approach

dbaynard commented 9 months ago

I suggested avoiding (2) only to avoid having the unsafe code at all — but I'm not a contributor (yet!) so I don't know the culture, etc, around this sort of thing.

ellie commented 9 months ago

but I'm not a contributor (yet!)

Let us know if we can do anything to help!

so I don't know the culture, etc, around this sort of thing.

I'm not that keen on having this in the codebase, but I do agree that as a workaround it's ok for now. As soon as the ZFS bug is resolved I'd like to deprecate our workaround though.

We will have to make sure it's clear that this is a workaround, and should only be enabled if creating a separate dataset is undesirable for whatever reason.

dbaynard commented 9 months ago

Let us know if we can do anything to help!

Introduce a load of bugs…

The only help-wanted that jumped out has a comment saying it's waiting to be merged, but I'll keep an eye out. I suppose there's #98 if I can reproduce, though I can't make it a priority.

Mic92 commented 9 months ago

No kink-shaming, but this is what I am testing now: https://github.com/Mic92/dotfiles/blob/a7004dd28161bf9bda37d0e7b11a114e90e69155/home-manager/pkgs/atuin/0001-make-atuin-on-zfs-fast-again.patch

Actually, synchronous was not enough; the problem comes from the WAL journal, which is the only operation that seems to use ftruncate in sqlite. After switching to the in-memory journal option, responsiveness becomes better: https://github.com/Mic92/dotfiles/blob/main/home-manager/pkgs/atuin/0001-make-atuin-on-zfs-fast-again.patch I haven't tried re-enabling synchronous after that. I personally don't mind potentially corrupting my database this way, as I have autosnapshots every 15 min, but I can see how, as a project maintainer, you don't want to have to deal with issues coming from weird data corruption.
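
For anyone curious, the journal mode is persisted in the database file itself, so the current state can be checked with the sqlite3 CLI (a read-only query; the synchronous setting, by contrast, is per-connection and set by the client at runtime):

# Reports "wal" on a stock setup; the patch above uses an in-memory journal
# for its own connections instead.
sqlite3 ~/.local/share/atuin/history.db 'PRAGMA journal_mode;'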

branchmispredictor commented 7 months ago

I resolved the issue by creating a tmpfs on which the database is saved, so I'm just bypassing ZFS by saving it in RAM. Then, I have a litestream service running in the background, which syncs the database back to the filesystem continuously and restores it on first startup. Probably a little overkill, but it was really easy to set up. Hit me up if you want to see how I did it.

For those interested in this approach, here's what I did to replicate this. No guarantees it works for you or on your system. I installed litestream and screen, changed my atuin config to point to /tmp/${USER}-atuin-db/history.db, then modified my .bashrc from

[[ -f ~/.bash-preexec.sh ]] && source ~/.bash-preexec.sh
eval "$(atuin init bash)"

to

function fix_atuin_zfs() {
  # Can remove after https://github.com/openzfs/zfs/issues/14290 gets fixed
  # also need to set ~/.config/atuin/config.toml:db_path = /tmp/$USER-atuin-db/history.db (replace $USER with your username in config.toml)
  tmpfs_db_path="/tmp/$USER-atuin-db"
  tmpfs_db_file="$tmpfs_db_path/history.db"
  litestream_backup_path="$HOME/.local/share/atuin/history-db-litestream"
  if mkdir "$tmpfs_db_path" 2>/dev/null; then
    # Need to copy over the DB to tmp dir + run litestream
    if [ -d "$litestream_backup_path" ]; then
      # We've already been using litestream, use it as the source of truth for history.db
      litestream restore -o "$tmpfs_db_file" "file://$litestream_backup_path" > /dev/null 2>&1
    else
      # Migrate over the initial history.db from atuin to tmpfs
      cp ~/.local/share/atuin/history.db* "$tmpfs_db_path/"
    fi
    # Run litestream replication in the background
    screen -S litestream-atuin-$USER -dm litestream replicate "$tmpfs_db_file" "file://$litestream_backup_path"
  fi
}

fix_atuin_zfs
[[ -f ~/.bash-preexec.sh ]] && source ~/.bash-preexec.sh
eval "$(atuin init bash)"

norpol commented 7 months ago

If you want to stay on ZFS, a workaround is to create a small zvol and format it with ext4; that fixes the performance problems for me too.

sudo zfs create -V 500MB rpool/nixos/atuin
sudo zfs list -o name,encryption | grep atuin # ensure encryption for the zvol
sudo mkfs.ext4 /dev/zvol/rpool/nixos/atuin
mv ~/.local/share/atuin ~/.local/share/atuin-backup
mkdir ~/.local/share/atuin
sudo mount /dev/zvol/rpool/nixos/atuin ~/.local/share/atuin
sudo chown -R ${USER}: ~/.local/share/atuin
cp -rv ~/.local/share/atuin-backup/* ~/.local/share/atuin 
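
One thing to watch with the steps above: the mount won't survive a reboot on its own. An fstab entry along these lines (dataset path matching the commands above; adjust to your pool) keeps it mounted:

echo "/dev/zvol/rpool/nixos/atuin $HOME/.local/share/atuin ext4 defaults,nofail 0 2" | sudo tee -a /etc/fstab
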
atuin-bot commented 7 months ago

This issue has been mentioned on Atuin Community. There might be relevant details there:

https://forum.atuin.sh/t/the-atuin-daemon/78/1

kbknapp commented 3 months ago

I seem to have hit this issue as well. I asked two of what I would consider the world's foremost experts in ZFS specifically about this and they both had some very interesting observations and thoughts. I'll try to paraphrase what I understood from them:

Looks like atuin is doing a full filesystem sync in every _atuin_preexec; i.e. "Please FS make this write as safe as possible!" which on ZFS is kind of like doing a zpool sync. The reason this "works" on every other filesystem is because they effectively lie when you call sync whereas ZFS does not and will actually force the entire FS to sync before returning.

Possibly also related: in ZFS there is a module parameter zfs_txg_timeout that is set to 5s by default, and its purpose is to flush the ZIL. So it's possible atuin is waiting for the ZIL to be flushed on each _atuin_preexec.

For ZFS it's unnecessary, and potentially harmful to overall system performance, to ask for a full FS sync at each command; the write should just go to the ZIL and ZFS should be left to handle it asynchronously.

So I'm actually wondering (but haven't looked yet) if the bug is in sqlx which is hit from the atuin history start ... command in _atuin_preexec.

Mic92 commented 3 months ago

I don't think this is specific to atuin at all. This is pretty much normal sqlite behaviour when doing a transaction. I tracked it down to the sqlite source code. ZFS just doesn't perform well with sqlite, from the looks of it.

kbknapp commented 3 months ago

Yeah looking at the sqlite source it appears the WAL can be set to "truncate on commit", which issues ftruncate followed by osFcntl(walfd, F_FULLSYNC). I'm assuming it means "write transaction commit" which is what atuin history start is doing. I'm not yet certain why ZFS acts differently in this case though (outside of that ZIL comment above).

kbknapp commented 3 months ago

A little more testing shows that there are "workarounds" atuin could do to not hit this issue. There is the patch listed by @Mic92 but I think that is a more drastic approach since it disables the WAL and such globally.

If atuin changes atuin history start ... to not use a DB transaction and instead just a bare insert query it does not trigger this issue because sqlite does not issue a ftruncate on the WAL file.

Personally, I think this sounds pretty acceptable even to those not on ZFS/affected by this issue, because the actual DB write transaction is a single insert query, so I don't think there is actually anything gained by using a full transaction in this case anyway. I've been running with this patch locally for a few days now and haven't noticed any adverse effects. However, I haven't proposed it as a PR because it'd require some changes to the function type signatures, since currently the functions assume they're receiving a transaction in all cases.

arcuru commented 3 months ago

This shouldn't be an issue after the daemon feature is released, so that all the DB writes are handled off of the critical path. #2006

kmicklas commented 3 months ago

@kbknapp Even if your patch isn't yet PR-ready, would you mind posting it for others to try?

kbknapp commented 3 months ago

The main diff is in atuin-client/src/database.rs:

-    async fn save_raw(tx: &mut sqlx::Transaction<'_, sqlx::Sqlite>, h: &History) -> Result<()> {
+    async fn save_raw(tx: &mut sqlx::SqliteConnection, h: &History) -> Result<()> {

Posting the full diff wouldn't help much as I just did the quick and dirty to make this work as an experiment.

ellie commented 3 months ago

I'm not sure why your change might have helped @kbknapp. Afaik, SQLite will automatically begin and commit a transaction if you don't do it yourself. So while you may not be passing around transactions, they will still be there

As @arcuru said, the daemon changes that are currently in main totally remove the database from the hot path

See this for more: https://forum.atuin.sh/t/weekly-release-2024-19/317

kbknapp commented 3 months ago

I'm not sure why your change might have helped @kbknapp. Afaik, SQLite will automatically begin and commit a transaction if you don't do it yourself. So while you may not be passing around transactions, they will still be there

I couldn't say, as I haven't dug through the sqlite source beyond a quick scan. All I can say is that not explicitly doing a transaction seems to omit the ftruncate on the WAL file (or at least most of the time?). So I can't exactly say where in the pipeline of atuin->sqlx->libsqlite3 the meaningful change is taking place 🤷🏻‍♂️

I'm happy to wait for the daemon feature to merge though; that's exciting news. For this issue I was only posting thoughts here in case people were interested in trying to track this down further for reasons.

Also @ellie, while you're on the thread; thanks for Atuin! I absolutely love it, and it's a game changer I'd have a hard time parting with 😄

ellie commented 3 months ago

Also @ellie, while you're on the thread; thanks for Atuin! I absolutely love it, and it's a game changer I'd have a hard time parting with 😄

Thank you! Always nice to hear that <3

I couldn't say, as I haven't dug through the sqlite source beyond a quick scan.

The docs detail it pretty nicely: https://www.sqlite.org/lang_transaction.html

"Any command that accesses the database (basically, any SQL command, except a few PRAGMA statements) will automatically start a transaction if one is not already in effect. Automatically started transactions are committed when the last SQL statement finishes."

Anyway, there is now a workaround on main (and now in v18.3.0-prerelease.1) that should solve this for everyone. We've introduced Atuin daemon which makes this problem a non-issue. Essentially, we take all of the writes off of the hot path, and make sure that they happen in the background. The only remaining writes are to a unix socket, which should be fast regardless of fs.

You can read more here: https://forum.atuin.sh/t/moving-atuin-to-a-daemon/78

If you'd like to try it, please

  1. Run v18.3.0-prerelease.1. If you're not comfortable doing this, wait until the next stable release :)
  2. Add
[daemon]
enabled = true

to your config file

  3. Run atuin daemon somewhere. You could add a systemd unit file or similar. In the future we will handle this for you, but for now it's up to the user. You can read some more about it here
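
As one option, assuming a systemd user session, the daemon can be started as a transient user unit (the unit name here is arbitrary); a proper enabled user service is the longer-term setup:

systemd-run --user --unit=atuin-daemon --collect atuin daemon
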
timlinux commented 2 months ago

Just a note for those attempting the workaround (using a zfs dataset with sync disabled) while waiting for the daemon solution to go into the general release: The workaround does not work for me:

zfs list -o name,sync | grep disabled
NIXROOT/atuin        disabled
zfs list
NAME                  USED  AVAIL  REFER  MOUNTPOINT
NIXROOT               783G   132G   192K  none
NIXROOT/atuin         192K  49.8M   192K  /home/timlinux/.local/share/atuin
NIXROOT/home          674G   132G   566G  legacy
NIXROOT/reserved        1G   133G   192K  none
NIXROOT/root          107G   132G   107G  legacy

I still regularly get errors like this:

Error: pool timed out while waiting for an open connection

Location:
    /build/source/atuin-client/src/record/sqlite_store.rs:48:20

ellie commented 2 months ago

@timlinux does this work for you?

https://github.com/atuinsh/atuin/issues/952#issuecomment-1902164562

laurentlbm commented 2 months ago

@ellie I'm not @timlinux, but I've been using the workaround from https://github.com/atuinsh/atuin/issues/952#issuecomment-1902164562 for a few months and it's been working great.

timlinux commented 2 months ago

@timlinux does this work for you?

#952 (comment)

Thank you so much @ellie, I have applied the changes you reference and am testing - so far it seems to resolve the issue. My tweaked version of the setup guide:

#!/usr/bin/env bash

# See https://github.com/atuinsh/atuin/issues/952#issuecomment-1902164562
# and
# https://github.com/atuinsh/atuin/issues/952#issuecomment-2141371564

sudo zfs create -V 50MB NIXROOT/atuin
sudo zfs list -o name,encryption | grep atuin # ensure encryption for the zvol
nix-shell -p e2fsprogs.bin --run 'sudo mkfs.ext4 /dev/zvol/NIXROOT/atuin'
mv ~/.local/share/atuin ~/.local/share/atuin-backup
mkdir ~/.local/share/atuin
sudo mount /dev/zvol/NIXROOT/atuin ~/.local/share/atuin
sudo chown -R ${USER}: ~/.local/share/atuin
cp -rv ~/.local/share/atuin-backup/* ~/.local/share/atuin 

boozedog commented 2 months ago

Excited for the new daemon so I can start using Atuin again! :+1:

ellie commented 2 months ago

Excited for the new daemon so I can start using Atuin again! 👍

It's out now as part of v18.3.0! Still experimental, but safe to use

Docs: https://docs.atuin.sh/reference/daemon/

hvisage commented 1 month ago

hmm... my personal concern with this is that, for example on multiuser machines and servers (hypervisors), this daemon adds extra stuff to run, which can go wrong, going a tad against the K.I.S.S. approach we have with atuin per se with the "builtin sqlite".

ZFS is a "standard" in the Proxmox environments I deploy, so it does become a challenge with the daemon popping up everywhere now, even more so in air-gapped environments where I can't just install a .deb anymore.

The separate zvol would/could make sense... to a point, except it's a single-user workaround; how do you make it multi-user? And then I'm hit with my guests in LXC containers, on ZFS, where it is difficult to impractical to deploy at any scale.

If there were a way to force a PRAGMA synchronous=OFF/unsafe type of setting to avoid the synchronous writes, that would suit me (and here I will contend that shell history is an assistance, not a requirement; besides, the idea for me is much more to get the sync to a server going, where it'll make sense/be simple to have ext4 under it).

Just my comments on this issue