Open Meister1593 opened 2 years ago
This is a fresh installation?
I'd try deleting the "local_db" file in the /var/lib/refind-btrfs directory, deleting created snapshots in /root/.refind-btrfs (btrfs subvolume delete
This is a fresh installation? I'd try deleting the "local_db" file in the /var/lib/refind-btrfs directory and running the tool manually again.
Yes, it's fresh installation Removing local db and files in /root/.refind-btrfs didnt help, same error /root/.refind-btrfs isnt a subvolume, its just a folder My real snapshots are stored in /.snapshots subvol
Generated writable snapshots are stored in the /root/.refind-btrfs directory because you've set the "modify_read_only_flag" option to false (default value).
Can you print the contents of the btrfs-snapshot-stanzas/root_vmlinuz-linux.conf file? It's a directory relative to rEFInd's root directory.
I think this has something to do with having a separate /boot partition - the logic of matching boot stanzas with snapshots and the root device is different in this case. Recent changes might have broken it, somehow.
I'll have to try and reproduce it locally, later today. The absence of unit tests is starting to become a real nuisance...
Generated writable snapshots are stored in the /root/.refind-btrfs directory because you've set the "modify_read_only_flag" option to false (default value). Can you print the contents of the btrfs-snapshot-stanzas/root_vmlinuz-linux.conf file? It's a directory relative to rEFInd's root directory.
I think this has something to do with having a separate /boot partition - the logic of matching boot stanzas with snapshots and the root device is different in this case. Recent changes might have broken it, somehow.
I'll have to try and reproduce it locally, later today. The absence of unit tests is starting to become a real nuisance...
Here: root_vmlinuz-linux.conf
Yes, i do have separate /boot partition (for luks2 to work properly) this is my layout:
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sda 8:0 0 931.5G 0 disk
├─sda1 8:1 0 512M 0 part /boot
└─sda2 8:2 0 931G 0 part
└─crypt 254:0 0 931G 0 crypt /var/tmp
/var/cache/pacman/pkg
/srv
/home
/btrfs
/.snapshots
/.swapvol
/
I think that this line might have something to do with the issue you're experiencing.
Would it be possible for you to comment it out (or delete it altogether) and try running the refind-btrfs script? It's just so I can narrow down the cause of the problem. I'm not entirely sure if that's going to work on its own though, you should probably also delete these two files:
I'd also stop the systemd service although it probably just exits on its own (unexpected exception).
I deleted these two python files, stopped service and commented these lines. After running refind-btrfs, it didnt fail and seems to work fine
Thank you, that's really helpful information. Fixing it should be easier now.
You could try running the service now and seeing whether or not it works after a snapshot is automatically taken by Snapper.
Doesn't seem like its generating new stanza after creating snapshot... (the service)
Hm, an error should be visible in the journal via journalctl -u refind-btrs -b
.
Did you create a snapshot manually by running snapper -c root create -c number
?
Hm, an error should be visible in the journal via
journalctl -u refind-btrs -b
. Did you create a snapshot manually by runningsnapper -c root create -c number
?
No errors
I did, yes snapper create -t single
No errors I did, yes
snapper create -t single
And a new snapshot was created under /.snapshots? I don't understand why there's no output in the journal at all.
No errors I did, yes
snapper create -t single
And a new snapshot was created under /.snapshots? I don't understand why there's no output in the journal at all.
Yes, they are created in that folder and visible by snapper list
Pre-post snapshots doesnt seem to be triggering service too
What exactly does journalctl -u refind-btrs -b
print now? Same as in your original post?
What exactly does
journalctl -u refind-btrs -b
print now? Same as in your original post?
-- Journal begins at Tue 2021-08-10 23:52:30 +04, ends at Thu 2021-08-12 15:17:51 +04. --
Aug 12 10:40:55 plyshkahost systemd[1]: Starting Generate rEFInd manual boot stanzas from Btrfs snapshots...
Aug 12 10:40:56 plyshkahost /usr/lib/python3.9/site-packages/refind_btrfs/__main__.py[1068]: Scheduling watch for directories: /.snapshots, /.snapshots/1, /.snapshots/10, /.snapshots/11, /.snapshots/12, /.snapshots/13, /.snapshots/14, /.snapshots/15, /.snapshots/16, /.snapshots/17, /.snapshots/18, /.snapshots/19, /.snapshots/2, /.snapshots/20, /.snapshots/21, /.snapshots/22, /.snapshots/23, /.snapshots/24, /.snapshots/25, /.snapshots/26, /.snapshots/27, /.snapshots/28, /.snapshots/29, /.snapshots/3, /.snapshots/30, /.snapshots/31, /.snapshots/32, /.snapshots/33, /.snapshots/34, /.snapshots/35, /.snapshots/36, /.snapshots/37, /.snapshots/38, /.snapshots/39, /.snapshots/4, /.snapshots/40, /.snapshots/41, /.snapshots/5, /.snapshots/6, /.snapshots/7, /.snapshots/8, /.snapshots/9.
Aug 12 10:40:56 plyshkahost /usr/lib/python3.9/site-packages/refind_btrfs/__main__.py[1068]: Starting the observer with PID 1068.
Aug 12 10:40:56 plyshkahost systemd[1]: Started Generate rEFInd manual boot stanzas from Btrfs snapshots.
Aug 12 14:45:58 plyshkahost systemd[1]: Stopping Generate rEFInd manual boot stanzas from Btrfs snapshots...
Aug 12 14:45:58 plyshkahost /usr/lib/python3.9/site-packages/refind_btrfs/__main__.py[1068]: Received terminating signal 15, stopping the observer...
Aug 12 14:45:59 plyshkahost systemd[1]: refind-btrfs.service: Deactivated successfully.
Aug 12 14:45:59 plyshkahost systemd[1]: Stopped Generate rEFInd manual boot stanzas from Btrfs snapshots.
Aug 12 14:45:59 plyshkahost systemd[1]: refind-btrfs.service: Consumed 1.861s CPU time.
Aug 12 14:54:40 plyshkahost systemd[1]: Starting Generate rEFInd manual boot stanzas from Btrfs snapshots...
Aug 12 14:54:40 plyshkahost /usr/lib/python3.9/site-packages/refind_btrfs/__main__.py[19295]: Scheduling watch for directories: /.snapshots, /.snapshots/1, /.snapshots/10, /.snapshots/11, /.snapshots/12, /.snapshots/13, /.snapshots/14, /.snapshots/15, /.snapshots/16, /.snapshots/17, /.snapshots/18, /.snapshots/19, /.snapshots/2, /.snapshots/20, /.snapshots/21, /.snapshots/22, /.snapshots/23, /.snapshots/24, /.snapshots/25, /.snapshots/26, /.snapshots/27, /.snapshots/28, /.snapshots/29, /.snapshots/3, /.snapshots/30, /.snapshots/31, /.snapshots/32, /.snapshots/33, /.snapshots/34, /.snapshots/35, /.snapshots/36, /.snapshots/37, /.snapshots/38, /.snapshots/39, /.snapshots/4, /.snapshots/40, /.snapshots/41, /.snapshots/42, /.snapshots/43, /.snapshots/44, /.snapshots/45, /.snapshots/46, /.snapshots/47, /.snapshots/48, /.snapshots/5, /.snapshots/6, /.snapshots/7, /.snapshots/8, /.snapshots/9.
Aug 12 14:54:40 plyshkahost /usr/lib/python3.9/site-packages/refind_btrfs/__main__.py[19295]: Starting the observer with PID 19295.
Aug 12 14:54:40 plyshkahost systemd[1]: Started Generate rEFInd manual boot stanzas from Btrfs snapshots.
Aug 12 15:00:30 plyshkahost systemd[1]: Stopping Generate rEFInd manual boot stanzas from Btrfs snapshots...
Aug 12 15:00:30 plyshkahost /usr/lib/python3.9/site-packages/refind_btrfs/__main__.py[19295]: Received terminating signal 15, stopping the observer...
Aug 12 15:00:31 plyshkahost systemd[1]: refind-btrfs.service: Deactivated successfully.
Aug 12 15:00:31 plyshkahost systemd[1]: Stopped Generate rEFInd manual boot stanzas from Btrfs snapshots.
Aug 12 15:04:07 plyshkahost systemd[1]: Starting Generate rEFInd manual boot stanzas from Btrfs snapshots...
Aug 12 15:04:07 plyshkahost /usr/lib/python3.9/site-packages/refind_btrfs/__main__.py[20301]: Scheduling watch for directories: /.snapshots, /.snapshots/1, /.snapshots/10, /.snapshots/11, /.snapshots/12, /.snapshots/13, /.snapshots/14, /.snapshots/15, /.snapshots/16, /.snapshots/17, /.snapshots/18, /.snapshots/19, /.snapshots/2, /.snapshots/20, /.snapshots/21, /.snapshots/22, /.snapshots/23, /.snapshots/24, /.snapshots/25, /.snapshots/26, /.snapshots/27, /.snapshots/28, /.snapshots/29, /.snapshots/3, /.snapshots/30, /.snapshots/31, /.snapshots/32, /.snapshots/33, /.snapshots/34, /.snapshots/35, /.snapshots/36, /.snapshots/37, /.snapshots/38, /.snapshots/39, /.snapshots/4, /.snapshots/40, /.snapshots/41, /.snapshots/42, /.snapshots/43, /.snapshots/44, /.snapshots/45, /.snapshots/46, /.snapshots/47, /.snapshots/48, /.snapshots/49, /.snapshots/5, /.snapshots/6, /.snapshots/7, /.snapshots/8, /.snapshots/9.
Aug 12 15:04:07 plyshkahost /usr/lib/python3.9/site-packages/refind_btrfs/__main__.py[20301]: Starting the observer with PID 20301.
Aug 12 15:04:07 plyshkahost systemd[1]: Started Generate rEFInd manual boot stanzas from Btrfs snapshots.
After last entry, I did create snapshots 2-3 times
If it does work it should output various information to the journal, not just that it started.
Btw, I also noticed on my own machine that the filesystem events are not raised and/or caught by (not sure) the Watchdog library which I'm using for this feature. It happens exactly at midnight but not every day, pretty weird.
The official package repo now has a pretty outdated version of this library. I'm the one who flagged it, if I remember correctly.
If it does work it should output various information to the journal, not just that it started.
Btw, I also noticed on my own machine that the filesystem events are not raised and/or caught by (not sure) the Watchdog library which I'm using for this feature. It happens exactly at midnight but not every day, pretty weird.
The official package repo now has a pretty outdated version of this library. I'm the one who flagged it, if I remember correctly.
Gonna install git version ASAP)
You mean this one? Hm, maybe that does the trick but I'm not too sure...
Got this error after making snapshot
Aug 12 15:49:54 plyshkahost systemd[1]: Starting Generate rEFInd manual boot stanzas from Btrfs snapshots...
Aug 12 15:49:54 plyshkahost /usr/lib/python3.9/site-packages/refind_btrfs/__main__.py[27571]: Scheduling watch for directories: /.snapshots, /.snapshots/1, /.snapshots/10, /.snapshots/11, /.snapshots/12, /.snapshots/13, /.snapshots/14, /.snapshots/15, /.snapshots/16, /.snapshots/17, /.snapshots/18, /.snapshots/19, /.snapshots/2, /.snapshots/20, /.snapshots/21, /.snapshots/22, /.snapshots/23, /.snapshots/24, /.snapshots/25, /.snapshots/26, /.snapshots/27, /.snapshots/28, /.snapshots/29, /.snapshots/3, /.snapshots/30, /.snapshots/31, /.snapshots/32, /.snapshots/33, /.snapshots/34, /.snapshots/35, /.snapshots/36, /.snapshots/37, /.snapshots/38, /.snapshots/39, /.snapshots/4, /.snapshots/40, /.snapshots/41, /.snapshots/42, /.snapshots/43, /.snapshots/44, /.snapshots/45, /.snapshots/46, /.snapshots/47, /.snapshots/48, /.snapshots/49, /.snapshots/5, /.snapshots/50, /.snapshots/51, /.snapshots/52, /.snapshots/53, /.snapshots/54, /.snapshots/6, /.snapshots/7, /.snapshots/8, /.snapshots/9.
Aug 12 15:49:54 plyshkahost /usr/lib/python3.9/site-packages/refind_btrfs/__main__.py[27571]: Starting the observer with PID 27571.
Aug 12 15:49:54 plyshkahost systemd[1]: Started Generate rEFInd manual boot stanzas from Btrfs snapshots.
Aug 12 15:50:50 plyshkahost /usr/lib/python3.9/site-packages/refind_btrfs/__main__.py[27571]: ERROR (refind_btrfs.service.snapshot_observer/snapshot_observer.py/run): An unexpected error happened, exiting...
Traceback (most recent call last):
File "/usr/lib/python3.9/site-packages/refind_btrfs/service/snapshot_observer.py", line 45, in run
self.dispatch_events(self.event_queue, self.timeout)
File "/usr/lib/python3.9/site-packages/watchdog/observers/api.py", line 372, in dispatch_events
handler.dispatch(event)
File "/usr/lib/python3.9/site-packages/watchdog/events.py", line 282, in dispatch
self.event_dispatch_map[event.event_type](event)
AttributeError: 'SnapshotEventHandler' object has no attribute 'event_dispatch_map'
Aug 12 15:50:50 plyshkahost systemd[1]: refind-btrfs.service: Main process exited, code=exited, status=1/FAILURE
Aug 12 15:50:50 plyshkahost systemd[1]: refind-btrfs.service: Failed with result 'exit-code'.
That version is even more outdated, it seems (breaking API changes) but at least something happened, I guess.
So, whats next thing to do, how can I help?
Well, currently nothing. I'll try to fix the issue regarding the error you've seen (script's output) and then we'll just have to see when I make a new release.
I've never asked anyone to delete CPython's cache before (nor did I try that myself), there could be some residue left of which I'm not aware. It's best to overwrite everything with a new version (both source and bytecode files).
I'll also check the service's output on my own machine today even though it worked just fine yesterday.
I've found the culprit (I hope so), pushed the fix and created a new release. Please check it out when you can and report back here.
I've found the culprit (I hope so), pushed the fix and created a new release. Please check it out when you can and report back here.
I just installed new version, got warning about not found cpython file but after installation it got installed just fine
checked that file where i commented out code, it got uncommented (which indicates it was replaced by new version)
checked refind-btrfs
- no errors and it worked just fine
service, of course, still doesnt work
This is the output I get with a freshly installed version and a snapshot created at 20:00 PM (18:00 UTC):
I really have no idea why it doesn't work on your machine. What is so special about your setup, I wonder? You're using Snapper as am I but that shouldn't really matter.
Damn, I've just noticed that Watchdog currently has 142 open issues... I wish there was an alternative library, at this point.
I use luks2, i have two additional read-only disks (old disks that i currently use as migration storage to new), they are luks2 too both and one of them have 2 efistubs on fat32 boot partition, which i can boot into. I mostly followed this guide for setting up btrfs on arch, since i barely knew about btrfs and wanted some basic guide to make system with btrfs (i did make other luks setup mostly by myself)
The same author has another guide, as well. I don't know why he felt the need to have more than one.
He also had a couple of issues with refind-btrfs which were fixed a few months ago.
I mean it's entirely possible to use inotify directly and run the script when needed but I didn't want to go with that option because Watchdog already utilizes it on Linux and it's been working more or less fine for me and other users, it seems.
You are the first person who's experiencing these issues, unfortunately, so I'm somewhat in the dark here.
Hm, I've found this interesting piece of information. There is apparently a maximum amount of watchers allowed by inotify. Perhaps you could try increasing it and see what happens then? Mine is set at 524288, try setting it to 1048576 and rebooting the system.
You can check the current value with cat /proc/sys/fs/inotify/max_user_watches
.
EDIT: This setting can be changed temporarily, as well (the first link).
Hm, I've found this interesting piece of information. There is apparently a maximum amount of watchers allowed by inotify. Perhaps you could try increasing it and see what happens then? Mine is set at 524288, try setting it to 1048576 and rebooting the system. You can check the current value with
cat /proc/sys/fs/inotify/max_user_watches
.EDIT: This setting can be changed temporarily, as well (the first link).
I increased that watchdog limit via sysctl and it didn't work either, same 4 lines in status...
As of now i decided to make at least temporary hack to run refind-btrfs on startup and shutdown:
refind-btrfs-create.service
[Unit]
Description=Populate snapshots on startup and shutdown
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/refind-btrfs
ExecStop=/usr/bin/refind-btrfs
[Install]
WantedBy=multi-user.target
Pretty weird, this issue definitely requires further investigation.
I have the same problem, increase the inotifywait, and check the ones in use. 1701 WATCHES TOTAL COUNT https://github.com/fatso83/dotfiles/blob/master/utils/scripts/inotify-consumers installation similar to Th3Whit3Wolf
ERROR (refind_btrfs/__init__.py/main): An unexpected error happened, exiting...
Traceback (most recent call last):
File "/usr/lib/python3.10/site-packages/refind_btrfs/__init__.py", line 80, in main
exit_code = runner.run()
File "/usr/lib/python3.10/site-packages/refind_btrfs/service/watchdog_runner.py", line 93, in run
observer.start()
File "/usr/lib/python3.10/site-packages/watchdog/observers/api.py", line 259, in start
emitter.start()
File "/usr/lib/python3.10/site-packages/watchdog/utils/__init__.py", line 116, in start
self.on_thread_start()
File "/usr/lib/python3.10/site-packages/watchdog/observers/inotify.py", line 120, in on_thread_start
self._inotify = InotifyBuffer(path, self.watch.is_recursive)
File "/usr/lib/python3.10/site-packages/watchdog/observers/inotify_buffer.py", line 35, in __init__
self._inotify = Inotify(path, recursive)
File "/usr/lib/python3.10/site-packages/watchdog/observers/inotify_c.py", line 189, in __init__
Inotify._raise_error()
File "/usr/lib/python3.10/site-packages/watchdog/observers/inotify_c.py", line 426, in _raise_error
raise OSError(errno.EMFILE, "inotify instance limit reached")
OSError: [Errno 24] inotify instance limit reached
I have the same problem, increase the inotifywait, and check the ones in use. 1701 WATCHES TOTAL COUNT https://github.com/fatso83/dotfiles/blob/master/utils/scripts/inotify-consumers installation similar to Th3Whit3Wolf
ERROR (refind_btrfs/__init__.py/main): An unexpected error happened, exiting... Traceback (most recent call last): File "/usr/lib/python3.10/site-packages/refind_btrfs/__init__.py", line 80, in main exit_code = runner.run() File "/usr/lib/python3.10/site-packages/refind_btrfs/service/watchdog_runner.py", line 93, in run observer.start() File "/usr/lib/python3.10/site-packages/watchdog/observers/api.py", line 259, in start emitter.start() File "/usr/lib/python3.10/site-packages/watchdog/utils/__init__.py", line 116, in start self.on_thread_start() File "/usr/lib/python3.10/site-packages/watchdog/observers/inotify.py", line 120, in on_thread_start self._inotify = InotifyBuffer(path, self.watch.is_recursive) File "/usr/lib/python3.10/site-packages/watchdog/observers/inotify_buffer.py", line 35, in __init__ self._inotify = Inotify(path, recursive) File "/usr/lib/python3.10/site-packages/watchdog/observers/inotify_c.py", line 189, in __init__ Inotify._raise_error() File "/usr/lib/python3.10/site-packages/watchdog/observers/inotify_c.py", line 426, in _raise_error raise OSError(errno.EMFILE, "inotify instance limit reached") OSError: [Errno 24] inotify instance limit reached
Interesting, it could be caused by either one of these:
Arch is still using the old 0.10.x watchdog's branch intended for what is now a prehistoric Python version instead of the newer 1.0.x branch intended for Python 3.6 and newer.
I've personally flagged the package as outdated, months ago.
Interesting, it could be caused by either one of these:
* a pretty huge number of snapshots
if I had too many snapshots, the cleanup service had stopped. delete the local_db, and the confs in refind now works
Because of this, i need to manually run
refind-btrfs
each time i want to populate refind with snapshots. Alljournalctl -u refind-btrs -b
says is this:And also when i run this, it gives me such error.
configs: refind-btrfs.conf refind.conf