linuxmint / timeshift

System restore tool for Linux. Creates filesystem snapshots using rsync+hardlinks, or BTRFS snapshots. Supports scheduled snapshots, multiple backup levels, and exclude filters. Snapshots can be restored while system is running or from Live CD/USB.
1.53k stars 75 forks

Timeshift Segmentation fault on restore. #278

Open matthew-henry opened 4 months ago

matthew-henry commented 4 months ago

Describe the bug
Timeshift restore from the GUI fails silently (BTRFS snapshot). Timeshift from the CLI hits a segmentation fault (BTRFS snapshot); I'm not sure how to get more debugging info.

I'm not 100% sure this isn't caused by my partition layout being set up incorrectly (sorry if I am reporting user error).

To Reproduce
Steps to reproduce the behavior:

  1. sudo timeshift --restore
  2. Select any snapshot
  3. Program exits with Segmentation Fault

Expected behavior
Expected the snapshot to be restored.

Screenshots

Timeshift error

System:

[matt@Matt-Desktop ~]$ sudo btrfs subvolume list /
ID 648 gen 136175 top level 5 path @
ID 990 gen 136170 top level 5 path swap
ID 1034 gen 136170 top level 5 path timeshift-btrfs/snapshots/2024-02-25_14-55-27/@
ID 1040 gen 136170 top level 5 path timeshift-btrfs/snapshots/2024-02-25_15-18-47/@
ID 1042 gen 136170 top level 5 path timeshift-btrfs/snapshots/2024-02-25_15-24-05/@
ID 1670 gen 136175 top level 5 path timeshift-btrfs/snapshots/2024-02-25_19-12-17/@

[matt@Matt-Desktop ~]$ sudo btrfs subvolume list /home
ID 256 gen 64430 top level 5 path matt/@

Found and attaching log file. 2024-02-25_21-43-17_restore.log

=======================================================================

Stepping through things, it looks like we go off the rails right after init_mount_list(), which appears to read from fstab, so I'm including my fstab below.

UUID=1E2F-5409 /boot/efi vfat noatime 0 2
UUID=baf2eb9e-0edc-4fd9-86b3-8959f11bff33 / btrfs defaults,subvol=/@ 0 1
/swap/swapfile none swap defaults 0 0
/dev/sda3 /storagedrv ext4 defaults 0 0
/dev/nvme0n1p2 /home btrfs defaults 0 0
/dev/nvme1n1p1 /mnt/new_game_drive btrfs defaults 0 0

Looking at the log, I noticed the following line:

[21:43:24] missing: dev: UUID=baf2eb9e-0edc-4fd9-86b3-8959f11bff33, path: /, options: noatime

This does not match the options I have set for that volume in fstab (defaults,subvol=/@).
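To make the mismatch concrete, here is a minimal Python sketch that splits the fstab quoted above into fields and prints the options recorded for "/". It is only an illustration of reading fstab by hand, not Timeshift's own parsing code (which is written in Vala); the function name parse_fstab is made up for this example.

```python
# Sketch: split fstab lines into fields to see the options recorded for "/".
# FSTAB_TEXT is copied from the fstab quoted in this report.
FSTAB_TEXT = """\
UUID=1E2F-5409 /boot/efi vfat noatime 0 2
UUID=baf2eb9e-0edc-4fd9-86b3-8959f11bff33 / btrfs defaults,subvol=/@ 0 1
/swap/swapfile none swap defaults 0 0
/dev/sda3 /storagedrv ext4 defaults 0 0
/dev/nvme0n1p2 /home btrfs defaults 0 0
/dev/nvme1n1p1 /mnt/new_game_drive btrfs defaults 0 0
"""


def parse_fstab(text):
    """Return a dict mapping mount point -> (device, fstype, options list).

    parse_fstab is a hypothetical helper for this example only.
    """
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete fstab entry
        device, mount_point, fstype, options = fields[:4]
        entries[mount_point] = (device, fstype, options.split(","))
    return entries


entries = parse_fstab(FSTAB_TEXT)
root_device, root_fstype, root_options = entries["/"]
print(root_device, root_fstype, root_options)
```

In this fstab, noatime appears only on the /boot/efi entry, while the "/" entry carries defaults and subvol=/@. Whether the logged options actually come from the neighbouring entry is not something this sketch can confirm.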

ILYAGVC commented 4 months ago

I have exactly the same problem.

matthew-henry commented 4 months ago

Do you think you could grab a log and your fstab output as well? I was trying to go through the code to find where things break (I've never seen Vala before) and am not having much luck. I'm curious whether there's something similar about our configurations, or whether we are having related problems rather than identical ones.

ILYAGVC commented 4 months ago

I couldn't wait for the problem to be fixed, so I used the GUI. Sorry.

matthew-henry commented 4 months ago

Eh no problem. GUI also doesn't work for me so it might be multiple things wrong. Not even sure if I can do something, but figured I should make an attempt if I could figure out what's happening (not having much luck or time to look into it). Thanks for at least confirming I'm not the only one impacted.

ILYAGVC commented 4 months ago

You need to install GUI manually. Timeshift works well on Intel and AMD CPUs

matthew-henry commented 4 months ago

I have the GUI installed. It also takes snapshots just fine from the GUI (I didn't try to use the CLI for restore until I had a broken desktop and had to work from a TTY). But it crashes in the same function as the CLI version if you click the restore button. I was trying to see whether an oddity in my partition layout causes the crashes. Checking the logs, it looks like it fails to work out some information about my root partition when restoring.

[21:43:24] missing: dev: UUID=baf2eb9e-0edc-4fd9-86b3-8959f11bff33, path: /, options: noatime

The last log entry before the crash (CLI or GUI) is always get_restore_messages(), called from Main.

I was trying to trace through the code but I don't see what could be happening. Might need to try building with debug flags and tracing with gdb when I have time.

matthew-henry commented 4 months ago

Built with debug symbols and ran it through gdb. The program crashes in restore_current_system on line 1895.

image

Stepping through with gdb, right before the segfault get_dst_root is invoked and falls through to returning null. Even so, I'm not 100% certain why a segfault happens on the line in question (unless trying to access a field of a null object causes a segfault). If returning null from these functions can cause segfaults, the null case should probably be handled explicitly (to be clear, I'm not sure that that IS the cause).

I'm assuming that the root cause is probably something about my partition setup where my root doesn't seem to be parsed. Whether it's a configuration issue on my end or an edge case, there is some situation that leads to a seg fault that could use some catching.
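The kind of guard being suggested can be sketched as follows. This is a Python analogue written for illustration, not the actual Vala code in Main.vala: only the name get_dst_root comes from the gdb session above, while the Device class, the mount_list shape, and describe_restore_target are assumptions made up for the example. In Python a missing object raises an exception rather than a segfault, so the sketch only shows the control flow of checking before use.

```python
from typing import List, Optional, Tuple


class Device:
    """Hypothetical stand-in for a resolved block device."""

    def __init__(self, device: str, uuid: str) -> None:
        self.device = device
        self.uuid = uuid


def get_dst_root(mount_list: List[Tuple[str, Device]]) -> Optional[Device]:
    """Return the device mounted at "/", or None if no entry matches."""
    for mount_point, dev in mount_list:
        if mount_point == "/":
            return dev
    return None  # falls through when root was never resolved


def describe_restore_target(mount_list: List[Tuple[str, Device]]) -> str:
    """Hypothetical caller that checks the result before using it."""
    dst_root = get_dst_root(mount_list)
    if dst_root is None:
        # Fail with a clear message instead of touching a missing object.
        raise RuntimeError("restore aborted: no device found for '/'")
    return f"restoring to {dst_root.device} ({dst_root.uuid})"
```

With a guard like this, a layout where root cannot be resolved would produce an error message naming the problem rather than an unexplained crash.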

ILYAGVC commented 4 months ago

It's evident that Timeshift is attempting to access a non-existent resource. As previously mentioned, Timeshift performs smoothly on both AMD and Intel CPUs. You may need to await optimization from the developer to extend compatibility to other CPUs, particularly in command-line usage.

matthew-henry commented 4 months ago

This has nothing to do with CPU compatibility. I am running on a 7950X (an AMD x86 CPU). Even if I were on a different platform, the code doesn't use CPU-specific low-level features, so any platform that Vala compiles to should work more or less.

This has everything to do with unhandled conditions in the code (in Main.vala, so in my situation both the GUI and the CLI are affected, as both call into Main.vala when performing restores) and with the way my fstab is formatted (the processing code fails to assign a device to main, as determined from the logs and from stepping through with gdb). The traces provided are necessary details for any developer not affected by the issue to know where and how the crash is happening. Without logs and details on your issue, we may very well have entirely separate crash causes and conditions.

I am trying to definitively identify a root cause for if and when a developer does look at this (especially as I haven't provided definitive steps to reproduce yet, so they need something to go on for it to be worth their time). And if it's something I think I can fix, I may open a pull request.

matthew-henry commented 4 months ago

I believe I have tracked down the reason my root is not found on the filesystem probe.

get_block_devices_using_lsblk in Utility/Devices.vala executes:

lsblk --bytes --pairs --output NAME,KNAME,LABEL,UUID,TYPE,FSTYPE,SIZE,MOUNTPOINT,MODEL,RO,HOTPLUG,MAJ:MIN,PARTLABEL,PARTUUID,PKNAME,VENDOR,SERIAL,REV

image

Running this command, however, will not find a root partition on my system.

image

Looking at a basic lsblk:

image

There are two mountpoints listed for the partition containing root.

I'm assuming the "missing" root entry in the logs follows from this.

image

Per the man page for lsblk:

image

So in situations like mine, the MOUNTPOINTS column would be needed, and its output parsed appropriately.
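As a rough illustration of what parsing MOUNTPOINTS could look like, here is a Python sketch that searches lsblk's JSON output for the device holding a given mount point. It assumes a util-linux version whose lsblk supports the MOUNTPOINTS column, where `lsblk --json --output NAME,MOUNTPOINTS` reports each device's mount points as a list. The sample data, device names, and the helper names iter_devices and find_device_for are all hypothetical, and Timeshift itself uses --pairs output in Vala, so this is not a drop-in fix.

```python
import json

# Hypothetical sample shaped like `lsblk --json --output NAME,MOUNTPOINTS`.
# Device names and mount points are invented for this example.
SAMPLE = """
{
  "blockdevices": [
    {"name": "sdx", "mountpoints": [null],
     "children": [
       {"name": "sdx1", "mountpoints": ["/boot/efi"]},
       {"name": "sdx2", "mountpoints": ["/", "/mnt/example"]}
     ]}
  ]
}
"""


def iter_devices(nodes):
    """Yield every device in the tree, including nested children."""
    for node in nodes:
        yield node
        yield from iter_devices(node.get("children", []))


def find_device_for(mount_point, lsblk_json):
    """Return the name of the device mounted at mount_point, or None."""
    data = json.loads(lsblk_json)
    for dev in iter_devices(data["blockdevices"]):
        # Unmounted devices report [null]; drop those placeholder entries.
        mountpoints = [m for m in dev.get("mountpoints", []) if m]
        if mount_point in mountpoints:
            return dev["name"]
    return None


print(find_device_for("/", SAMPLE))
```

The point of the sketch is that a partition with two mount points is still matched on "/", because every entry in the list is checked rather than a single MOUNTPOINT value.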

khaliid2040 commented 4 weeks ago

The issue is still present in version 24.06.1-1. Steps to reproduce are the same as in the original post:

  1. timeshift --restore
  2. Select the snapshot and press Enter. At the GRUB prompt, the segfault happens whether you select yes or no.

Expected behaviour: the bug is fixed in the previous versions and the reported