hasse69 / rar2fs

FUSE file system for reading RAR archives
https://hasse69.github.io/rar2fs/
GNU General Public License v3.0
275 stars 27 forks source link

Random mounts dropping - Transport endpoint is not connected #171

Closed zeeter82 closed 2 years ago

zeeter82 commented 2 years ago

I'm having random mounts drop on me and I'm not sure why:

running df -h shows this: df: /home/xyz123/plex_mounts/fs02-rar2fs-x264-1080p: Transport endpoint is not connected

This is the second time this has happened in the last couple days and I've done nothing to change configuration. I can resolve it by unmounting share and mounting again. The file server (source of files) doesn't seem to be having any issues.

Any ideas?

karibertils commented 2 years ago

Crashed, when I try to access the mounted folder I get ls: cannot access '/media': Transport endpoint is not connected

But I don't know how to get the backtrace from gdb.

I started it with gdb --args /usr/local/bin/rar2fs --date-rar --no-expand-cbr --seek-length=1 -o noatime,allow_other,rw,umask=000,nonempty /stuff /media

In gdb

(gdb) run
Starting program: /usr/local/bin/rar2fs --date-rar --no-expand-cbr --seek-length=1 -o noatime,allow_other,rw,umask=000,nonempty /stuff /media
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[Detaching after fork from child process 3003]
[Inferior 1 (process 2999) exited normally]
(gdb) bt
No stack.
(gdb) backtrace 
No stack.

Trying again now. I have added -f to the arguments so rar2fs does not detach.

karibertils commented 2 years ago

I have also been having another issue, where rar2fs freezes instead of crashing. df and ls commands hang.

That happened now. The last few lines in GDB were

rar2_release()   
(1206219) RELEASE [0x7fffc001ad40    ]
Closing file handle 0x7fffac02b7d0
(1206219) FREE [0x7fffc001ad40    ]
rar2_releasedir()   
rar2_write()   
[New Thread 0x7fff90ff9700 (LWP 2632934)]
Thread 4378 "rar2fs" received signal SIG32, Real-time event 32.
[Switching to Thread 0x7fff9bbfd700 (LWP 2632914)]
futex_abstimed_wait_cancelable (private=0, abstime=0x7fff9bbfced0, clockid=-1681928624, expected=0, futex_word=0x7fffe000dec4) at ../sysdeps/nptl/futex-internal.h:323
323     ../sysdeps/nptl/futex-internal.h: No such file or directory.
(gdb) 
(gdb) 

Not sure if it's related, should I open another case for that issue ?

hasse69 commented 2 years ago

The last issue should be reported separately. The current thread should focus on the crash, if there is such. Since you did not run rar2fs in the foreground gdb was not able to catch it at least.

karibertils commented 2 years ago

The last issue should be reported separately. The current thread should focus on the crash, if there is such. Since you did not run rar2fs in the foreground gdb was not able to catch it at least.

Ok, I will open a seperate issue.

Regarding the crash rar2fs has been running in foreground since the last one. And still waiting to catch the next one.

karibertils commented 2 years ago

Had a crash

rar2_read()   size=131072, offset=6803337216, fh=140737019132464
PID 2970512 calling lread_raw(), seq = 21100, offset=6803337216
size = 131072, chunk = 196657359
Read 131072 bytes from vol=34, base=1
rar2_read()   size=131072, offset=6803468288, fh=140737019132464
PID 2970512 calling lread_raw(), seq = 21101, offset=6803468288
size = 131072, chunk = 196526287
Read 131072 bytes from vol=34, base=1
rar2_opendir()   /folder1/folder2/folder3
rar2_opendir()   /folder1/folder2/folder3
rar2_opendir()   /folder1/folder2/folder3
rar2_read()   size=131072, offset=6803599360, fh=140737019132464
PID 2970512 calling lread_raw(), seq = 21102, offset=6803599360
size = 131072, chunk = 196395215
Read 131072 bytes from vol=34, base=1
rar2_opendir()   /folder1/folder2/folder3
rar2_read()   size=131072, offset=6803730432, fh=140737019132464
PID 2970512 calling lread_raw(), seq = 21103, offset=6803730432
size = 131072, chunk = 196264143
Read 131072 bytes from vol=34, base=1
rar2_read()   size=131072, offset=6803861504, fh=140737019132464
PID 2970512 calling lread_raw(), seq = 21104, offset=6803861504
size = 131072, chunk = 196133071
Read 131072 bytes from vol=34, base=1
rar2_read()   size=131072, offset=6803992576, fh=140737019132464
PID 2970512 calling lread_raw(), seq = 21105, offset=6803992576
size = 131072, chunk = 196001999
Read 131072 bytes from vol=34, base=1
rar2_read()   size=131072, offset=6804123648, fh=140737019132464
PID 2970512 calling lread_raw(), seq = 21106, offset=6804123648
size = 131072, chunk = 195870927
Read 131072 bytes from vol=34, base=1
rar2_readdir()   
rar2_read()   size=131072, offset=6804254720, fh=140737019132464
PID 2970512 calling lread_raw(), seq = 21107, offset=6804254720
size = 131072, chunk = 195739855
Read 131072 bytes from vol=34, base=1
rar2_opendir()   /folder1/folder2/folder3
rar2_read()   size=131072, offset=6804385792, fh=140737019132464
PID 2970512 calling lread_raw(), seq = 21108, offset=6804385792
size = 131072, chunk = 195608783
Read 131072 bytes from vol=34, base=1
readdir_scan()   /folder1/folder2/folder3
scandir: No such file or directory
rar2_opendir()   /folder1/folder2/folder3
rar2_read()   size=131072, offset=6804516864, fh=140737019132464
PID 2970512 calling lread_raw(), seq = 21109, offset=6804516864
size = 131072, chunk = 195477711
Read 131072 bytes from vol=34, base=1
rar2_read()   size=131072, offset=6804647936, fh=140737019132464
PID 2970512 calling lread_raw(), seq = 21110, offset=6804647936
size = 131072, chunk = 195346639
Read 131072 bytes from vol=34, base=1
rar2_read()   size=131072, offset=6804779008, fh=140737019132464
PID 2970512 calling lread_raw(), seq = 21111, offset=6804779008
size = 131072, chunk = 195215567
Read 131072 bytes from vol=34, base=1
rar2_read()   size=131072, offset=6804910080, fh=140737019132464
PID 2970512 calling lread_raw(), seq = 21112, offset=6804910080
size = 131072, chunk = 195084495
Read 131072 bytes from vol=34, base=1
rar2_read()   size=131072, offset=6805041152, fh=140737019132464
PID 2970512 calling lread_raw(), seq = 21113, offset=6805041152
size = 131072, chunk = 194953423
Read 131072 bytes from vol=34, base=1
rar2_read()   size=131072, offset=6805172224, fh=140737019132464
PID 2970512 calling lread_raw(), seq = 21114, offset=6805172224
size = 131072, chunk = 194822351
Read 131072 bytes from vol=34, base=1
rar2_read()   size=131072, offset=6805303296, fh=140737019132464
PID 2970512 calling lread_raw(), seq = 21115, offset=6805303296
size = 131072, chunk = 194691279
Read 131072 bytes from vol=34, base=1

Thread 597 "rar2fs.debug" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff9ffff700 (LWP 2814925)]
__GI___libc_free (mem=0x532e726574786544) at malloc.c:3102
3102    malloc.c: No such file or directory.
(gdb) bt
#0  __GI___libc_free (mem=0x532e726574786544) at malloc.c:3102
#1  0x0000555555571579 in __resolve_dir (dir=0x7fffe4018740 "/folder1/folder2/folder3", root=0x7fff9fffeae0 "/media/folder1/folder2/folder3", next=0x7fff9fffeb68, next2=0x7fff9fffeb60, f_ops=0x7fff9fffeaa0) at rar2fs.c:3019
#2  0x000055555557171f in readdir_scan (dir=0x7fffe4018740 "/folder1/folder2/folder3", root=0x7fff9fffeae0 "/media/folder1/folder2/folder3", next=0x7fff9fffeb68, next2=0x7fff9fffeb60) at rar2fs.c:3084
#3  0x000055555557277f in rar2_readdir (path=0x7fffe4018740 "/folder1/folder2/folder3", buffer=0x7fffe4101f90, filler=0x7ffff7f8aac0, offset=0, fi=0x7fff9fffec30) at rar2fs.c:3487
#4  0x00007ffff7f8fb29 in ?? () from /lib/x86_64-linux-gnu/libfuse.so.2
#5  0x00007ffff7f97252 in ?? () from /lib/x86_64-linux-gnu/libfuse.so.2
#6  0x00007ffff7f98350 in ?? () from /lib/x86_64-linux-gnu/libfuse.so.2
#7  0x00007ffff7f95039 in ?? () from /lib/x86_64-linux-gnu/libfuse.so.2
#8  0x00007ffff7d7fea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
#9  0x00007ffff7cafdef in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) c
Continuing.
rar2_statfs()   /
Caught signal SIGSEGV
rar2fs[2970512]: Got signal 11, faulty address is (nil), from 0x555555567cd7
rar2fs[2970512]: /usr/local/bin/rar2fs.debug(+0x13cd7) [0x555555567cd7]
rar2fs[2970512]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x14140) [0x7ffff7d8b140]
rar2fs[2970512]: /lib/x86_64-linux-gnu/libc.so.6(cfree+0x1c) [0x7ffff7c3c73c]
rar2fs[2970512]: /usr/local/bin/rar2fs.debug(+0x1d579) [0x555555571579]
rar2fs[2970512]: /usr/local/bin/rar2fs.debug(+0x1d71f) [0x55555557171f]
rar2fs[2970512]: /usr/local/bin/rar2fs.debug(+0x1e77f) [0x55555557277f]
rar2fs[2970512]: /lib/x86_64-linux-gnu/libfuse.so.2(+0xfb29) [0x7ffff7f8fb29]
rar2fs[2970512]: /lib/x86_64-linux-gnu/libfuse.so.2(+0x17252) [0x7ffff7f97252]
rar2fs[2970512]: /lib/x86_64-linux-gnu/libfuse.so.2(+0x18350) [0x7ffff7f98350]
rar2fs[2970512]: /lib/x86_64-linux-gnu/libfuse.so.2(+0x15039) [0x7ffff7f95039]
rar2fs[2970512]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x8ea7) [0x7ffff7d7fea7]
rar2fs[2970512]: /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7ffff7cafdef]

Thread 597 "rar2fs.debug" received signal SIGSEGV, Segmentation fault.
raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:50
50      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) c
Continuing.
rar2_statfs()   /
rar2_flush()   
(2970512) FLUSH [0x5             ][called from 2903696]
lflush()   
rar2_flush()   
(2970512) FLUSH [0x5             ][called from 2833924]
lflush()   
rar2_flush()   
(2970512) FLUSH [0x7fffa01e6ec0  ][called from 2818836]
lflush()   
[Thread 0x7fffb4cc4700 (LWP 2818159) exited]
[Thread 0x7fff9effd700 (LWP 2817558) exited]
[Thread 0x7fff9ffff700 (LWP 2814925) exited]
[Thread 0x7ffff6a01700 (LWP 2812959) exited]
[Thread 0x7fffd4bf7700 (LWP 2812958) exited]
[Thread 0x7fffb6ffd700 (LWP 2812952) exited]
[Thread 0x7fffd57fa700 (LWP 2812833) exited]
[Thread 0x7fffd6ffd700 (LWP 2812829) exited]
[Thread 0x7fffd67fc700 (LWP 2811117) exited]
[Thread 0x7fffb77fe700 (LWP 2770098) exited]
[Thread 0x7ffff7a0d700 (LWP 2970547) exited]

Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.
hasse69 commented 2 years ago

Thanks, no need to continue once you see the first SIGSEGV. So looks like an invalid free() is made.

hasse69 commented 2 years ago

You can try this patch to see if the problem is resolved. Looks like a race between a FUSE opendir() and a readdir() operation in which the opened directory disappears before readdir() is called with a stale handle.

diff --git a/src/rar2fs.c b/src/rar2fs.c
index 1d97d8a..fd2f744 100644
--- a/src/rar2fs.c
+++ b/src/rar2fs.c
@@ -2904,7 +2904,7 @@ static int __resolve_dir(const char *dir, const char *root,
                 struct dir_entry_list **next2,
                 struct filter_ops *f_ops)
 {
-        struct dirent **namelist;
+        struct dirent **namelist = NULL;
         unsigned int f;
         int error_tot = 0;
         int seek_len = 0;
@@ -3014,9 +3014,13 @@ next_entry:
                 }

 next_type:
-                for (i = 0; i < n; i++)
-                        free(namelist[i]);
-                free(namelist);
+                if (namelist) {
+                        for (i = 0; i < n; i++)
+                                free(namelist[i]);
+                        free(namelist);
+                        namelist = NULL;
+                }
+
                 if (ret)
                         break;
         }
karibertils commented 2 years ago

You can try this patch to see if the problem is resolved. Looks like a race between a FUSE opendir() and a readdir() operation in which the opened directory disappears before readdir() is called with a stale handle.

diff --git a/src/rar2fs.c b/src/rar2fs.c
index 1d97d8a..fd2f744 100644
--- a/src/rar2fs.c
+++ b/src/rar2fs.c
@@ -2904,7 +2904,7 @@ static int __resolve_dir(const char *dir, const char *root,
                 struct dir_entry_list **next2,
                 struct filter_ops *f_ops)
 {
-        struct dirent **namelist;
+        struct dirent **namelist = NULL;
         unsigned int f;
         int error_tot = 0;
         int seek_len = 0;
@@ -3014,9 +3014,13 @@ next_entry:
                 }

 next_type:
-                for (i = 0; i < n; i++)
-                        free(namelist[i]);
-                free(namelist);
+                if (namelist) {
+                        for (i = 0; i < n; i++)
+                                free(namelist[i]);
+                        free(namelist);
+                        namelist = NULL;
+                }
+
                 if (ret)
                         break;
         }

Thanks for quick response.

I am running the patched debug version through gdb now. Let's see what happens.

karibertils commented 2 years ago

I'm still unable to reproduce the issue. Looks like the patch fixed it.

hasse69 commented 2 years ago

Great, at least some good news then. The patch is safe and makes perfect sense and will be merged to master.

@zeeter82 have you had a chance to try the patch out? Would like to close this issue if possible.

zeeter82 commented 2 years ago

Great, at least some good news then. The patch is safe and makes perfect sense and will be merged to master.

@zeeter82 have you had a chance to try the patch out? Would like to close this issue if possible.

I haven't tried the patch yet. I'll try to work on this ASAP.

zeeter82 commented 2 years ago

@hasse69 so I've been running latest git + patch + unrarsrc-6.1.2 for a couple days now. It's been stable for me so far with no crashes.

karibertils commented 2 years ago

Also stable for me after 6 days with heavy load. By now I'm 99.99% certain the issue is resolved.

hasse69 commented 2 years ago

Patch has been merged to master.