hasse69 / rar2fs

FUSE file system for reading RAR archives
https://hasse69.github.io/rar2fs/
GNU General Public License v3.0

Creating sub-processes using 100% CPU when content is added #165

Closed: philnse closed this issue 2 years ago

philnse commented 3 years ago

After hours of research into my problem I'm desperately posting here, as I just can't figure out how to solve the issue. I'm using rar2fs with an instance of Plex. For the last ten days, rar2fs was running as expected: one instance per mount. After adding two packed episodes overnight I ended up with 100% CPU load for the mount that handles the series section. This is an issue I have had several times in the past. It seems to be the same issue addressed in #11.

If I remember correctly, the suspected reason was the Plex Media Server wanting to add files to the mounted but still rar'd files. I figure it has something to do with the chapter images that Plex generates using the transcoder, as no files are saved in the original content folder, only in the Plex Media Server's library. I'm having a hard time reproducing the error. Could it be that some kind of cache fills up with data when those chapter images are generated, and rar2fs crashes? I've attached an excerpt of my Webmin to clarify:

Screenshot

Please let me know if you need any logs. I'm happy to provide them.

//EDIT: I forgot to mention what I have tried so far to solve it:
- run as root
- run as regular user
- --seek-length=0, --seek-length=1
- --no-smp
- killing excess/crashed instances of rar2fs, which results in losing the mounted content

Regards

My system is running:

Ubuntu 18.04.5
rar2fs v1.29.5-gita393a68 (DLL version 8)    Copyright (C) 2009 Hans Beckerus
FUSE library version: 2.9.7
fusermount version: 2.9.7
UNRAR 6.02 beta 1 freeware
Plex Media Server Version 1.23.4.4712

hasse69 commented 3 years ago

Sorry to hear this old problem (if it is the same) seems to persist. I have not heard any reports of this in a very long time.

What I need from you to even have the slightest chance to find the root cause is a gdb stack dump from a process that is stuck at 100% load. You would attach gdb to the pid and then dump the stack using the 'bt' command. You might need to rebuild rar2fs using '--enable-debug' in case the stack trace lacks any readable symbols and run the non-stripped binary from the 'src' directory. Thus not the one installed by 'make install'.

hasse69 commented 3 years ago

If you wish to test some other version I think you need to go back 4-5 years to v1.23.1. This is when pipes were replaced with condition variables. It is the only thing I can think of that would perhaps cause a regression like this.

philnse commented 3 years ago

Thank you so much for the quick answer! As I've been having the issues since I started updating to newer versions of rar2fs as soon as possible, I'll try your suggestion of using the older version first. I'm kind of overwhelmed by your first comment, as I'm not exactly sure what you're asking me to do! :)

I'll get back after some testing with the older version.

Thanks for this great software!

Regards

philnse commented 3 years ago

So I tried to install the version you mentioned and ended up with an error as in #85. After applying the patch and make gets to ./unrar I get this error:

Screen (Sorry for the screenshot, I couldn't manage to post it properly as code.)

I guess it has something to do with the unrar version? Tried 6.0.7 and 5.5.6 for compiling while 5.5.0 is installed on the system.

I'm happy to try the things you described with the most recent version of rar2fs, as it seems to involve some fiddling anyway. If you could just clarify what exactly to do, I'm happy to do so.

Regards

hasse69 commented 3 years ago

What version did you use before you hit this problem?

philnse commented 3 years ago

I was using v1.29.5.

hasse69 commented 3 years ago

But I was under the impression you hit this problem when moving to 1.29.5 so what version did you upgrade from?

philnse commented 3 years ago

I tried to downgrade from 1.29.5 to 1.23.1 and ended up with those errors.

So currently I'm still using 1.29.5 with unrar 5.50 because installing 1.23.1 didn't work. After remounting with 1.29.5 and unrar 5.50 everything is working right now. It's kind of hard for me to reproduce the CPU load failure as it could take days for it to occur again. Maybe you could explain what you meant with:

> What I need from you to even have the slightest chance to find the root cause is a gdb stack dump from a process that is stuck at 100% load. You would attach gdb to the pid and then dump the stack using the 'bt' command. You might need to rebuild rar2fs using '--enable-debug' in case the stack trace lacks any readable symbols and run the non-stripped binary from the 'src' directory. Thus not the one installed by 'make install'.

I'm happy to do this as soon as the CPU load issue comes up again! I could try to figure it out by myself with stackoverflow, but that could take ages! ;)

hasse69 commented 3 years ago

OK, my only question was: when did you start seeing this issue? You moved to 1.29.5 at some point, right? So what version were you using before that, since you did not report this problem until now?

I will get back to you about details on how to attach gdb to a running process and dump the stack.

philnse commented 3 years ago

Ahh, sorry, I didn't get that. I figured it might have something to do with my server config, so I tried other stuff before. rar2fs worked flawlessly for years and I wouldn't have thought it might break like this.

Sadly I don't know exactly. I would guess at least since 1.29.0.

hasse69 commented 3 years ago

Easiest way to attach gdb would be

gdb -p `pidof your_running_program_name`

However, since rar2fs spawns many processes/threads, the best way is to use the 'ps' or 'top' command to find the exact process id that is using a lot of CPU, and then

gdb -p <pid>

Once in gdb you will get a '(gdb)' prompt. If input cannot be given, press CTRL-C to interrupt the process and then run the gdb 'bt' command.

In addition to this an strace might be useful. So in the same manner run the Linux 'strace' command and log output to some file

strace -p <pid> -o strace.txt

Note that strace will not terminate since the process is more than likely stuck in a loop so just let strace run for a few seconds and then terminate it using CTRL-C.
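A time-limited variant of the strace step can be sketched as below. This is only a sketch under assumptions: `trace_for` is a hypothetical helper name, `timeout` is the coreutils tool, and `-f` (follow all threads) is an addition not mentioned in the thread. strace writes its trace to stderr by default, so `-o` is used to get the trace into a file.

```shell
# Hypothetical helper: trace a running process for a few seconds, then detach.
# Usage: trace_for <pid> [seconds] [outfile]
trace_for() {
    pid=$1
    secs=${2:-5}
    out=${3:-strace.txt}
    rc=0
    # timeout sends SIGINT after $secs seconds, the same signal CTRL-C would send.
    # -f follows all threads of the process, -o writes the trace to a file.
    timeout -s INT "$secs" strace -f -p "$pid" -o "$out" || rc=$?
    # timeout exits with 124 when the time limit was reached, which is expected here.
    if [ "$rc" -eq 124 ]; then
        rc=0
    fi
    return "$rc"
}
```

For example, `trace_for 14075 5 strace.txt` would trace PID 14075 for about five seconds. Attaching to a process owned by another user normally requires root.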

Have you been able to reproduce the issue yet? If you manage to find an easy way to do that, go back to some previous version like 1.29.4 and confirm whether it works. If not, step back a few more versions, and so on. It would provide valuable input for me to understand what code delta we are dealing with here as well.

And sorry for the late reply, I have been rather busy at work recently and have not had a single moment of spare time.

philnse commented 3 years ago

Absolutely no problem! I would never expect quick answers anyway. It's free software you're maintaining in your free time. I totally understand it doesn't have highest priority.

I fell back one major version and am now using version 1.28. The plan is to check if everything is working here and slowly work my way up to a version which is not. I have had processes running since July 1st, added content, and had neither sub-processes nor high CPU load. So I guess I'll let it run for a week or until problems occur.

After that I'll use the commands you provided with a newer version which is causing problems.

Thanks again and have a nice weekend!

philnse commented 3 years ago

So it seems like the problem persists in v1.28 as well. I have one process with one sub-process so far (14075 and 14078) using 100% of CPU. Here's the output of gdb:

Attaching to process 14075
[New LWP 14077]
[New LWP 14078]
[New LWP 14079]
[New LWP 29427]
[New LWP 2850]
[New LWP 4000]
[New LWP 7034]
[New LWP 7037]
[New LWP 7044]
[New LWP 7049]
[New LWP 7052]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007f2af45627a0 in __GI___nanosleep (requested_time=requested_time@entry=0x7ffc0fc6c7a0, remaining=remaining@entry=0x7ffc0fc6c7a0) at ../sysdeps/unix/sysv/linux/nanosleep.c:28
28      ../sysdeps/unix/sysv/linux/nanosleep.c: No such file or directory.

and the output of strace running a couple of seconds:

strace -p 14075 > strace.txt
strace: Process 14075 attached
restart_syscall(<... resuming interrupted nanosleep ...>) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffc0fc6c7a0) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffc0fc6c7a0) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffc0fc6c7a0) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffc0fc6c7a0) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffc0fc6c7a0) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffc0fc6c7a0) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffc0fc6c7a0) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffc0fc6c7a0) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffc0fc6c7a0) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffc0fc6c7a0) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffc0fc6c7a0) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffc0fc6c7a0) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffc0fc6c7a0) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffc0fc6c7a0) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffc0fc6c7a0) = 0
nanosleep({tv_sec=1, tv_nsec=0}, ^Cstrace: Process 14075 detached
 <detached ...>

I guess it means I have to recompile using --enable-debug?

hasse69 commented 3 years ago

Yes I think you should recompile using --enable-debug. But gdb output looks a bit strange. Did you really interrupt using CTRL-C? I don't see the output from bt command.

philnse commented 2 years ago

Here's the output using the bt command. Not a lot more information given. 'top' now shows a CPU load of 250% for this process because of all of these sub-processes. I'll recompile later with --enable-debug. Could you explain how to do that when not using 'make install'?

Attaching to process 14049
[New LWP 14050]
[New LWP 14051]
[New LWP 14052]
[New LWP 29414]
[New LWP 29804]
[New LWP 29832]
[New LWP 29883]
[New LWP 30602]
[New LWP 30623]
[New LWP 30630]
[New LWP 6463]
[New LWP 1838]
[New LWP 20226]
[New LWP 5452]
[New LWP 17786]
[New LWP 23038]
[New LWP 28384]
[New LWP 28518]
[New LWP 1268]
[New LWP 1270]
[New LWP 1272]
[New LWP 1274]
[New LWP 1276]
[New LWP 1278]
[New LWP 1280]
[New LWP 1282]
[New LWP 1284]
[New LWP 1286]
[New LWP 1288]
[New LWP 1290]
[New LWP 1292]
[New LWP 1294]
[New LWP 1296]
[New LWP 1298]
[New LWP 1300]
[New LWP 1302]
[New LWP 1304]
[New LWP 1307]
[New LWP 1309]
[New LWP 1313]
[New LWP 1315]
[New LWP 1318]
[New LWP 1330]
[New LWP 1332]
[New LWP 1335]
[New LWP 1338]
[New LWP 1340]
[New LWP 1342]
[New LWP 1344]
[New LWP 1346]
[New LWP 1351]
[New LWP 1354]
[New LWP 1357]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007f6dde7247a0 in __GI___nanosleep (requested_time=requested_time@entry=0x7ffd4d654160, remaining=remaining@entry=0x7ffd4d654160) at ../sysdeps/unix/sysv/linux/nanosleep.c:28
28      ../sysdeps/unix/sysv/linux/nanosleep.c: No such file or directory.
(gdb) bt
#0  0x00007f6dde7247a0 in __GI___nanosleep (requested_time=requested_time@entry=0x7ffd4d654160, remaining=remaining@entry=0x7ffd4d654160)
    at ../sysdeps/unix/sysv/linux/nanosleep.c:28
#1  0x00007f6dde72467a in __sleep (seconds=0) at ../sysdeps/posix/sleep.c:55
#2  0x000055bd80ca7824 in work (args=0x7ffd4d6541f0) at rar2fs.c:5070
#3  main (argc=<optimized out>, argv=<optimized out>) at rar2fs.c:5355
(gdb) Quit

hasse69 commented 2 years ago

No, the problem is that what you see in gdb is the main thread, which is simply sleeping and causing 0 CPU. In gdb you need to list the threads ('info threads'), switch to them one by one using 't <n>', and run 'bt' in each. I thought gdb was more clever than this but apparently not.
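Walking every thread by hand can be tedious when there are dozens of LWPs. A minimal non-interactive sketch is shown below, assuming a gdb that supports the standard `-p`, `-batch` and `-ex` options; `dump_all_stacks` is a hypothetical helper name, not part of rar2fs or gdb.

```shell
# Hypothetical helper: write a backtrace of every thread of one process to a file.
# Usage: dump_all_stacks <pid> [outfile]
dump_all_stacks() {
    pid=$1
    out=${2:-stacks.txt}
    # Refuse anything that is not a plain numeric process id.
    case "$pid" in
        ''|*[!0-9]*) echo "usage: dump_all_stacks <pid> [outfile]" >&2; return 2 ;;
    esac
    # Attach, run bt in every thread, then detach and exit.
    gdb -p "$pid" -batch -ex "thread apply all bt" > "$out" 2>&1
}
```

The resulting file contains one backtrace per thread, so the stuck thread can be located by searching it afterwards instead of switching threads interactively.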

To rebuild using --enable-debug you need to reconfigure project using the ./configure script and then do make, a new binary with all symbols intact will be placed under the src directory.

hasse69 commented 2 years ago

Also, you might have attached to the wrong pid. A process that spawns additional threads and/or child processes will all show the same parent pid but different child pids. To check this use

top -bHc -n 1

The PID that is now presented in the leftmost column is what you should attach to and strace.
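As an alternative to reading the `top` output by eye, the per-thread view can be sorted by CPU usage. The sketch below assumes the procps version of `ps` (for `-L` and the `tid`, `pcpu` and `comm` columns); `rank_threads` and `hot_threads` are hypothetical helper names. Note that `pcpu` in `ps` is averaged over the lifetime of the thread, so a thread that only recently started spinning may rank lower than it does in `top`.

```shell
# Hypothetical helper: read "TID %CPU COMMAND" lines on stdin and
# print the N busiest threads first (default 5).
rank_threads() {
    LC_ALL=C sort -k2,2 -nr | head -n "${1:-5}"
}

# Hypothetical helper: list the busiest threads of one process.
# Usage: hot_threads <pid> [count]
hot_threads() {
    LC_ALL=C ps -L -o tid=,pcpu=,comm= -p "$1" | rank_threads "${2:-5}"
}
```

The first column of the output is the thread id, which is the kind of id the per-thread view of `top` shows in its leftmost column.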

philnse commented 2 years ago

I hope I recompiled correctly with ./configure --enable-debug && make && make install

I killed all processes and remounted with the recompiled version. With your notes about gdb I'm hopefully able to provide some useful information next time!

hasse69 commented 2 years ago

Well the problem is that make install will strip the binary.

philnse commented 2 years ago

That's what I found on Stack Overflow. Well, as I said before, consider me a newbie.

hasse69 commented 2 years ago

No problem, just skip make install and copy the binary from src directory to where ever you want it.
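The copy step can be guarded so that a stripped binary is not installed by accident. This is a sketch under assumptions: `has_symbols` and `install_debug_build` are hypothetical helper names, the default destination is only an example, and the check relies on `file` reporting "not stripped" for ELF binaries that still carry their symbol table.

```shell
# Hypothetical helper: succeed if file(1) reports the binary as "not stripped".
has_symbols() {
    case "$(file -b "$1")" in
        *"not stripped"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: copy the freshly built binary only if it still has symbols.
# Usage: install_debug_build [built-binary] [destination]
install_debug_build() {
    src=${1:-src/rar2fs}
    dest=${2:-/usr/local/bin/rar2fs}
    if ! has_symbols "$src"; then
        echo "$src looks stripped; rebuild with --enable-debug" >&2
        return 1
    fi
    cp "$src" "$dest"
}
```

After `./configure --enable-debug && make`, calling `install_debug_build` from the source directory would take the place of `make install`.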

philnse commented 2 years ago

So I recompiled with ./configure --enable-debug && make and copied the freshly compiled binary from src to where the old one was (in my case /usr/local/bin). I'll get back to you as soon as I have some information!

hasse69 commented 2 years ago

Btw, you mentioned issue #11 and I can see that issue #64 has also been discussing this. In both cases the removal of the specific support for RAR in RAR resolved the issue as far as I can tell. So my guess is that this is something that has not been observed or reported before.

philnse commented 2 years ago

So the issues addressed in #11 and #64 are fixed in newer versions, as they are from 2015 and 2017? I couldn't figure out from those posts what might be wrong with my RAR.

Haven't had a lockup since my last post. When I replaced the binary five days ago I also noticed another old instance of rar2fs in /bin/ which I deleted. Could that maybe have been a problem, too?

hasse69 commented 2 years ago

I am not saying it is the same issue, but it might be relevant for further troubleshooting. The issues listed all had the same common problem with RAR files inside another RAR file. That support was removed long ago and stacking of mounts is replacing that functionality. Whether your old binary in /bin would cause a problem or not is hard to tell, but it should only matter if you do not call rar2fs by its explicit location (e.g. /usr/bin/rar2fs), rely completely on your PATH configuration, and /bin sits before your other bin folder(s).
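Which copy of rar2fs a bare `rar2fs` command resolves to can be checked by listing every match on the search path in lookup order. The sketch below uses a hypothetical helper name, `list_on_path`; in bash, the builtin `type -a rar2fs` gives similar information.

```shell
# Hypothetical helper: print every executable called <name> on a search path,
# in lookup order. The first line is the one a bare command name would run.
# Usage: list_on_path <name> [search-path]   (defaults to $PATH)
list_on_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in ${2:-$PATH}; do
        # An empty PATH element means the current directory.
        [ -n "$dir" ] || dir=.
        if [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
    IFS=$old_ifs
    return 0
}
```

If `list_on_path rar2fs` prints more than one line, an older copy earlier in the list would shadow a newer one whenever rar2fs is started without its full path.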

hasse69 commented 2 years ago

If you would like to push it I would try a complete PMS re-indexing. That would put rar2fs and your system under a major stress test.

philnse commented 2 years ago

I did that by updating the metadata for three libraries, analyzing them, re-adding one test library and creating all thumbnails. So far all corresponding processes work as they should and never exceed 0.5% CPU load. They have their subprocesses, but I figure that is normal with a workload like that?

hasse69 commented 2 years ago

Yes I would say it is normal and expected behavior. Let's not close this issue yet and allow it to run a few days more. I am a bit surprised though that it would suddenly just work? It is the same version for which you saw the issue, right?

philnse commented 2 years ago

So it seems to be the rar within rar issue. I have now found one process which seems to be locked up. The directory has another rar in it. So I guess you were right! I don't know why it has taken so long...

#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f17ecd4e0f4 in __GI___pthread_mutex_lock (mutex=0x7f17edb72990 <_rtld_global+2352>) at ../nptl/pthread_mutex_lock.c:115
#2  0x00007f17ecab8cfb in __GI___dl_iterate_phdr (callback=0x7f17ecf747f0, data=0x7f17eb070f10) at dl-iteratephdr.c:40
#3  0x00007f17ecf75aa1 in _Unwind_Find_FDE () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#4  0x00007f17ecf72183 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#5  0x00007f17ecf73360 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#6  0x00007f17ecf7385e in _Unwind_RaiseException () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#7  0x00007f17ed415d47 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#8  0x000055e3287f5205 in ErrorHandler::Throw(RAR_EXIT) ()
#9  0x000055e3287f4ac1 in ComprDataIO::UnpWrite(unsigned char*, unsigned long) ()
#10 0x000055e3287ffa44 in Unpack::UnpWriteData(unsigned char*, unsigned long) ()
#11 0x000055e32880012c in Unpack::UnpWriteBuf30() ()
#12 0x000055e3287fb0cf in CmdExtract::ExtractCurrentFile(Archive&, unsigned long, bool&) ()
#13 0x000055e32880eb3c in ProcessFile(void*, int, char*, char*, wchar_t*, wchar_t*) ()
#14 0x000055e3287e02f8 in extract_rar (
    arch=0x7f17e00b3c40 "..."..., file=0x7f17e00b37d0 "...", arg=0x7) at rar2fs.c:1986
#15 0x000055e3287dd75d in popen_ (entry_p=0x7f17e00b3b40, cpid=0x7f17eb07dabc) at rar2fs.c:748
#16 0x000055e3287e5a29 in rar2_open (
    path=0x7f17e0139190 "...", fi=0x7f17eb07dce0) at rar2fs.c:3875
#17 0x00007f17ed717a40 in fuse_fs_open () from /lib/x86_64-linux-gnu/libfuse.so.2
#18 0x00007f17ed717b22 in ?? () from /lib/x86_64-linux-gnu/libfuse.so.2
#19 0x00007f17ed721f9c in ?? () from /lib/x86_64-linux-gnu/libfuse.so.2
#20 0x00007f17ed7216c1 in ?? () from /lib/x86_64-linux-gnu/libfuse.so.2
#21 0x00007f17ed71de68 in ?? () from /lib/x86_64-linux-gnu/libfuse.so.2
#22 0x00007f17ecd4b6db in start_thread (arg=0x7f17eb07e700) at pthread_create.c:463
#23 0x00007f17eca7471f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

hasse69 commented 2 years ago

What version of libunrar do you use? It looks like an exception is thrown but not picked up, and then it hangs.

Also, do you know which archive this is? In that case, can you provide me a link to where I might be able to download it? You can send me the link in private if you wish.

So you use stacked mounts to reach the second level of RAR contents, right?

hasse69 commented 2 years ago

Also, can you elaborate on what you mean by the directory containing another RAR? That should not be an issue. What I am referring to is a RAR archive that contains another RAR archive.

hasse69 commented 2 years ago

I found this very interesting article

https://www.arangodb.com/2019/09/when-exceptions-collide/

I can not say for sure it is related but it could be. Do we know anything more about your system? Is it using a fairly recent version of the gcc runtime? Since you seem to be rather successful in reproducing this, can we first check if this is even remotely possible to be a similar issue by doing

ldd <your rar2fs binary>

And post the output here.

hasse69 commented 2 years ago

Also, what version were you running when you dumped the stack? It is not the latest at least, because the line numbers in rar2fs.c do not match at all. Did you really run what was compiled, with the binary copied from src/rar2fs?

philnse commented 2 years ago

The version of libunrar.so in /usr/lib should be based on unrarsrc-5.5.6, which I compiled with rar2fs 1.28. I don't know how to get the version number of libunrar after compiling. That's what is currently running, I guess. The version of gcc is 4:7.4.0-1ubuntu2.3. About the rar within rar: never mind, that's not the case, I got that wrong!

Here's the output of ldd rar2fs:

        linux-vdso.so.1 (0x00007ffcf778e000)
        libfuse.so.2 => /lib/x86_64-linux-gnu/libfuse.so.2 (0x00007f5f089a2000)
        libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f5f08619000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f5f08411000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f5f081f9000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f5f07fda000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f5f07be9000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f5f079e5000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f5f07647000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f5f08e49000)

Might be off-topic: after a reboot I constantly have to reinstall libcurl4 and curl for some instances to run properly. I can't reproduce this, but it might be a sign that there is some major configuration issue on my side...

hasse69 commented 2 years ago

Show the output from rar2fs -V. That will give you all versions available.

But what version of libunrar you have in /usr/lib is irrelevant since, as you can see from the ldd output, it is linked statically with rar2fs. So what version of the unrar source did you use when compiling? Also try to use the latest version of rar2fs on master, since using old versions makes troubleshooting harder. So go with v1.29.5 if you can.

Let's see if you can reproduce again. Looking at the article I linked to would indicate that using dynamic linking of libunrar might circumvent the issue if we are in fact dealing with the same gcc runtime bug here.

philnse commented 2 years ago

rar2fs v1.28.0 (DLL version 8)    Copyright (C) 2009 Hans Beckerus
This program comes with ABSOLUTELY NO WARRANTY.
This is free software, and you are welcome to redistribute it under
certain conditions; see <http://www.gnu.org/licenses/> for details.
FUSE library version: 2.9.7
fusermount version: 2.9.7
using FUSE kernel interface version 7.19

So it's version 1.28.0, which I compiled with unrarsrc-5.5.6. I'll switch back to the newest version of rar2fs and compile it with unrarsrc 6.0.7 and --enable-debug, unless you consider another version a better choice.

hasse69 commented 2 years ago

What we wish to accomplish now is a new hanging process and then we can recompile using a shared library instead to see if that makes a difference. Since the gcc version you are using is rather old it is at least a plausible scenario.

philnse commented 2 years ago

So this time it appeared rather quickly:

(gdb) bt
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f5d7987e0f4 in __GI___pthread_mutex_lock (
    mutex=0x7f5d7a6a2990 <_rtld_global+2352>)
    at ../nptl/pthread_mutex_lock.c:115
#2  0x00007f5d795e8cfb in __GI___dl_iterate_phdr (callback=0x7f5d79aa47f0,
    data=0x7f5d727ef0d0) at dl-iteratephdr.c:40
#3  0x00007f5d79aa5aa1 in _Unwind_Find_FDE ()
   from /lib/x86_64-linux-gnu/libgcc_s.so.1
#4  0x00007f5d79aa2183 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#5  0x00007f5d79aa3360 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#6  0x00007f5d79aa385e in _Unwind_RaiseException ()
   from /lib/x86_64-linux-gnu/libgcc_s.so.1
#7  0x00007f5d79d3dd47 in __cxa_throw ()
   from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#8  0x00005608ee7606a5 in ErrorHandler::Throw(RAR_EXIT) ()
#9  0x00005608ee760001 in ComprDataIO::UnpWrite(unsigned char*, unsigned long)
    ()
#10 0x00005608ee76b624 in Unpack::UnpWriteData(unsigned char*, unsigned long)
    ()
#11 0x00005608ee76bcd8 in Unpack::UnpWriteBuf30() ()
#12 0x00005608ee7667ad in CmdExtract::ExtractCurrentFile(Archive&, unsigned long, bool&) ()
#13 0x00005608ee77abdc in ProcessFile(void*, int, char*, char*, wchar_
---Type <return> to continue, or q <return> to quit---

I guess it's a problem with gcc? Should I try to use a newer one?

Edit: I installed gcc version 9.4.0 now. I'll try compiling with that version if you don't have any other suggestions.

Edit 2: I just recompiled with gcc version 9.4.0 and the file size of the compiled rar2fs in src was different. Stressing again overnight and maybe (hopefully not) we can continue tomorrow!

hasse69 commented 2 years ago

Good, so we can confirm the signature of the issue is the same. So rebuild again using --disable-static-unrar

And now you need to make sure that what you build against in your unrar source is the same as installed in e.g. /usr/lib. Use ldd to check which libunrar.so path is being used.
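The ldd check can be reduced to a single answer by filtering for the libunrar line. A minimal sketch, with `unrar_linkage` as a hypothetical helper name; it only inspects the text that `ldd` prints and assumes the usual `name => path (address)` layout.

```shell
# Hypothetical helper: read ldd output on stdin and report how libunrar is linked.
# Usage: ldd src/rar2fs | unrar_linkage
unrar_linkage() {
    # Take the resolved path from the first "libunrar.so... => <path>" line, if any.
    path=$(awk '$1 ~ /^libunrar\.so/ && $2 == "=>" { print ($3 == "not" ? "not found" : $3); exit }')
    if [ -n "$path" ]; then
        printf 'shared libunrar: %s\n' "$path"
    else
        printf 'no shared libunrar in ldd output\n'
    fi
}
```

No libunrar line at all is what a statically linked build would show, while a line such as `shared libunrar: /usr/lib/libunrar.so` names the library file that has to match the unrar source used for the build.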

philnse commented 2 years ago

Here you go:

        linux-vdso.so.1 (0x00007fffc9535000)
        libfuse.so.2 => /lib/x86_64-linux-gnu/libfuse.so.2 (0x00007fe1c03a7000)
        libunrar.so => /usr/lib/libunrar.so (0x00007fe1c014e000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fe1bff46000)
        libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fe1bfb39000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fe1bf921000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fe1bf702000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe1bf311000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fe1bf10d000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fe1bed6f000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fe1c0803000)

The lockup was massive, by the way. I had several sub-processes, as seen in the first post.

hasse69 commented 2 years ago

Ok so you could recreate this after building with statically linked libunrar?

philnse commented 2 years ago

So far everything is working fine with the statically linked libunrar. The mounts are currently stressed the usual way. Could this be a fix already? Would this mean there's some kind of misconfiguration in the system?

hasse69 commented 2 years ago

Now I am a bit confused. Last build I asked for you to do was with disabling static build.

philnse commented 2 years ago

I compiled 1.29.5 with unrar-src-6.0.7 using ./configure --enable-debug --disable-static-unrar && make and copied it from src to /usr/local/bin, where it has been running for two hours now. I posted the ldd output for that session. Sorry for the confusion.

philnse commented 2 years ago

I just checked and can provide this from a locked process:

(gdb) bt
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f09ca2010f4 in __GI___pthread_mutex_lock (mutex=0x7f09cb302990 <_rtld_global+2352>) at ../nptl/pthread_mutex_lock.c:115
#2  0x00007f09c9f6bcfb in __GI___dl_iterate_phdr (callback=0x7f09ca428160, data=0x7f09c37f10f0) at dl-iteratephdr.c:40
#3  0x00007f09ca429181 in _Unwind_Find_FDE () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#4  0x00007f09ca4253a8 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#5  0x00007f09ca4265f0 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#6  0x00007f09ca426b45 in _Unwind_RaiseException () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#7  0x00007f09ca6d67e8 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#8  0x00007f09cac61cd5 in ErrorHandler::Throw(RAR_EXIT) () from /usr/lib/libunrar.so
#9  0x00007f09cac61631 in ComprDataIO::UnpWrite(unsigned char*, unsigned long) () from /usr/lib/libunrar.so
#10 0x00007f09cac70194 in Unpack::UnpWriteData(unsigned char*, unsigned long) () from /usr/lib/libunrar.so
#11 0x00007f09cac70848 in Unpack::UnpWriteBuf30() () from /usr/lib/libunrar.so
#12 0x00007f09cac6a81d in CmdExtract::ExtractCurrentFile(Archive&, unsigned long, bool&) () from /usr/lib/libunrar.so
#13 0x00007f09cac7ff4c in ProcessFile(void*, int, char*, char*, wchar_t*, wchar_t*) () from /usr/lib/libunrar.so
#14 0x00005575748e49a0 in extract_rar (
    arch=0x7f09bc099990 "/home/[...]/Subs/..."..., file=0x7f09bc09ebd0 "[...].idx", arg=0x8)
    at rar2fs.c:2075
#15 0x00005575748e1aef in popen_ (entry_p=0x7f09bc09cc00, cpid=0x7f09c37fdac4) at rar2fs.c:728
#16 0x00005575748ea403 in rar2_open (
    path=0x7f0998001930 "[...]"..., fi=0x7f09c37fdce0) at rar2fs.c:4030
#17 0x00007f09caea7a40 in fuse_fs_open () from /lib/x86_64-linux-gnu/libfuse.so.2
#18 0x00007f09caea7b22 in ?? () from /lib/x86_64-linux-gnu/libfuse.so.2
#19 0x00007f09caeb1f9c in ?? () from /lib/x86_64-linux-gnu/libfuse.so.2
#20 0x00007f09caeb16c1 in ?? () from /lib/x86_64-linux-gnu/libfuse.so.2
#21 0x00007f09caeade68 in ?? () from /lib/x86_64-linux-gnu/libfuse.so.2
#22 0x00007f09ca1fe6db in start_thread (arg=0x7f09c37fe700) at pthread_create.c:463
#23 0x00007f09c9f2771f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

hasse69 commented 2 years ago

Ok it was a long shot. I would still think this is an issue with exception handling since it seems to get stuck in the gcc runtime the same way as the article I linked to describes.

philnse commented 2 years ago

Hmm, so nothing I can handle or fix as a layman I guess. Is there a way of using a binary instead of compiling myself, if that is the problem?

hasse69 commented 2 years ago

It is not about how you compile. This is a bit problematic since I really doubt this is caused by rar2fs directly, possibly indirectly since it uses the C++ unrar library. So currently I have no clue really.

andree392 commented 2 years ago

I can confirm this happens to me as well. Running 1.29.5; tried reverting to 1.29.4, same issue.

It seems specific to subpacks. All my stuck rar2fs instances involve subpacks.

Strange, I haven't had this issue before and some of these files are quite old. At least it hasn't been noticeable before.

Running Debian 10.10 (4.19.0-17-amd64)

ldd /usr/local/bin/rar2fs
        linux-vdso.so.1 (0x00007ffd4a39a000)
        libfuse.so.2 => /usr/local/lib/libfuse.so.2 (0x00007fbcf13a2000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fbcf139d000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fbcf1393000)
        libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fbcf120f000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fbcf108c000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fbcf1072000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fbcf104f000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fbcf0e8e000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fbcf1655000)

and

rar2fs v1.29.4 (DLL version 8)    Copyright (C) 2009 Hans Beckerus
This program comes with ABSOLUTELY NO WARRANTY.
This is free software, and you are welcome to redistribute it under
certain conditions; see http://www.gnu.org/licenses/ for details.
FUSE library version: 2.9.3
fusermount version: 2.9.3
using FUSE kernel interface version 7.19

output from gdb

__lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:103
103     ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: No such file or directory.
(gdb) bt
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:103
#1  0x00007f22b806f7d1 in __GI___pthread_mutex_lock (mutex=0x7f22b862b990 <_rtld_global+2352>) at ../nptl/pthread_mutex_lock.c:115
#2  0x00007f22b7fd7a7f in __GI___dl_iterate_phdr (callback=0x7f22b809a0b0, data=0x7f2283ff20e0) at dl-iteratephdr.c:40
#3  0x00007f22b809b361 in _Unwind_Find_FDE () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#4  0x00007f22b8097a43 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#5  0x00007f22b8098c20 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#6  0x00007f22b809911e in _Unwind_RaiseException () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#7  0x00007f22b82b7b27 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#8  0x0000560d33ccb714 in ErrorHandler::Throw(RAR_EXIT) ()
#9  0x0000560d33ccb0f1 in ComprDataIO::UnpWrite(unsigned char*, unsigned long) ()
#10 0x0000560d33cd5fb4 in Unpack::UnpWriteData(unsigned char*, unsigned long) ()
#11 0x0000560d33cd6668 in Unpack::UnpWriteBuf30() ()
#12 0x0000560d33cd126d in CmdExtract::ExtractCurrentFile(Archive&, unsigned long, bool&) ()
#13 0x0000560d33ce4c39 in ProcessFile(void*, int, char*, char*, wchar_t*, wchar_t*) ()
#14 0x0000560d33cb7cf9 in extract_rar (arch=, file=0x7f22ac4068b0 "00107.track_4608_exp.sub", arg=) at rar2fs.c:2075
#15 0x0000560d33cba8f2 in popen_ (entry_p=0x7f22ac4067b0, entry_p=0x7f22ac4067b0, cpid=) at rar2fs.c:728
#16 rar2_open (
    path=0x7f2278071190 "/path/00107.track_4608_exp.sub", fi=) at rar2fs.c:4025
#17 0x00007f22b83c7c61 in fuse_compat_open (fi=0x7f2283ffed00,
    path=0x7f2278071190 "/path/00107.track_4608_exp.sub", fs=0x560d344c8990) at fuse.c:1473
#18 fuse_fs_open (fs=0x560d344c8990,
    path=0x7f2278071190 "/path/00107.track_4608_exp.sub", fi=fi@entry=0x7f2283ffed00) at fuse.c:1738
#19 0x00007f22b83c895c in fuse_lib_open (req=0x7f227815c450, ino=187343, fi=0x7f2283ffed00) at fuse.c:3212
#20 0x00007f22b83cfa59 in do_open (req=, nodeid=, inarg=) at fuse_lowlevel.c:1213
#21 0x00007f22b83cef16 in fuse_ll_process_buf (data=0x560d344c8b50, buf=0x7f2283ffeeb0, ch=) at fuse_lowlevel.c:2441
#22 0x00007f22b83cbc5b in fuse_do_work (data=0x7f227c402240) at fuse_loop_mt.c:117
#23 0x00007f22b806cfa3 in start_thread (arg=) at pthread_create.c:486
#24 0x00007f22b7f9d4cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

poltergaiist commented 2 years ago

Hi, I can verify that I just tried deleting Sub-folders now when my rar2fs was stuck at 100% as well and it was stuck reading a sub-rar-file for a movie I haven't even opened. Hence I couldn't remove the file until I forcefully shut down rar2fs/ubuntu.