Closed hrxcodes closed 3 years ago
Does not really look good, does it? ☹️ I thought I caught all these bugs already but apparently not. Any idea how to recreate this crash or is it simply random?
The good news is that it seems to be isolated to the __resolve_dir() function, which is easier to track down than some asynchronous event, and it seems to be a call to free() that is bad. There are a few of those in this function, and the one I suspect the most is in an error case/branch in the code. If this were a bug in the "normal" flow I think we would have caught it already.
When you try to reproduce, can you please run rar2fs in the foreground using the -f flag? That way we can catch any output from rar2fs to stdout/stderr, which might give us a clue as to which path is not handled properly.
Or even better, run it through gdb, if you know how to do that. If you do, please run the binary you compiled and not the installed one, since the latter is stripped of symbols. Note that the -f flag is needed in this case too.
Yeah, it showed up from nowhere and happens quite regularly. So while I don't have instructions on how to reproduce, it shows up after a few hours or days (it can happen multiple times a day, or work for a few days).
I will let you know as soon as I have something from gdb.
When it stops, do you want just a bt of the current thread or also things like info locals and info threads?
Notes to self (and possibly others in the future):
sudo gdb ./rar2fs
handle SIG32 pass nostop noprint
r <rar2fs-arguments>
A stack trace and possibly a dump of some of the symbols/variables in the context of the crash would be enough for now. The value of the pointer being freed would be good. I suspect it to be a rogue pointer not initialized properly. Otherwise I would have expected some double free or corruption error. Calling free() with NULL is always ok.
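To illustrate the kind of bug suspected here, below is a minimal sketch of a cleanup path that stays safe because the pointer starts out as NULL. This is not the actual __resolve_dir() code: the function dir_part() and its behavior are hypothetical and made up for the example. The point is only that an error branch taken before any allocation still reaches free(), which is fine for NULL but undefined behavior for an uninitialized pointer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, NOT taken from rar2fs: copies the directory part of
 * `path` (everything before the last '/') into a newly allocated string
 * stored in *out. Returns 0 on success and -1 on error.
 */
int dir_part(const char *path, char **out)
{
    char *tmp = NULL;          /* initialized, so the cleanup below is always safe */
    const char *slash = NULL;
    size_t len = 0;
    int rc = -1;

    assert(out != NULL);
    *out = NULL;

    if (path == NULL || *path == '\0')
        goto done;             /* error branch taken before any allocation */

    slash = strrchr(path, '/');
    if (slash == NULL)
        goto done;             /* another early error branch */

    len = (size_t)(slash - path);
    tmp = malloc(len + 1);
    if (tmp == NULL)
        goto done;

    memcpy(tmp, path, len);
    tmp[len] = '\0';

    *out = tmp;                /* ownership moves to the caller */
    tmp = NULL;                /* so the cleanup below does not free it */
    rc = 0;

done:
    free(tmp);                 /* free(NULL) is a no-op; a rogue pointer here would crash */
    return rc;
}
```

If tmp were declared as plain `char *tmp;` the two early `goto done` branches would pass whatever happened to be on the stack to free(), which only blows up occasionally and would match a crash that appears at random.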
In case you can reproduce, also try this attached patch on top of master/HEAD. I have my suspicions that patch might solve this issue.
So far it has been running smoothly, but I suspect not for much longer. It's always like this: when you want it to happen, it doesn't. I'll let you know once it crashes, and then update to the patch you supplied and run it in gdb again in case that does not solve the issue.
No news here? Still no crash to work with?
Annoyingly, it's suddenly working as expected, with no crashes so far since I started it. I'm keeping an eye on it every day but it's solid so far, even without the patch. I suggest we keep this open another week, and if it does not happen we can close it. The longest I had it running before was 12 days (it has been 11 days since I started it in gdb).
It's just annoying that it goes from having issues multiple times a day/week -> create issue -> can't reproduce 😞 I will keep you updated as soon as something happens on my end.
Let's just keep it open then.
When I studied the actual crash signature, there were not many places where it could blow up like this. I think the patch provided would cover that. It would be a rather rare situation for this to happen, so you must have been very unlucky, or lucky, to spot it. Nevertheless, the problem is/was there. We just need to wait for it.
Still working perfectly fine curiously enough. I'll keep it running in gdb until it either crashes on me or I need to reboot the box.
I will get back when/if something bad happens, you may close the issue in the meanwhile if you want. Thank you for your work on rar2fs, it's great! 🍻
Despite the fact you have not been able to reproduce the issue I have chosen to merge the patch since it does no harm really. Let's close this issue but feel free to re-open when/if needed.
Issue
rar2fs stops randomly, and I've managed to get a core dump of one of these failures. rar2fs had been working fine for the last 12+ months before this issue started happening, unrelated to any updates as far as I could tell.
Does the core dump tell you anything interesting? Memory at the time was as follows:
Since I'm seeing malloc in the core dump I started to suspect it's memory related, but there is plenty of cache to take from. I'm kind of stuck so any ideas are welcome 😃 Thank you for your work on rar2fs!
Versions
ldd
Logs
dmesg
journal
coredump