pipelka / vdr-plugin-xvdr

DISCONTINUED - XVDR Plugin for VDR
GNU General Public License v2.0
43 stars 22 forks source link

segfault with latest VDR 2.0.3 #111

Open piotrasd opened 10 years ago

piotrasd commented 10 years ago

Hi 2 days ago im update VDR to latest 2.0.3, dvbapi, and XVDR to latest

and i get 2 times on this 2 days crash :(

VDR XVDR Server[14828]: segfault at 1f85 ip 0817ec60 sp 458231e8 error 4 in vdr[8048000+19e000]

(before on 2.1.1 and 2.0.1 could work weeks, months without any issue)

Latest log wich im found before crach from is info about EPG

Sep 30 13:09:06 unRAID vdr: [24356] epg data writer thread started (pid=14728, tid=24356, prio=low) Sep 30 13:09:06 unRAID vdr: [24356] epg data writer thread ended (pid=14728, tid=24356) Sep 30 13:19:07 unRAID vdr: [26052] epg data writer thread started (pid=14728, tid=26052, prio=low) Sep 30 13:19:08 unRAID vdr: [26052] epg data writer thread ended (pid=14728, tid=26052)

piotrasd commented 10 years ago

Hi @pipelka Im now testing VDR 2.1.3 and get segfault from XVDR but this time i maked debug

0x401b1752 in std::_Rb_tree_increment(std::_Rb_tree_nodebase) () from /usr/lib/libstdc++.so.6 (gdb) bt

0 0x401b1752 in std::_Rb_tree_increment(std::_Rb_tree_nodebase) () from /usr/lib/libstdc++.so.6

1 0x401b17ab in std::_Rb_tree_increment(std::_Rb_tree_nodebase const) () from /usr/lib/libstdc++.so.6

2 0x405c3757 in operator<<(std::basic_fstream<char, std::char_traits >&, cChannelCache const&) ()

from /usr/lib/vdr/libvdr-xvdr.so.2.1.3

3 0x405c61b8 in cChannelCache::SaveChannelCacheData() () from /usr/lib/vdr/libvdr-xvdr.so.2.1.3

4 0x405df2ea in cXVDRServer::Action() () from /usr/lib/vdr/libvdr-xvdr.so.2.1.3

5 0x0817fb18 in cThread::StartThread(cThread_) ()

6 0x40090830 in start_thread () from /lib/libpthread.so.0

7 0x4036415e in clone () from /lib/libc.so.6

Any idea what happens ?

pipelka commented 10 years ago

Did you do a

make clean

before building the addon ?

A build with debugging information (symbols, full backtrace) would give more information.

I just see that it segfaults at writing the channel cache to disk.

manio commented 10 years ago

I have same problem on VDR 2.1.3. I did bt and bt full:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffd3fff700 (LWP 17556)]
0x00007ffff6c06ab7 in std::_Rb_tree_increment(std::_Rb_tree_node_base*) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
(gdb) bt
#0  0x00007ffff6c06ab7 in std::_Rb_tree_increment(std::_Rb_tree_node_base*) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#1  0x00007ffff4760d2b in operator++ (this=) at /usr/include/c++/4.8/bits/stl_tree.h:197
#2  cChannelCache::SaveChannelCacheData () at src/live/channelcache.c:242
#3  0x00007ffff47751cd in cXVDRServer::Action (this=0x15093d0) at src/xvdr/xvdrserver.c:314
#4  0x000000000050283f in cThread::StartThread (Thread=0x15093d0) at thread.c:262
#5  0x00007ffff798ce0e in start_thread (arg=0x7fffd3fff700) at pthread_create.c:311
#6  0x00007ffff63bb0fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
(gdb) bt full
#0  0x00007ffff6c06ab7 in std::_Rb_tree_increment(std::_Rb_tree_node_base*) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#1  0x00007ffff4760d2b in operator++ (this=) at /usr/include/c++/4.8/bits/stl_tree.h:197
        __tmp = {_M_node = 0x17983c0}
#2  cChannelCache::SaveChannelCacheData () at src/live/channelcache.c:242
        i = {_M_node = 0x17983c0}
        out = 
        filename = {_vptr.cString = 0x7799b0 , s = 0x7fffc4011530 "/var/cache/vdr/plugins/xvdr/channelcache.data.bak"}
        filenamenew = {_vptr.cString = 0x17, s = 0x7ffff6355d92 <__GI___strdup+34> "H\205\300t!H\211\352H\211\336H\213l$\020H\213\\$\bH\211\307H\203\304\030\351m\212"}
#3  0x00007ffff47751cd in cXVDRServer::Action (this=0x15093d0) at src/xvdr/xvdrserver.c:314
        bChanged = 
        r = 
        fd = 
        fds = {fds_bits = {0 }}
        channelReloadTrigger = false
        channelsHash = 29924087152697
        recStateOld = 7
        recordingReloadTimer = {begin = 430675167}
        recordingReloadTrigger = false
        tv = {tv_sec = 0, tv_usec = 0}
        channelReloadTimer = {begin = 430675866}
        channelCacheTimer = {begin = 430735546}
        recState = 7
#4  0x000000000050283f in cThread::StartThread (Thread=0x15093d0) at thread.c:262
No locals.
#5  0x00007ffff798ce0e in start_thread (arg=0x7fffd3fff700) at pthread_create.c:311
        __res = 
        pd = 0x7fffd3fff700
        now = 
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140736750155520, -6671873386401089517, 1, 140737347411168, 4096, 140736750155520, 6671829405356931091, 6671891191652431891}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = 0
        pagesize_m1 = 
        sp = 
        freesize = 
        __PRETTY_FUNCTION__ = "start_thread"
#6  0x00007ffff63bb0fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
No locals.
pipelka commented 10 years ago

Just scratching my head what could cause this, ... Does it happen while watching Live TV ?

manio commented 10 years ago

It happen when switching channels very fast. Maybe please hold on while there will be new developer version of VDR. Me and @piotrasd has VDR patched with yet unofficial patches which will be a part of new VDR. I am not sure - but it may be related. And after new version released it will be for sure, especially if more users will have and report this problem...

manio commented 10 years ago

@pipelka Hello, regarding new VDR version and changes: I've sent you email

pipelka commented 10 years ago

Thanks Mariusz,

i suppose the proposed changes will be in vdr 2.1.4 (?).

I do not fully understand yet how the ChannelChange notification will work (and on which occasions it will be fired).

As soon as these changes have landed in vdr i will adapt the xvdr code.

But it's good to know that there is something coming up ;-)

Thanks, Alex

manio commented 10 years ago

Yes, it should be in 2.1.4.

manio commented 10 years ago

Hello. It's now released: http://www.vdr-portal.de/index.php?page=Thread&postID=1183688

manio commented 10 years ago

@pipelka I created a proof-of-concept code for you: http://skyboo.net/xvdr-channelchange-poc.diff.

pipelka commented 10 years ago

@manio Thanks. In the meantime I created my own version. But I'm currently unable to test it because decryption only happens on my production system (i don't want to update it with a development version). My kids just would freak out if they can't watch their favorite TV shows anymore, ... ;-)

Could you please check if it works for you ? It's 2038cfd1da3130a2396e54669d27864315140e35 in the "channelchange" branch.

manio commented 10 years ago

I leave you inline comments

pipelka commented 10 years ago

Ok. Fully stopping / starting the streamer seems to heavy-weight for me.

Please recheck with the latest commit: fb5aa9460a8bf3588d797db2343c4ddb48f19f01 This should force the transfermode on the device.

manio commented 10 years ago

Yes, now it works as expected. Thank you! ;) And about this issue - I don't have this crash anymore (it was probably because of the patches we have applied, so I think this issue should be closed).

pipelka commented 10 years ago

Good news. Thanks for putting so much effort into this.

piotrasd commented 10 years ago

i get segfault from XVDR :(

Program received signal SIGSEGV, Segmentation fault. 0x081486c8 in cSectionHandler::Detach(cFilter_) () (gdb) bt

0 0x081486c8 in cSectionHandler::Detach(cFilter_) ()

1 0x080ba6ce in cDevice::Detach(cFilter_) ()

2 0x405cb909 in cLiveStreamer::ChannelChange(cChannel const_) () from /usr/lib/vdr/libvdr-xvdr.so.2.1.4

3 0x405d7e6b in cXVDRClient::ChannelChange(cChannel const_) () from /usr/lib/vdr/libvdr-xvdr.so.2.1.4

4 0x08175c54 in cStatus::MsgChannelChange(cChannel const_) ()

5 0x08192079 in main ()

and

[Switching to Thread 0x42affb70 (LWP 6945)] 0x401b1752 in std::_Rb_tree_increment(std::_Rb_tree_nodebase) () from /usr/lib/libstdc++.so.6 (gdb) bt

0 0x401b1752 in std::_Rb_tree_increment(std::_Rb_tree_nodebase) () from /usr/lib/libstdc++.so.6

1 0x401b17ab in std::_Rb_tree_increment(std::_Rb_tree_nodebase const) () from /usr/lib/libstdc++.so.6

2 0x405c3857 in operator<<(std::basic_fstream<char, std::char_traits >&, cChannelCache const&) () from /usr/lib/vdr/libvdr-xvdr.so.2.1.4

3 0x405c62b8 in cChannelCache::SaveChannelCacheData() () from /usr/lib/vdr/libvdr-xvdr.so.2.1.4

4 0x405df5aa in cXVDRServer::Action() () from /usr/lib/vdr/libvdr-xvdr.so.2.1.4

5 0x081800d8 in cThread::StartThread(cThread_) ()

6 0x40090830 in start_thread () from /lib/libpthread.so.0

7 0x4036415e in clone () from /lib/libc.so.6

im used VDR 2.1.4, dvbapi (master from today), XVDR git (channelchange branch)

pipelka commented 10 years ago

Please recheck with the latest "channelchange" branch. I added some more checking and locking.

piotrasd commented 10 years ago

almost 3 days everything was fine, and today i get segfault (sorry but i've run VDR without gdb)

VDR XVDR Server[7659]: segfault at 0 ip 40582596 sp 437231e0 error 4 in libvdr-xvdr.so.2.1.4[40562000+40000]

——

eb 1 18:59:15 unRAID vdr: [24113] buffer usage: 100% (tid=24112) Feb 1 18:59:16 unRAID vdr: [24113] ERROR: driver buffer overflow on device 1 Feb 1 18:59:20 unRAID last message repeated 2 times Feb 1 18:59:22 unRAID vdr: [24112] ERROR: skipped 123 bytes to sync on TS packet on device 1 Feb 1 18:59:23 unRAID vdr: [7661] XVDR: sync found at offset 416 (streamtype: MPEG2AUDIO / 2058 bytes in buffer / framesize: 576 bytes) Feb 1 18:59:24 unRAID vdr: [7661] XVDR: sync found at offset 160 (streamtype: MPEG2AUDIO / 2058 bytes in buffer / framesize: 576 bytes) Feb 1 18:59:24 unRAID vdr: [24113] ERROR: driver buffer overflow on device 1 Feb 1 18:59:26 unRAID vdr: [24113] ERROR: driver buffer overflow on device 1 Feb 1 18:59:26 unRAID vdr: [7661] XVDR: sync found at offset 416 (streamtype: MPEG2AUDIO / 2144 bytes in buffer / framesize: 576 bytes) Feb 1 18:59:28 unRAID vdr: [24113] ERROR: driver buffer overflow on device 1 Feb 1 18:59:28 unRAID vdr: [24113] buffer usage: 60% (tid=24112) Feb 1 18:59:29 unRAID vdr: [24112] ERROR: skipped 123 bytes to sync on TS packet on device 1 Feb 1 18:59:29 unRAID vdr: [24113] buffer usage: 70% (tid=24112) Feb 1 18:59:30 unRAID vdr: [24113] buffer usage: 80% (tid=24112) Feb 1 18:59:30 unRAID vdr: [7661] XVDR: sync found at offset 368 (streamtype: MPEG2AUDIO / 2194 bytes in buffer / framesize: 576 bytes) Feb 1 18:59:30 unRAID vdr: [24113] buffer usage: 90% (tid=24112) Feb 1 18:59:31 unRAID vdr: [24113] buffer usage: 100% (tid=24112) Feb 1 18:59:31 unRAID vdr: [7661] XVDR: sync found at offset 416 (streamtype: MPEG2AUDIO / 2058 bytes in buffer / framesize: 576 bytes) Feb 1 18:59:32 unRAID vdr: [24113] ERROR: driver buffer overflow on device 1 Feb 1 18:59:33 unRAID vdr: [7661] XVDR: sync found at offset 296 (streamtype: MPEG2AUDIO / 2194 bytes in buffer / framesize: 576 bytes) Feb 1 18:59:34 unRAID vdr: [24113] buffer usage: 60% (tid=24112) Feb 1 18:59:35 unRAID vdr: [24112] ERROR: skipped 123 bytes to sync on TS packet on device 1 Feb 1 18:59:35 unRAID vdr: [7661] XVDR: sync found at offset 480 (streamtype: MPEG2AUDIO / 2050 bytes in buffer / framesize: 576 bytes) Feb 1 18:59:35 unRAID vdr: [7661] cLiveStreamer stream processor thread ended (pid=23526, tid=7661) Feb 1 18:59:35 unRAID vdr: [7659] DVBAPI: 0.0 set CAM decrypt (SID 3107, caLm 5, HasCaDescriptors 0) Feb 1 18:59:35 unRAID vdr: [7660] XVDR: LiveQueue stopped Feb 1 18:59:35 unRAID vdr: [7659] buffer stats: 461540 (4%) used Feb 1 18:59:35 unRAID vdr: [7659] XVDR: -------------------------------------- Feb 1 18:59:35 unRAID vdr: [7659] XVDR: Channel streaming request: 14 - Polsat2 Feb 1 18:59:35 unRAID vdr: [7659] XVDR-Error: Can't get device for channel 14 - Polsat2 Feb 1 18:59:36 unRAID kernel: VDR XVDR Server[7659]: segfault at 0 ip 40582596 sp 437231e0 error 4 in libvdr-xvdr.so.2.1.4[40562000+40000]

im must tell, i have today very bad wather, so i have signal issue

UPDATE: im reproduce this segfault 4 times

(gdb) bt

0 0x405cb596 in cLiveStreamer::RequestSignalInfo() () from /usr/lib/vdr/libvdr-xvdr.so.2.1.4

1 0x405d74e4 in cXVDRClient::processChannelStream_Signal() () from /usr/lib/vdr/libvdr-xvdr.so.2.1.4

2 0x405dbdc0 in cXVDRClient::processRequest() () from /usr/lib/vdr/libvdr-xvdr.so.2.1.4

3 0x405dbfda in cXVDRClient::Action() () from /usr/lib/vdr/libvdr-xvdr.so.2.1.4

4 0x081800d8 in cThread::StartThread(cThread*) ()

5 0x40090830 in start_thread () from /lib/libpthread.so.0

6 0x4036415e in clone () from /lib/libc.so.6

this is what i do.

Im use now only 1 tuner im recorded channel from TP - X, and start watching another channel from TP - X when recording and live channel from TP - X running - them i try switch to channels from TP - Y after few try back to channel from TP - X and switch to TP - Y i get segfault

pipelka commented 10 years ago

Ok. Thanks. I'm currently running an automated long-term test. I already have a couple of ideas how this can happen.

Give me a couple of days, ...

pipelka commented 10 years ago

Please check if 173dd950abd8487d0348182b2394f35ced746a7d fixes your problem.

piotrasd commented 10 years ago

im testing build from your latest commit from branch Attach

eveything looks very promising :)

piotrasd commented 10 years ago

still everything is fine, example i can record something and watch another channels from this same TP, only XBMC dont show me message "all devices busy" so i can switch to channels from other TP in XBMC but i have black picture, in VDR log i get

Feb 9 22:25:12 unRAID vdr: [8072] XVDR: Channel streaming request: 67 - AXN Feb 9 22:25:12 unRAID vdr: [8072] XVDR-Error: Can't get device for channel 67 - AXN

(anyway no deadlock and segfaults)

pipelka commented 10 years ago

That's part of the deal ;-) If you stay at that channel and the recording stops you will get a picture.

XBMC notification aren't implemented, yet.

pipelka commented 10 years ago

If @manio agrees I will squash the changes from the "attach" branch, remove the log-spam and merge it into the "channelchange" branch. After that all testing should be done from the "channelchange" branch.

manio commented 10 years ago

Good idea, please also totally revert this: e4c2d7b33069a7a I believe that the problem was solved in earlier commit. If I am wrong, we at leat know the missing commit :) btw. any plans to merge it all to master?

pipelka commented 10 years ago

Yes all the log stuff will be ripped. I already have this version running on my production system. The final step will be to pull it into "master".

pipelka commented 10 years ago

Alright. I squashed it together and added status notifcation messages. It's now in the "channelchange" branch. I had to overwrite the branch so please reset the branch or simply re-clone.

piotrasd commented 10 years ago

hi Im use this branch channelchange after reclone and i get today some new segfault

epg data writer[15168]: segfault at 9ef458 ip 080e01ce sp 457a6180 error 4 in vdr[8048000+1a9000]

I will be now try reproduce with gdb

fantomkas commented 10 years ago

I have this trouble too

[ 3519.512507] VDR XVDR Server[5287]: segfault at ff10 ip 00007fe7c54dad33 sp 00007fe7897f9a78 error 4 in libstdc++.so.6.0.17[7fe7c5466000+e5000] [ 3522.513193] init: vdr main process (5212) killed by SEGV signal [ 3522.513228] init: vdr main process ended, respawning [ 3522.524395] init: oscam main process (5228) terminated with status 15 [ 3703.789955] traps: VDR XVDR Server[5906] general protection ip:7ff5bc975d42 sp:7ff5797f9a78 error:0 in libstdc++.so.6.0.17[7ff5bc901000+e5000] [ 3705.804680] init: vdr main process (5831) killed by SEGV signal [ 3705.804717] init: vdr main process ended, respawning [ 3705.815484] init: oscam main process (5847) terminated with status 15 [ 5390.339124] VDR XVDR Server[6211]: segfault at 110 ip 00007fc109dacd33 sp 00007fc0ce7fba78 error 4 in libstdc++.so.6.0.17[7fc109d38000+e5000] [ 5393.340257] init: vdr main process (6136) killed by SEGV signal [ 5393.340295] init: vdr main process ended, respawning [ 5393.352557] init: oscam main process (6152) terminated with status 15 [ 5442.656616] VDR XVDR Server[7558]: segfault at ff20 ip 00007f1f985ed113 sp 00007f1f4dffab50 error 4 in libvdr-xvdr.so.2.1.4[7f1f985cc000+40000] [ 5442.833192] init: vdr main process (7470) killed by SEGV signal [ 5442.833213] init: vdr main process ended, respawning [ 5443.817393] init: oscam main process (7486) terminated with status 15 [ 6285.422873] VDR XVDR Server[7685]: segfault at 100ff8150f8 ip 00007f43852f1d42 sp 00007f43457f9a78 error 4 in libstdc++.so.6.0.17[7f438527d000+e5000] [ 6287.434902] init: vdr main process (7623) killed by SEGV signal [ 6287.434940] init: vdr main process ended, respawning [ 6287.445466] init: oscam main process (7688) terminated with status 15 [ 6408.524094] VDR XVDR Server[8431]: segfault at 10133a8 ip 00007fc3c9f9bd42 sp 00007fc37e7fba78 error 4 in libstdc++.so.6.0.17[7fc3c9f27000+e5000] [ 6411.523592] init: vdr main process (8356) killed by SEGV signal [ 6411.523629] init: vdr main process ended, respawning [ 6411.535962] init: oscam main process (8372) terminated with status 15 [ 6472.578415] VDR XVDR Server[8622]: segfault at 201d3d8 ip 00007fba60fa7d42 sp 00007fba1d7f9a78 error 4 in libstdc++.so.6.0.17[7fba60f33000+e5000] [ 6473.589930] init: vdr main process (8547) killed by SEGV signal [ 6473.589964] init: vdr main process ended, respawning [ 6473.600768] init: oscam main process (8563) terminated with status 15 [ 6594.502182] VDR XVDR Server[8742]: segfault at ffbb84a8 ip 00007f68da01ed42 sp 00007f689e7fba78 error 4 in libstdc++.so.6.0.17[7f68d9faa000+e5000] [ 6596.514693] init: vdr main process (8667) killed by SEGV signal [ 6596.514732] init: vdr main process ended, respawning [ 6596.526994] init: oscam main process (8683) terminated with status 15 [46438.281415] init: oscam main process (8842) terminated with status 15

manio commented 10 years ago

@pipelka I am testing it for several days and I don't have any crash related with XVDR :) During this time I only had one when stopping the plugin (the one related with 'pure virtual function called'). Unfortunately I don't have a bt for this.

@piotrasd you also don't have an XVDR crash for several days, right?

@fantomkas We could only suspect the problem (based on previous crash reports). I think that Alex would need at least a backtrace for your crashes. I also assume that you're using latest VDR 2.1.4 and my master plugin branch, right?

@all I am providing two Klaus' VDR patches which could help in a case of problems and may be included in next VDR version: http://skyboo.net/logs/patches/ We're still investigating one another problem (probably in VDR).

piotrasd commented 10 years ago

yes, i was running over 7 days with gdb, and very hard try to crash VDR, but now for me XVDR and VDR is very stable - even if i record something try forced switch channels, etc. nothing :) just good work

pipelka commented 10 years ago

@manio I just pushed 0e73eb0c0cdf06c27d6bdb7d38177b9fd1850894. It should fix an issue on shutting down the plugin.

@fantomkas Maybe I'm wrong. But I think your segfault has nothing to do with the issue covered in this thread. Please tell me which versions you are using (vdr, xvdr, dvbapi) and when this segfault happens (zapping, idle, shutdown). A more detailed log would also be helpful. Maybe you manage to grab a gdb backtrace.

piotrasd commented 10 years ago

i get 3 times today segfault :(

(gdb) bt

0 0x401b1752 in std::_Rb_tree_increment(std::_Rb_tree_nodebase) () from /usr/lib/libstdc++.so.6

1 0x401b17ab in std::_Rb_tree_increment(std::_Rb_tree_nodebase const) () from /usr/lib/libstdc++.so.6

2 0x4057aa57 in operator<<(std::basic_fstream<char, std::char_traits >&, cChannelCache const&) () from /usr/lib/vdr/libvdr-xvdr.so.2.1.4

3 0x4057d4b8 in cChannelCache::SaveChannelCacheData() () from /usr/lib/vdr/libvdr-xvdr.so.2.1.4

4 0x40596f68 in cXVDRServer::Action() () from /usr/lib/vdr/libvdr-xvdr.so.2.1.4

5 0x081800d8 in cThread::StartThread(cThread*) ()

6 0x40090830 in start_thread () from /lib/libpthread.so.0

7 0x4036415e in clone () from /lib/libc.so.6

VDR 2.1.4 (+ 2 @manio patch from klaus) + dvbapi latets master + xvdr branch channelchange latest i must say i updated 2 days ago, add twin tuner(last week i only tested on single)

after this segfault with bt, and reproduce this one more time without problem (on switching/stoping channels)

tomorro i make more investigation (still i have build in packages before this update, so i can check if latest update something change, and i will test this same procedure for segfault with only single tuner)

pipelka commented 10 years ago

I merged all the changes from this branch (+ a couple of fixes) into master.

manio commented 10 years ago

great, thanks :)