ElvishArtisan / rivendell

A full-featured radio automation system targeted for use in professional broadcast and media environments
205 stars 64 forks source link

RDCatchd: Feed upload events stopped working. #889

Closed alexolivan closed 1 year ago

alexolivan commented 1 year ago

Hi.

Being somehow busy in other stuff I realize now that, every scheduled Recording Upload into podcast feed has stopped working. Checking for the last successful upload on the feeds throws date 19/May... However, I don't update the VM in a daily basis, so I'm not really sure on which commit/snapshot deb package the issue was introduced.

On RDCatch, the events are all marked pink as failed, and I have tried to manually schedule a one-shot copy of one of them... I post here what I consider the relevant log activity at the time of the event going wrong:

Jun 12 10:29:40 tgnradio rdcatchd: using web service URL: http://127.0.0.1/rd-bin/rdxport.cgi
Jun 12 10:29:41 tgnradio rdxport.cgi: POST is too large 'rdxport.cpp' line 120
Jun 12 10:29:41 tgnradio rdcatchd: using web service URL: http://127.0.0.1/rd-bin/rdxport.cgi
Jun 12 10:29:41 tgnradio ripcd: added new connection 24
Jun 12 10:29:41 tgnradio rdxport.cgi: processing RDXPORT_COMMAND_REMOVE_PODCAST
Jun 12 10:29:41 tgnradio rdxport.cgi: No such podcast 'podcasts.cpp' line 347
Jun 12 10:29:41 tgnradio ripcd: closed connection 24
Jun 12 10:29:41 tgnradio ripcd: cleaned up connection 24
Jun 12 10:29:41 tgnradio rdcatchd: post of 061010_001 to feed "Trance" failed [Server returned nothing (no headers, no data)], id=257
Jun 12 10:29:41 tgnradio kernel: [73445.825343] rdcatchd[121541]: segfault at 61 ip 00007fcd8df4c009 sp 00007ffe559958b8 error 4 in libQt5Core.so.5.15.2[7fcd8dd04000+2fe000]
Jun 12 10:29:41 tgnradio kernel: [73445.825386] Code: d2 74 ee e9 7c ff ff ff b8 ff ff ff ff eb bc 66 90 48 8b 47 40 48 85 c0 75 07 45 31 c0 44 89 c0 c3 48 8b 50 08 48 85 d2 74 f0 <48> 8b 4a 10 41 b8 01 00 00 00 48 85 c9 75 e4 48 8b 48 08 48 85 c9
Jun 12 10:29:41 tgnradio ripcd: closed connection 26
Jun 12 10:29:41 tgnradio ripcd: cleaned up connection 26
Jun 12 10:29:59 tgnradio rdcatchd: executing macro cart: 50002
Jun 12 10:29:59 tgnradio ripcd: sent catch event: "CATCH tgnradio 3 0 3 119 0 0" to 239.192.255.72:20539
Jun 12 10:29:59 tgnradio ripcd: received rml: "MD 1 -10 1000!" from 127.0.0.1
Jun 12 10:29:59 tgnradio rdcatchd: clearing event_id: 119
Jun 12 10:29:59 tgnradio rdcatchd: HERE1

It was a long time since I didn't inspected the VM syslog, so maybe I'm spotting things that are actually right as wrong, but the 'POST is too large' line caught my attention. I also noticed that, even that a segfault happened, rdcatchd managed to survive (or be respawn) so the whole thing keeps playing stuff, etc... nice! There're also those 'HERE1' messages from time to time.

I've noticed a change on the Feed configuration form, which includes a new field 'CDN Purge Plugin' , which is set to 'None'. I have tried to update the configuration by pressing OK button on the Feeds Forms, trying to trigger an eventual update configuration on the back-end... So far, no errors reported on syslog or GUI. But apparently some breaking changes happened around that time.

Cheers.

I notice a POST is too large

ElvishArtisan commented 1 year ago

I'm spotting things that are actually right as wrong, but the 'POST is too large' line caught my attention.

This is likely due to the same underlying cause as #886, which was corrected in the v4.0.0rc5 release.

I've noticed a change on the Feed configuration form, which includes a new field 'CDN Purge Plugin' , which is set to 'None'. I have tried to update the configuration by pressing OK button on the Feeds Forms, trying to trigger an eventual update configuration on the back-end... So far, no errors reported on syslog or GUI.

Yes, there have been a number of optimization added to the podcast subsystem. One of them is that uploads of the feed XML will not occur unless it has actually changed since the last upload, so you may need to change something (a bit of text in the Channel configuration or one of the posted items) in order to force a new upload.

The CDN cache purge stuff is extremely vendor specific. The only CDN provider that is currently supported is Akamai. If you're just publishing to a 'garden variety' FTP or HTTP server with no user-facing cache infrastructure, then this feature provides nothing useful.

ElvishArtisan commented 1 year ago

Unable to reproduce the problem here.

alexolivan commented 1 year ago

Ok... I created some duplicated/one-shot events of my events, as I upgraded to rc5 yesterday, and I can confirm the actual upload did work, and the episode appears on the feed.

However, segfaults do keep appearing, so I may have spotted another issue... however, I'm not sure enough to open a separated issue... if you consider so, let me know and I'll do, so we keep the issues area in order. here's the log:

Jun 13 10:53:26 tgnradio ripcd: closed connection 23
Jun 13 10:53:26 tgnradio ripcd: cleaned up connection 23
Jun 13 10:53:26 tgnradio rdcatchd: using web service URL: http://127.0.0.1/rd-bin/rdxport.cgi
Jun 13 10:53:28 tgnradio ripcd: added new connection 22
Jun 13 10:53:28 tgnradio rdxport.cgi: processing RDXPORT_COMMAND_SAVE_PODCAST
Jun 13 10:53:28 tgnradio rdxport.cgi: saved podcast "/var/snd/000015_000498.mp3"
Jun 13 10:53:28 tgnradio ripcd: closed connection 22
Jun 13 10:53:28 tgnradio ripcd: cleaned up connection 22
Jun 13 10:53:28 tgnradio rdcatchd: using web service URL: http://127.0.0.1/rd-bin/rdxport.cgi
Jun 13 10:53:28 tgnradio ripcd: added new connection 22
Jun 13 10:53:28 tgnradio rdxport.cgi: processing RDXPORT_COMMAND_POST_PODCAST
Jun 13 10:53:31 tgnradio rdxport.cgi: posted podcast audio "/var/snd/000015_000498.mp3"
Jun 13 10:53:31 tgnradio ripcd: closed connection 22
Jun 13 10:53:31 tgnradio ripcd: cleaned up connection 22
Jun 13 10:53:31 tgnradio rdcatchd: using web service URL: http://127.0.0.1/rd-bin/rdxport.cgi
Jun 13 10:53:31 tgnradio ripcd: added new connection 22
Jun 13 10:53:31 tgnradio rdxport.cgi: processing RDXPORT_COMMAND_POST_RSS
Jun 13 10:53:33 tgnradio rdxport.cgi: posted RSS XML to "ftp://192.168.112.4/tgntranceclassics/Trance.rss"
Jun 13 10:53:35 tgnradio rdxport.cgi: posted RSS XML to "ftp://192.168.112.4/TGNRadio.rss"
Jun 13 10:53:35 tgnradio ripcd: closed connection 22
Jun 13 10:53:35 tgnradio rdcatchd: post of  to cast id 498 successful, id=258
Jun 13 10:53:35 tgnradio ripcd: cleaned up connection 22
Jun 13 10:53:35 tgnradio kernel: [161280.191100] rdcatchd[267803]: segfault at 20 ip 00007f8c63d2c000 sp 00007fffb2d523f8 error 4 in libQt5Core.so.5.15.2[7f8c63ae4000+2fe000]
Jun 13 10:53:35 tgnradio kernel: [161280.191127] Code: c0 74 a0 48 8b 50 28 48 85 d2 74 ee e9 7c ff ff ff b8 ff ff ff ff eb bc 66 90 48 8b 47 40 48 85 c0 75 07 45 31 c0 44 89 c0 c3 <48> 8b 50 08 48 85 d2 74 f0 48 8b 4a 10 41 b8 01 00 00 00 48 85 c9
Jun 13 10:53:35 tgnradio ripcd: closed connection 24
Jun 13 10:53:35 tgnradio ripcd: cleaned up connection 24

What I have experienced is that, at RDCatch GUI module, the event highlighted green, as it was running, never seems to get an update from the backend that it is finished/ended so the frontend handles/deletes it... it keeps like in an eternal running state. Closing RDCatch and opening it again does the trick.... maybe that's why the segfault points at libQt5Core.so , some pesky qt5 issue... but I'm not sure.

Cheers.

ElvishArtisan commented 1 year ago

Still unable to reproduce the problem here. Any chance that you could get a backtrace on the rdcatchd(8) segfault?

alexolivan commented 1 year ago

I guess I could kill rdcatchd and try to replace it with a gdb run... will try!

alexolivan commented 1 year ago

Closing this one... it only happens when the event executes while RDcatch (GUI) is opened (one shot event). Uploads are working consistently for 3 weeks now.

I'm seing some feed-back from more people using RDCatch, so eventually someone may stumble with this under an environment that can be reproduced.

Thanks.