Closed narspt closed 1 year ago
Trying on XLX020, seems to be working nicely so far.
Does this fix the error where ambed shows following errors
Feb 7 06:44:49 xlxtranscode ambed[7539]: 0 of 0 packets lost
and xlxd this error ?
Feb 6 19:17:49 vps-46ac4de0 xlxd: ambed 71 of 38 packets timed out
No, not related at all to the fix in this PR... but... concerning "your" problem, I did check it before and actually the code to count ambed timedout packets does that in a very wrong way, if you look at ccodecstream.cpp, at end of CCodecStream::Task() you can see what it does is just: if not getting packets from ambed after timeout (400ms) then it just starts incrementing the supposed number of timedout packets like crazy every 5ms (due to the 5ms timeout of m_Socket.Receive above), without taking in account the actual number of packets sent to ambed and not even dropping (or sending them back as-is without transcoding) these "waiting for transcoding" packets from queue, but just counting again and again and again... then you get a very inflated and false number of timedout packets... Latency stats are also fully false values as all it does is calc time difference from last packet sent to ambed to some packet being received from ambed, not meaning at all that both are same packet... Also if I remember correctly if you have considerable latency between xlxd and ambed then for very short transmissions (just quick keying ptt) you will always get timedout packets count and the infamous -1000 on stats, it's "normal" due to stream being closed even before getting anything back from ambed. It's not easy to make it fully perfect because we can't easily rely on a packet id preserved after transcoding but I was working on an improvement, I did never submit it as a PR as I got no much time to fully test it and there was still something I wanted to improve but if you want to look/test check this branch: https://github.com/narspt/xlxd/commits/patch-21 (note that changes there will also bypass transcoding for timedout packets instead of freezing stream, so that users on the same mode can still communicate if ambed stops replying)
Thanks for your feedback. I would at least be happy if xlxd would recover from a timeout situation. Once a timeout situation happened it is completely broken. Most of our DStar clients connect using DCS protocol. As soon as the timeout happens DCS to DCS is broken.
To test the timeout fix I actually need to cherry pick these 2 commits, correct? https://github.com/narspt/xlxd/commit/d204bb61e570670aa9238ba0364d76849dfb50ce https://github.com/narspt/xlxd/commit/252ebe868b5d8eda1cc702d9e69002522b279578
I will try to find time over the next days to test it on xlx208.
Correct. Btw, I'm not fully sure what you mean exactly when you say it gets broken (just the actual stream or need xlxd restart?), if you have xlxd and ambed on different places I would suggest you also apply this fix https://github.com/LX3JL/xlxd/pull/222 and... if you eventually suspect of any issues on ambed side (as root cause for the timeouts) maybe also consider this one https://github.com/LX3JL/xlxd/pull/226
What I meant with "broken" is that once a transcoder timeout occured, xlxd is in some sort of half crashed state. The most obvious symptom is DCS clients no longer communicating together.
Hi Geoffrey,
The XLX is not "brocken" simply restart the ambed service or server and all is working back as expected. A restart of the XLX is not necessary.
73, Luc
Hi Luc,
I've run into this issue countless times, restarted ambed but this is not enough. xlxd is still running but DCS clients are unable to communicate i.e. when 2 repeaters are connected through DCS a station transmitting on repeater A will not be forwarded to repeater B. I think I'll open a separate issue for that.
73,Geoffrey
Geoffrey Merck - F4FXL / KC3FRA http://www.f4fxl.org http://www.f5kav.fr
Le dim. 12 févr. 2023 à 08:32, LX1IQ @.***> a écrit :
Hi Geoffrey,
The XLX is not "brocken" simply restart the ambed service or server and all is working back as expected. A restart of the XLX is not necessary.
73, Luc
— Reply to this email directly, view it on GitHub https://github.com/LX3JL/xlxd/pull/213#issuecomment-1426962530, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABIFHXTQ7FQ2IN36QLSVCHTWXCGYRANCNFSM5NV5TUDQ . You are receiving this because you commented.Message ID: @.***>
If xlxd gets in that half crashed state until restart then I bet you need just this fix: https://github.com/LX3JL/xlxd/pull/222
Applied a couple of your patches on XLX208, including that one.
@F4FXL: I'm curious if it did help with your problem? Btw, some of my patchs (incl. this one) are already merged on master.
It helped in some extent. Yet from time to time I still have the DCS to DCS issue.
No idea what is your problem then, never experienced it, sorry :(
This small patch fixes frame sequence applied by UpdatePids() (i.e. for streams initiated on different mode), m_uiPacketCntr should not be incremented by dv header packet else sequence for dv frames will start at 1 instead of 0 as it should.
This caused even further issues for DMR/YSF: with m_uiDmrPacketSubid and m_uiYsfPacketSubId also being calculated from the above, actually if we have for example a stream initiated on D-Star, 1st and 2nd dv frames would be transcoded to 2nd and 3rd AMBE frames of the DMR "triplet", the 1st AMBE frame would be the one stored at m_StreamsCache from the previous stream! and... frame sequence on EncodeDvPacket() being picked from that DvFrame0 (the one from previous stream), then something random...
The wrong frame sequence caused all DMR streams (when initiated on different mode, else original sequence is preserved) to show up on MMDVMHost as having packet loss.
(m_uiYsfPacketFrameId and m_uiImrsPacketFrameId are adjusted to match the change, as these should remain 0 for header packet and 1, 2, ... for dv packets)