Open AbhilashBalaji opened 2 months ago
The receive call to the mImpl->mSocket.receive(...)
is blocking, that's the reason why there is a check for the amount of available bytes that can be read without blocking the calling thread before receiving them from the socket. If you would remove the this check the thread will block until the socket receives packets.
Is the UDPThread
resource that you use using the Main Thread
or does it spawn it's own thread? In your case I highly recommend letting it spawn it's own thread so all allocating and copying of buffers happen on a separate thread, this prevents the main thread from clogging when working with huge amounts of data.
I managed to crank up to ~100 VBAN streams (~40 mb/s) using this approach
the machine I am running this on is a m1 apple silicon machine profiling with instruments the apple tool, and on further troubleshooting it looks like the UDPServer::onProcess()
function assumes that there is only one UDP packet to parse every time it is called. There there are multiple packets that are read from the buffer ( 2 channels 8 streams in my current tests)
Here is the first call of the VBANPacketReceiver::packetReceived()
function
here is the second call
both the VBANPacketReceiver::packetReceived()
and UDPServer::onProcess()
assume its a single packet of UDP instead of multiple ones if I am not mistaken.
when I add
available_bytes = std::min(available_bytes, static_cast<size_t>(VBAN_PROTOCOL_MAX_SIZE));
there is no perceivable change as I am forcing a read of a single UDP packet from the socket and the audio quality does not deteriorate.
And The UDPThread
is on a separate thread.
well, I guess it will work as long as you keep sending the same sized UDP packets, I'm curious what will happen when those will vary in size with your solution.
I'm also not entirely sure what the problem is you're trying to solve here ? Is it just the available
call you want to avoid ? Where you running into issues ?
If you make these changes in the onProcess
function will become blocking and in turn, the UDPThread
will stall until there is a packet to be read thus not update any other UDPServer
or UDPClient
attached to it. I don't think this is a good solution. This might work in your case, I don't know, in any case when you go down this route you should make a separate UDPThread
for every UDPServer
and UDPClient
.
All the packets im sending is the same size VBAN_PROTOCOL_MAX_SIZE
to be specific. Would you suggest having multiple UDPServers ? as the packets are being sent externally from a separate application (Max Objects to be specific). I will try a separate thread for every stream instead of a single UDP server and let you know !
Hello all. @AbhilashBalaji just showed me exactly what is happening. Not sure if this issue doesn't belong in the timegroeneboom/napvban repo but here we are, and it is relevant for napudp as well The onProcess function now assumes that it is handling only one UDP packet at a time. Sometimes though the socket receives multiple UDP packets in between two calls of onProcess! (for example when receiving multiple vban streams, not to be confused with channels) In that case the UDPServer now wrongly assumes that all the received data belongs to one single packet and forwards it that way through the packetReceived signal. The root of this problem is that the asio UDP socket internally pastes the contents of all incoming UDP packets together and presents it to the user as one continuous stream of data. In most cases this is not likely to be a problem, except when bigger amounts of packets are being received continuously, like with multiple VBAN streams. Starting a different server for each stream with its own port number is not really a solution, as you don't want the user to have to deal with multiple port numbers and rather avoids the problem than solve it. I think a solution for VBAN is to write a specific VBANServer that takes over the functionality of the UDPServer but expects VBAN packets of a specific size and is able to parse multiple of these in a row within a single onProcess call.
Hello all. @AbhilashBalaji just showed me exactly what is happening. Not sure if this issue doesn't belong in the timegroeneboom/napvban repo but here we are, and it is relevant for napudp as well The onProcess function now assumes that it is handling only one UDP packet at a time. Sometimes though the socket receives multiple UDP packets in between two calls of onProcess! (for example when receiving multiple vban streams, not to be confused with channels) In that case the UDPServer now wrongly assumes that all the received data belongs to one single packet and forwards it that way through the packetReceived signal. The root of this problem is that the asio UDP socket internally pastes the contents of all incoming UDP packets together and presents it to the user as one continuous stream of data. In most cases this is not likely to be a problem, except when bigger amounts of packets are being received continuously, like with multiple VBAN streams. Starting a different server for each stream with its own port number is not really a solution, as you don't want the user to have to deal with multiple port numbers and rather avoids the problem than solve it. I think a solution for VBAN is to write a specific VBANServer that takes over the functionality of the UDPServer but expects VBAN packets of a specific size and is able to parse multiple of these in a row within a single onProcess call.
The available bytes returned by mImpl->mSocket.available(asio_error);
is the size of one UDP packet and the following filled buffer using mImpl->mSocket.receive(asio::buffer(buffer), 0, asio_error);
always is exactly that one UDP packet, regardless of the size of the packet.
I just tested a setup using an UDP Server with an UDP thread using the manual process call, sending multiple packets and then manually update the thread. The UDPServer then processes the first sent UDP packet nicely. (and following packets as well with each subsequence manual call to onProcess
)
So it doesn't matter at all whatever the size(s) of the UDP packet(s) are.
It's true that the onProcess
receives one packet each call, you could argue that since this happens on a separate thread this is should pose no problem, maybe the slots connected to packetReceived
are CPU heavy which causes the onProcess
call to not be able to catch up with the speed at which UDP packets are sent. In that case you can think about reducing the workload (or distributing the workload) that happens on the callbacks/slots connected to packetReceived
.
However, I think there's an argument to be made to retrieve all the packets in one onProcess
call like this, what do you guys think ? Still, if the thread is not able to process the amount of UDP packets sent packets will get dropped regardless....
void UDPServer::onProcess()
{
asio::error_code asio_error;
size_t available_bytes = mImpl->mSocket.available(asio_error);
while(available_bytes > 0)
{
// fill buffer
std::vector<uint8> buffer;
buffer.resize(available_bytes);
mImpl->mSocket.receive(asio::buffer(buffer), 0, asio_error);
if(!asio_error)
{
// make UDPPacket and forward packet to any listeners
UDPPacket packet(std::move(buffer));
std::lock_guard<std::mutex> lock(mMutex);
packetReceived.trigger(packet);
}
available_bytes = mImpl->mSocket.available(asio_error);
if(asio_error)
nap::Logger::error(*this, asio_error.message());
}
if(asio_error)
nap::Logger::error(*this, asio_error.message());
}
I got around the undefined behavior by rewriting the udp server for the specific module to define a specfic udp packet size and limiting the reciever socket buffer here : https://github.com/TimGroeneboom/napvban/pull/4
Hello, I'm currently working on scaling the VBAN protocol using NAP and found performance bottlenecks in the UDP server onProcess function https://github.com/napframework/nap/blob/01b24f1673bc4aaa99684aa36643451b457468a6/system_modules/napudp/src/udpserver.cpp#L132
when using a single UDP server with anything more than 0.5 MB/sec of data throughput the CPU utilisation of the asio::ip::udp::socket available() call drastically increases when I profile the application.
The buffer.resize() call then also allocates a large amount of memory further slowing down the process
the workaround i came up with was allocating the maximum size of the packet as defined by the protocol as defined in https://github.com/TimGroeneboom/napvban/blob/3c63323b4d6a8aead2c065f897a7862e9e25e63f/src/vban/vban.h#L31
like
so my question is , can I go ahead and skip the call for available bytes and assume there is a UDP packet number of bytes always available ? let me know if you need more information.