dheijl / swyh-rs

Stream What You Hear written in rust, inspired by SWYH.
MIT License

48khz sampling rate not working with 16 bit #108

Closed stewartsiu closed 11 months ago

stewartsiu commented 1 year ago

I'm running 1.8.4 on Windows 11, sending to a renderer (gmediarender v1.42) over a wired connection. I noticed that I'm only able to use the following formats without significant stutter or error: WAV 16/44.1, WAV 24/48, and FLAC 24/48 (smooth but significant latency).

What I really want to use is WAV 16/48, as I was trying to stream YouTube and the native audio codec is 48 kHz (Opus). Do we know why 16/44.1 and 24/48 work, but not 16/48? 24/48 is OK, but I'm trying to save bandwidth and reduce the stutters that always occur after a while.

[Edit: For future reference, the correct characterization is that all sample rates eventually drop/stutter, and 16/48 is just a bit worse than others]
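To put the bandwidth concern in numbers, here is a minimal sketch of the raw data rate of uncompressed PCM. It assumes stereo (2 channels) and tightly packed samples (3 bytes per 24-bit sample); both are assumptions, not something stated in this thread, and the function names are made up for illustration.

```rust
/// Bytes per second of uncompressed PCM audio.
/// Assumes samples are tightly packed (e.g. 3 bytes for 24-bit audio).
fn pcm_bytes_per_second(sample_rate_hz: u32, bits_per_sample: u32, channels: u32) -> u32 {
    sample_rate_hz * (bits_per_sample / 8) * channels
}

/// The same data rate expressed in kilobits per second (rounded down).
fn pcm_kbit_per_second(sample_rate_hz: u32, bits_per_sample: u32, channels: u32) -> u32 {
    pcm_bytes_per_second(sample_rate_hz, bits_per_sample, channels) * 8 / 1000
}
```

Under these assumptions, 16/44.1 needs about 1411 kbit/s, 16/48 needs 1536 kbit/s, and 24/48 needs 2304 kbit/s, so 16/48 would carry one third less data than 24/48.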

dheijl commented 1 year ago

swyh-rs does not touch the sample rate: it receives samples from Windows as F32, converts them to the desired integer bit depth, and streams them immediately to the rendering device in 8 KB chunks. Buffering is a function of the rendering device, and I suppose that gmediarender uses an input buffer that is too small in some cases. I have no idea if you can change the input buffer size in gmediarender.
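The pipeline described above (F32 in, integer PCM out, sent in 8 KB pieces) can be sketched roughly as follows. This is an illustration only, not the actual swyh-rs code: the function names, the scaling by `i16::MAX`, and the little-endian byte order are assumptions made for the example.

```rust
/// Convert one f32 sample (nominally in [-1.0, 1.0]) to a signed 16-bit integer.
/// Out-of-range input is clamped first; the scale factor is an assumption.
fn f32_to_i16(sample: f32) -> i16 {
    let clamped = sample.clamp(-1.0, 1.0);
    (clamped * i16::MAX as f32) as i16
}

/// Convert a block of f32 samples to little-endian 16-bit PCM bytes.
fn to_pcm16_le(samples: &[f32]) -> Vec<u8> {
    samples
        .iter()
        .flat_map(|&s| f32_to_i16(s).to_le_bytes())
        .collect()
}

/// Split a byte stream into pieces of at most `chunk_size` bytes,
/// mimicking a sender that forwards data in fixed-size writes.
fn split_into_chunks(bytes: &[u8], chunk_size: usize) -> Vec<&[u8]> {
    bytes.chunks(chunk_size).collect()
}
```

Note that nothing in this sketch resamples: the sample rate of the output is whatever rate the samples arrived at, which matches the statement that the rate is not touched.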

FLAC is better because the compression saves bandwidth and the FLAC decoder is probably a library external to gmediarender anyway, but the compression introduces extra latency that cannot be avoided. Gmediarender does not have a good reputation and seems to be unmaintained (source: https://github.com/hzeller/gmrender-resurrect).

stewartsiu commented 1 year ago

Thanks for the reply! If I understand correctly, the 8kB is the only additional tunable parameter from the client side? The gmediarenderer I'm using is a commercial fork (Gustard R26 streamer), and it works without any stutter at any sampling rate (e.g. 384k) if I use foobar2000's plugin foo_out_upnp (https://www.foobar2000.org/components/view/foo_out_upnp), so it seems like there should be a way to do the same thing with windows audio, with at worst more latency. Right now my 24/48 stream via swyh often starts to stutter in just a few minutes.

dheijl commented 1 year ago

When using foobar2000, are you playing local music files or are you using an internet streaming source ?

The 8 KB HTTP streaming buffer is not controlled by swyh-rs; the HTTP library does it.

Does enabling/disabling chunked transfer make any difference?

Swyh-rs tries to minimize the delay, that is why there is no additional internal buffering, but this means that the receiving end is responsible for preventing stuttering by providing an adequate input buffer (and introducing some delay).
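To get a feel for how little audio an 8 KB buffer holds, and therefore how much slack the receiving end has to add itself, here is a small sketch. It again assumes stereo and tightly packed samples, which are assumptions for illustration; the function name is hypothetical.

```rust
/// Milliseconds of audio contained in `buffer_bytes` of uncompressed PCM.
/// Assumes tightly packed samples (e.g. 3 bytes for 24-bit audio).
fn buffer_duration_ms(
    buffer_bytes: u32,
    sample_rate_hz: u32,
    bits_per_sample: u32,
    channels: u32,
) -> f64 {
    let bytes_per_second = sample_rate_hz * (bits_per_sample / 8) * channels;
    buffer_bytes as f64 * 1000.0 / bytes_per_second as f64
}
```

With these assumptions, 8192 bytes of 16/48 stereo is roughly 43 ms of audio, and at 24/384 it is under 4 ms, so any network hiccup longer than that has to be absorbed by the renderer's own input buffer.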

stewartsiu commented 1 year ago

Streaming with foobar2000's foo_out_upnp was with local music files. For chunked transfer - it actually did not work when the disable chunked transfer option was on, whatever sampling rate I used. For what it's worth, I looked at streaming_server.rs and just rebuilt your code with the (None, 8192) changed to some other random values to see what happens. If the stream size / chunked thresholds are too extreme it wouldn't work, but when the streaming does work with 24/48 wav, it still eventually starts to stutter after a few minutes, like it does with 16/48. In contrast there is no stuttering with foobar2000 whatsoever.

dheijl commented 1 year ago

So it looks like the input buffering of the Gustard has problems with the 8 KB buffer size that Rust uses in std::io when copying data (copy_to() and copy_from()).

stewartsiu commented 12 months ago

Two new observations:

  1. 16/48 actually works if I restart the streamer. It's just that once it starts stuttering or stopping, other sampling rates (even 24/384) work for a few minutes if I restart swyh, but 16/48 still stutters if I restart swyh.
  2. When I try to restart swyh, if there's stuttering, I noticed that I would get to "Streaming to ... has ended" even though the variable nclients=1.

As an experiment I tried foo_out_upnp in between these stuttering situations with all sampling rates, without streamer restart, and it has no problems sending wav files to the streamer. Restarting swyh in those cases would not work unless I restart the streamer. So I suspect that foo_out_upnp is using the API differently from swyh-rs, but I don't know the upnp protocol so I don't have any idea what it could be. In case it's helpful I've attached the three service xml from my Gustarenderer in a zip file: render_xml.zip

Edit: Rereading dheijl's comment, maybe the conclusion is just that the incompatibility is inherent to Rust libraries? I don't really know Rust but would be interested in trying any changes you suggest.

dheijl commented 12 months ago

I don't think it has anything to do with the UPnP protocol, as everything on that side seems to work OK. It's the HTTP streaming that does not work as it should. It all points to latency/buffering problems while streaming, but I have no clue where to start looking. Fiddling with the streaming on the Rust side is not easy: it would mean ripping out the HTTP server (tiny-http) and replacing it with something that allows one to play with the output buffering strategy. It is also possible that foo_out_upnp does not use HTTP streaming at all, but something entirely different based on the info in the actual service description. It's not open source. Wireshark sniffer traces would show the differences.

stewartsiu commented 12 months ago

If HTTP is the problem, here's another potential hint: disable_chunk_encoding does not work with the current line, which is:

    stream_size, chunk_threshold = Some(usize::MAX - 1), usize::MAX

But it works when I change it to:

    stream_size, chunk_threshold = Some(usize::MAX - 1), usize::MAX - 1

or

    stream_size, chunk_threshold = Some(usize::MAX), usize::MAX

Anyway, will try Wireshark and report back when I have some free time.

dheijl commented 12 months ago

I'll change it to Some(usize::MAX), usize::MAX then, as it makes no difference in my setup; all three work here.

What you could do is use BubbleUPNP server as a proxy for the Gustard. You add the Gustard as an Openhome renderer in the BubbleUPNP GUI, and you stream from swyh-rs to the newly added (local) renderer, maybe it will solve your problems.

stewartsiu commented 12 months ago

I managed to do a quick Wireshark capture of the foobar output with foo_out_upnp playing a local file at 16/48: testfoo.zip

One thing I noticed in the capture is that there is a clear GET message from the renderer (192.168.0.11) asking for the stream.wav file, but if I use swyh-rs to play a local file I don't see a similar line asking for swyh.wav even though the sound still comes out. [Edit: Also if I look at the data going from my PC to the renderer, foo_out_upnp uses TCP while swyh-rs uses VNC]

dheijl commented 11 months ago

Thanks, could you now attach a sniffer trace of a swyh-rs session too? I don't see any major differences in the trace regarding streaming, except that foo uses a 32-bit usize::MAX for the stream size while swyh-rs uses a 64-bit usize::MAX.

swyh-rs uses the same TCP as foo_out_upnp, but the server port number 5901 that swyh-rs uses is also used by VNC, and that makes Wireshark think that it's looking at a VNC session.

Why you didn't see the GET request beats me, but if it isn't there you won't get sound from swyh-rs.

stewartsiu commented 11 months ago

Here you are: testswyh.zip

stewartsiu commented 11 months ago

I changed the port to 5902 to avoid the confusion with VNC, and now the GET swyh.wav line is shown: testswyh_5902.zip

Two diffs I see: the swyh-rs session doesn't talk to renderconnmgr1 before rendertransport1, and all the TCP transmissions of audio data have conversation completeness Incomplete(15), versus Complete(47) in testfoo, which means a reset (32) is missing after the data, if I understand correctly.

dheijl commented 11 months ago

I see that chunked encoding is still being used. The current version (1.8.5) no longer allows chunked encoding, as it is nowadays considered a largely useless HTTP/1.1 feature and has been removed from HTTP/2. But streaming obviously works without problems.

Incomplete/Complete: an endless audio stream can never be complete, swyh-rs answers the get request with an endless HTTP stream, and HTTP has no concept of "conversation completeness", that's a TCP feature. This use of tcp-flags may be caused by the Rust HTTP/TCP libraries, but has never caused any problem so far.

Edit: it seems to be an interpretation by Wireshark, that has no real meaning for the TCP stream:

TCP Conversation Completeness

TCP conversations are said to be complete when they have both opening and closing handshakes, independently of any data transfer. However, we might be interested in identifying complete conversations with some data sent, and we are using the following bit values to build a filter value on the tcp.completeness field :

  1 : SYN
  2 : SYN-ACK
  4 : ACK
  8 : DATA
  16 : FIN
  32 : RST

For example, a conversation containing only a three-way handshake will be found with the filter 'tcp.completeness==7' (1+2+4) while a complete conversation with data transfer will be found with a longer filter as closing a connection can be associated with FIN or RST packets, or even both : 'tcp.completeness==31 or tcp.completeness==47 or tcp.completeness==63'

Another way to select specific conversation values is to filter on the tcp.completeness.str field. Thus, 'tcp.completeness.str matches "(R.|F)[^D]ASS"' will find all 'Complete, NO_DATA' conversations, while the 'Complete, WITH_DATA' ones will be found with 'tcp.completeness.str matches "(R.|F)DASS"'.
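The bit values quoted above make it easy to decode the two numbers seen in the traces (15 and 47). A minimal sketch, using only the bit assignments from the quoted Wireshark text; the function and constant names are made up for illustration:

```rust
// Bit values of the tcp.completeness field, as listed in the quoted text.
const FLAGS: [(u8, &str); 6] = [
    (1, "SYN"),
    (2, "SYN-ACK"),
    (4, "ACK"),
    (8, "DATA"),
    (16, "FIN"),
    (32, "RST"),
];

/// List the names of the bits that are set in a tcp.completeness value.
fn completeness_flags(value: u8) -> Vec<&'static str> {
    let mut names = Vec::new();
    for &(bit, name) in FLAGS.iter() {
        if (value & bit) != 0 {
            names.push(name);
        }
    }
    names
}

/// True if the conversation has the full opening handshake, some data,
/// and a closing FIN and/or RST (i.e. one of the values 31, 47 or 63).
fn is_complete_with_data(value: u8) -> bool {
    let opened_with_data = (value & 0b0000_1111) == 0b0000_1111;
    let closed = (value & (16 | 32)) != 0;
    opened_with_data && closed
}
```

Decoded this way, 15 is SYN + SYN-ACK + ACK + DATA (handshake and data, no close yet), and 47 is the same plus RST, which is what one would expect when comparing a still-running endless stream with a finished transfer.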

dheijl commented 11 months ago

The only thing that could be related to the stuttering that I can see in the trace:

So perhaps the 8 KB chunks are a problem?

Have you tried 1.8.5 yet, which is supposed not to use chunking?

Edit: apparently tiny-http decides to use chunking anyway, regardless of specifying it or not...

I might have to get rid of tiny-http altogether.

dheijl commented 11 months ago

The current code in master prevents tiny-http from activating chunked transfer.

Does it change anything with regard to stuttering?

stewartsiu commented 11 months ago

testswyh_1.8.6.zip Just ran v1.8.6 from cli, streaming didn't start at all and I got "streaming to .... has ended" immediately. Wireshark capture attached.

dheijl commented 11 months ago

I seem to have broken WAV, I only use FLAC myself.

dheijl commented 11 months ago

WAV works again here, I replaced 1.8.6 with the fixed version.

stewartsiu commented 11 months ago

Neither wav nor flac works... Capture after latest pull: testswyh_flac.zip

stewartsiu commented 11 months ago

Got it to work by changing usize::MAX - 1 to usize::MAX, but the stuttering behavior is the same (I triggered stuttering by starting, stopping with Ctrl-C, and starting again). Here is a capture with stuttering: testswyh_stutter.zip

dheijl commented 11 months ago

With usize::MAX you have enabled chunking again.
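One way to read the behaviour reported in this thread is that the HTTP library switches to chunked transfer whenever the declared stream size is missing or not below the chunk threshold. The sketch below models that rule; it is an assumption that happens to fit the observations here, not a copy of the tiny-http source, and `choose_encoding` is a hypothetical name.

```rust
#[derive(Debug, PartialEq)]
enum TransferEncoding {
    Identity, // plain body with a Content-Length
    Chunked,  // Transfer-Encoding: chunked
}

/// Assumed rule: use chunked transfer unless a stream size is declared
/// AND that size is strictly below the chunk threshold.
fn choose_encoding(stream_size: Option<usize>, chunk_threshold: usize) -> TransferEncoding {
    match stream_size {
        Some(len) if len < chunk_threshold => TransferEncoding::Identity,
        _ => TransferEncoding::Chunked,
    }
}
```

Under this model, (None, 8192), (Some(usize::MAX), usize::MAX) and (Some(usize::MAX - 1), usize::MAX - 1) all select chunked transfer, while (Some(usize::MAX - 1), usize::MAX) is the only combination from this thread that selects a plain, non-chunked stream.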

stewartsiu commented 11 months ago

how about MAX/2?

omoknen commented 11 months ago

I wanted to add some information because the thread is very long. I am streaming at 44.1 kHz/16 bit and I don't have any problems. I am not sure why 48 kHz is a problem, but I did not have a problem before the first 1.8.6 update, and I don't have one after updating to it.

44.1 kHz sampling at 16 bit using WAV works for me with Sonos speakers with Inject silence enabled, using Windows 10, VB-Audio Virtual Cable, and the MusicBee audio player with WASAPI (shared) and output to VB-Audio Virtual Cable.

dheijl commented 11 months ago

I'll try to experiment with some different values. But MAX - 2 broke WAV with mpd for some reason...

stewartsiu commented 11 months ago

Where do you read whether chunking is enabled from the sniffer trace?

dheijl commented 11 months ago

The data part of the frame starts with the chunk length encoded as ASCII hex digits (8192 = 0x2000, sent as 0x32 0x30 0x30 0x30), followed by 0x0d 0x0a, and ends with 0x0d 0x0a.
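The framing described above is the standard HTTP/1.1 chunk layout: hex length, CRLF, payload, CRLF. A minimal sketch of producing and recognising it, with hypothetical function names and ignoring optional chunk extensions:

```rust
/// Wrap `payload` as one HTTP/1.1 chunk: hex length, CRLF, data, CRLF.
fn encode_chunk(payload: &[u8]) -> Vec<u8> {
    let mut out = format!("{:x}\r\n", payload.len()).into_bytes();
    out.extend_from_slice(payload);
    out.extend_from_slice(b"\r\n");
    out
}

/// Read the chunk-size line at the start of `data` and return the payload
/// length, or None if the data does not start with "<hex digits>\r\n".
/// Chunk extensions (";name=value") are not handled and yield None.
fn parse_chunk_size(data: &[u8]) -> Option<usize> {
    let line_end = data.windows(2).position(|w| w == b"\r\n")?;
    let hex = std::str::from_utf8(&data[..line_end]).ok()?;
    usize::from_str_radix(hex, 16).ok()
}
```

For an 8192-byte payload the frame starts with the bytes 0x32 0x30 0x30 0x30 0x0d 0x0a ("2000" plus CRLF), which is the pattern to look for at the start of the data in a sniffer trace.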

stewartsiu commented 11 months ago

Thanks! I didn't get the chance to test further as I decided to return the streamer, at least for now... If there's a way to reproduce the reliability of foobar output I'll probably buy it again.

dheijl commented 11 months ago

I'm really sorry to hear that. Anyway, I have learned that the content length header value can break streaming, I wasn't aware of that and will investigate further. Thanks for your help in trying to fix it.