Closed tadly closed 5 years ago
Been playing with this a little on my setup even though I only get 2 channels (for now) and I'm really really liking this. Latency is really low and quality is better than with my crappy usb soundcard <-> line-in setup.
Whatever is necessary to get 5.1 working, I'm willing to help as much as I possibly can.
Hi @tadly, I'm pincopallinux from Reddit. As mentioned in our conversation there, I'll try to make a 5.1 patch this weekend. I'll report the progress here and on Reddit so maybe it can be upstreamed, or @duncanthrax can help if he thinks this is useful.
Hi guys, it's definitely possible to set up 5.1 topology. I'm not sure about the details however. It might also be necessary to reorder things on the receiver, since windows may use different interleaving than pulse or alsa for six channels. Also packet and buffer sizes need to be adjusted.
@martinellimarco : Let me know your findings, I don't have much time for scream at the moment, but maybe I can give some hints.
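To illustrate the reordering idea mentioned above: raw PCM stores one sample per channel, frame after frame, so a receiver that needs a different channel order only has to permute the samples inside each frame. The following is a minimal sketch, not code from scream; the function name `reorder_interleaved` and the example permutation are made up for illustration.

```python
def reorder_interleaved(pcm: bytes, channels: int, order: list[int],
                        sample_bytes: int = 2) -> bytes:
    """Rearrange the channels of every frame in an interleaved PCM buffer.

    order[i] is the index of the source channel that should end up in
    output slot i, so it must be a permutation of range(channels).
    (Hypothetical helper, not part of any scream receiver.)
    """
    if sorted(order) != list(range(channels)):
        raise ValueError("order must be a permutation of the channel indices")
    frame_bytes = channels * sample_bytes
    if len(pcm) % frame_bytes:
        raise ValueError("buffer does not contain a whole number of frames")

    out = bytearray(len(pcm))
    for start in range(0, len(pcm), frame_bytes):
        for dst, src in enumerate(order):
            s = start + src * sample_bytes
            d = start + dst * sample_bytes
            # Copy one sample from its source slot to its destination slot.
            out[d:d + sample_bytes] = pcm[s:s + sample_bytes]
    return bytes(out)
```

Whether any reordering is actually needed depends on how the sender and the receiving sound server order their channels, which is exactly what has to be checked for six channels.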
@martinellimarco Hey and I think I misunderstood you on reddit then... Didn't expect you to straight up work on this but am very thankful of course! :)
Again, if there's a way for me to help I gladly will but my knowledge (and lately also time) is rather limited.
I'll try to educate myself as much as possible in the meantime :)
@duncanthrax thanks for chiming in. Just to get this out the way. Do you think this will have a big impact on performance/latency?
As mentioned before, on my KVM <-> Host setup it's Lip-Sync-Perfect with 2 channels. This x3 is quite something though and has me worried just a little
@tadly I think that the required bandwidth is low enough to stream it without problems on most systems. At 48kHz, 32 bits per sample, 6 channels, the bitrate is 48000 × 32 × 6 / 1024 / 1024 ≈ 8.8 Mibit/s (about 1.1 MiB/s). VirtIO net should be able to deal with this without problems. At 192kHz it's 35.2 Mibit/s (about 4.4 MiB/s); it's not low but it should still be manageable.
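For anyone who wants to redo this arithmetic for other formats, here is a small sketch. The product of sample rate, sample width and channel count is in bits per second, so it has to be divided by 8 to get bytes. The function names are made up for illustration, and protocol overhead (packet headers) is ignored.

```python
def pcm_bitrate(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Raw PCM payload rate in bits per second (no packet overhead)."""
    return sample_rate_hz * bits_per_sample * channels


def to_mibit(bits_per_second: float) -> float:
    """Convert bits per second to mebibits per second."""
    return bits_per_second / 1024 / 1024


def to_mib(bits_per_second: float) -> float:
    """Convert bits per second to mebibytes per second."""
    return bits_per_second / 8 / 1024 / 1024


if __name__ == "__main__":
    for rate in (48000, 96000, 192000):
        bps = pcm_bitrate(rate, 32, 6)
        print(f"{rate} Hz, 32 bit, 6 ch: "
              f"{to_mibit(bps):.2f} Mibit/s = {to_mib(bps):.2f} MiB/s")
```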
@duncanthrax thank you for pointing out the possible interleaving differences, I hadn't thought about that before.
@martinellimarco thanks for doing the maths.. I'm even okay with 16 bit, which at 48kHz and 6 channels would be about 4.4 Mibit/s (half of the 32-bit figure). Sounds even more doable.
In regards to the interleaving, I think issue #10 is related maybe?
@tadly issue #10 is something to keep in consideration, I will test this behaviour, thank you.
@tadly I've setup a machine with VS2017 and WDK and I've started working on it. So far it seems I'm on the right track: screenshot
There is still a lot of work to do of course, this is just a few-line patch to enable support for formats from mono up to 7.1 in the miniport. I can hear the test playing but it's totally distorted, as one would expect.
@duncanthrax do you know if there is a way to sign the driver for free? This is the first time I've worked on a windows driver and I'm not sure about the signing procedure. It seems to me that I have to purchase a certificate, but... let's just say it doesn't seem to be cheap at all. At the moment I've set windows in test mode and it's working, but it's not a viable solution for distribution. Any advice here? I hope I'm missing an obvious solution.
@martinellimarco There are two signing stages:
@duncanthrax thank you for the info, Microsoft certainly doesn't want to make things simple for us
Anyway, apparently it took me way less than expected to reach a functional prototype. screenshot
What I did for now is to extend the header from 2 to 5 bytes. The 1st and 2nd bytes retain the same meaning. The 3rd one is the number of channels, and the last 2 are the speaker configuration, as defined in the documentation here
I realize that this will require an update to all the receivers, but I don't see another way. What do you think @duncanthrax ?
At the moment I've patched scream-pulse to accept this new format. For now it only cares about the 3rd byte, and it changes the number of channels accordingly. It's working fine.
In my screenshot, in the bottom-left corner, you can see in the terminal that I'm testing at 96kHz, 32 bits per sample, 6 channels. I can hear music from my windows VM without crackling. I've tested it at 192kHz and there I can hear some occasional mutes, as if there were no sound for a millisecond; it's not crackling or popping. I think it may be a problem with the buffer size, I've not touched it yet. I'll investigate it later.
Now, the next step is to see if I can use the speaker configuration bytes (the last 2 in the new header) to have pulseaudio automatically map the same channels that windows uses. Maybe this is not even necessary, I have yet to explore this. If I'm lucky pulseaudio and windows share a common speaker mapping and I can get rid of those extra bytes, but for now I don't think that's the case. In the screenshot I've posted, the names of the 6 channels in pavucontrol do not match the windows configuration.
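The 5-byte layout described above can be sketched as follows. This is only an illustration of the description in this thread, not the actual receiver code: the first two bytes are kept opaque because their meaning is not spelled out here, the names `ScreamHeader`, `parse_header` and `build_header` are made up, and the little-endian byte order of the 2-byte speaker configuration is an assumption.

```python
import struct
from typing import NamedTuple

HEADER_SIZE = 5
# Assumption: the 16-bit speaker configuration is stored little-endian.
HEADER_FORMAT = "<BBBH"


class ScreamHeader(NamedTuple):
    rate_byte: int     # 1st byte, unchanged from the old 2-byte header
    size_byte: int     # 2nd byte, unchanged from the old 2-byte header
    channels: int      # 3rd byte, number of channels
    channel_mask: int  # 4th and 5th byte, speaker configuration bitmap


def parse_header(packet: bytes) -> ScreamHeader:
    """Split the 5 header bytes off the front of a received packet."""
    if len(packet) < HEADER_SIZE:
        raise ValueError("packet is shorter than the 5-byte header")
    return ScreamHeader(*struct.unpack_from(HEADER_FORMAT, packet))


def build_header(header: ScreamHeader) -> bytes:
    """Serialize a header back into its 5-byte wire form."""
    return struct.pack(HEADER_FORMAT, *header)
```

A receiver that only cares about the channel count, like the patched scream-pulse at this stage, would read `parse_header(packet).channels` and treat everything from offset 5 onward as PCM data.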
@tadly I'll prepare a build for you to test, if not tomorrow then on Friday. As I've mentioned to you on Reddit I can only test in stereo at the moment. I'll try to get my hands on a 5.1 or 7.1 set because it will be very hard to test individual channels otherwise. Pulseaudio is doing a terrible job downmixing 5.1 to stereo for me, the audio is not balanced, but it's not a problem with this driver. I recall I had the same issue watching DVDs a few years ago, I never bothered to investigate it. Another problem is that despite my soundcard being capable of 7.1 audio, pulseaudio refuses to let me set that up. In windows I can set it, scream-pulse receives the right parameters, but then pulse fails. I can still receive the packets with wireshark and they seem to be fine. I'm convinced this problem is due to pulseaudio or my soundcard and is not related to this work, but testing on other systems will be good.
@martinellimarco You said you'd work on this on the weekend. I actually feel betrayed xD
But in all seriousness. This is amazing and your progress is incredible. As promised I'll obviously help testing but for the time being, maybe pavumeter can help you figure out if the channel map is correct or not just so I don't slow you down :)
It's basically just an indicator for what channel is outputting audio. Coupled with the windows test tone or online test sounds this should be easy to figure out. Not sure if it measures before or after downmixing though 🤔
In regards to the speaker map (channel map as it's called by pulse).
I know that for both alsa and pulse the map can be configured but doing so - to match windows in case it differs - probably breaks local applications?!
I honestly don't know but it would make sense if it would.
As such I think it's quite alright to have the header carry 5 bytes as it's a clean implementation that way. @duncanthrax will have to call the shots though :)
@tadly I'm sorry I've betrayed you :) My intention was to setup the IDE and try to compile the driver to see if I was able to build it, one hour of work at most... but while I was waiting for VS to install I started reading the code and... you know, when all the pieces start to align in your mind and you have to write it down. And that's the story of how I ended up coding at 3am with a meeting at 8am this morning xD The meeting went exceptionally well, maybe I should do this more often.
Anyway, I've tested pavumeter (thank you! I didn't know it). It does work after the downmixing but if I setup the card profile in 5.1 it works fine.
The test revealed that, at least on my system, scream-pulse outputs 6 channels mapped as front left, front center-left, center, front right, front center-right, rear-center. (I think those are the names in English, my setup is in Italian.)
Those channels are not the same ones that windows outputs, but I already knew that.
Pavumeter shows the same channels that I see in your screenshot. They are not the same as the ones scream-pulse has.
Pulse does the right thing here: it matches the channels from scream-pulse to the right channels of the card, mixing when necessary. When scream-pulse sends a signal on the front left channel it's mapped correctly. When the signal comes from the front center-left channel, pulse mixes the signal into the front-left and the center channel.
In practice what I have to do is have scream-pulse match the windows setup so that the channel positions match. Pulseaudio will handle the rest, mixing if necessary.
I've tested also with other windows configurations keeping my soundcard in 5.1.
Windows in mono -> pulseaudio mixes the signal to every channel. Windows in stereo -> pulseaudio mixes the left channel to front-left, rear-left and center, and the right channel to front-right, rear-right and center. With more than 2 channels it still does the right mixing between scream-pulse and the card profile, but the mapping between windows and scream-pulse is wrong, so what I hear is wrong.
It's time to investigate pulseaudio channel mapping :)
EDIT: I forgot to mention, the reason I was experiencing terrible downmix from 5.1 to stereo is now obvious. It's due to the mismatch between the windows and scream-pulse channel mappings. Pulseaudio is doing it right.
Glad the meeting went well. Going to bed while feeling good about achievements etc. usually causes better deep sleep phases I think :)
Apart from that I don't have a lot more to add I think. Glad pavumeter was at least somewhat useful and if I happen to think of anything else that might be useful, I'll obviously share my findings :)
I've looked into the pulseaudio API, it's very well documented. With just a few lines of code I was able to manually map one of the windows layouts in scream-pulse and it's working perfectly in pulse :) I don't have time to work further on this today, maybe this evening or this Friday. The last piece of the puzzle is to figure out a way to map this automatically instead of manually, but I think I have all the necessary information. I'll keep you updated as soon as there is more progress.
Oh, I've convinced a friend to let me borrow his 5.1 setup. I'll be able to do more accurate tests this way.
Wow. You're really going the extra mile on this. Do know that I really appreciate this a lot
I've made the mapping function from windows to pulse and I think it's working fine, at least that's what I see with pavumeter.
The function takes the bitmap used in windows and sets the exact same configuration in pulse, so even if one has some exotic configuration in windows it should work just fine.
I see that both windows and pulse have support for a second set of surround speakers named top_center, top_front_left, top_front_right, top_back_left, top_back_center and top_back_right. At the moment I'm not mapping these since they are not used in any configuration I can see (from mono to 7.1).
Do any of you know if there are setups where these top speaker positions are used? Is it worth supporting them? In theory I only need to add a 6th byte in the header to account for a longer bitmap and a few lines of code in scream-pulse to map those positions to pulseaudio. It will take 10 minutes to add support for them, but I have no way to test the result.
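A rough sketch of what such a mapping can look like, written in Python rather than against the pulseaudio C API. The bit values are the Windows `SPEAKER_*` channel mask flags and the strings are pulseaudio's channel position names; the function name `channel_map_from_mask` is made up, and this is an illustration of the idea rather than the code in the fork.

```python
# Windows speaker mask bits (lowest bit first) paired with the
# pulseaudio channel position names they correspond to.
WINDOWS_TO_PULSE = [
    (0x001, "front-left"),
    (0x002, "front-right"),
    (0x004, "front-center"),
    (0x008, "lfe"),
    (0x010, "rear-left"),
    (0x020, "rear-right"),
    (0x040, "front-left-of-center"),
    (0x080, "front-right-of-center"),
    (0x100, "rear-center"),
    (0x200, "side-left"),
    (0x400, "side-right"),
]


def channel_map_from_mask(mask: int) -> list[str]:
    """Return pulseaudio position names for every bit set in the mask.

    Windows orders the channels in a stream by ascending bit position,
    so walking the table from the lowest bit up yields the positions in
    the same order as the interleaved samples.
    (Hypothetical helper for illustration only.)
    """
    return [name for bit, name in WINDOWS_TO_PULSE if mask & bit]


if __name__ == "__main__":
    # 0x3F is 5.1 with a rear pair, 0x60F is 5.1 with a side pair.
    print(",".join(channel_map_from_mask(0x3F)))
    print(",".join(channel_map_from_mask(0x60F)))
```

The two 5.1 masks differ only in the last pair (rear versus side), which is why the speaker layout chosen in the windows settings changes what pulse ends up mixing.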
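To see why one extra header byte would be enough, one can count how many bytes the bitmap needs once the top positions are included. The bit values below are the Windows `SPEAKER_TOP_*` flags; the helper names are made up, and the 3 bytes in front of the bitmap follow the 5-byte header described earlier in this thread.

```python
# Windows speaker mask bits for the "top" positions.
SPEAKER_TOP_CENTER = 0x00800
SPEAKER_TOP_FRONT_LEFT = 0x01000
SPEAKER_TOP_FRONT_CENTER = 0x02000
SPEAKER_TOP_FRONT_RIGHT = 0x04000
SPEAKER_TOP_BACK_LEFT = 0x08000
SPEAKER_TOP_BACK_CENTER = 0x10000
SPEAKER_TOP_BACK_RIGHT = 0x20000

# The eleven non-top positions occupy bits 0 to 10.
ALL_NON_TOP = 0x7FF


def mask_bytes_needed(highest_mask: int) -> int:
    """Number of whole bytes needed to store the given bitmap."""
    return (highest_mask.bit_length() + 7) // 8


def header_size(highest_mask: int) -> int:
    """Header length: 2 original bytes + 1 channel count byte + bitmap."""
    return 2 + 1 + mask_bytes_needed(highest_mask)


if __name__ == "__main__":
    print(header_size(ALL_NON_TOP))             # bitmap fits in 2 bytes
    print(header_size(SPEAKER_TOP_BACK_RIGHT))  # bitmap needs 3 bytes
```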
In windows I don't see these positions used in any DirectSound speaker configurations, maybe they are never used in practice? I don't know.
Anyway, I'll clean the code a bit and I'll publish it on github soon. I hope it'll pass the tests.
To answer myself, those speaker positions are used in systems with 11 to 18 channels, so I think it's not worth supporting them. If anyone is interested, ping me and I'll work on it.
I've published the code here so you guys can inspect it and test it.
@tadly I've prepared a binary build of the windows driver for you to test, you can get it here. The link will expire in 7 days.
To install it you'll need to open a terminal as administrator, run `bcdedit.exe -set TESTSIGNING ON` and then reboot the system.
This will put windows in test mode and will allow you to install the unsigned driver.
To install the driver run the install.bat script as administrator.
On linux you'll have to build scream-pulse. Is that ok or do I need to compile it for you?
Let me know the results on your system. I'll have access to a real 5.1 setup only on friday.
I just love how your initial statement was to only work on this on the weekend and now it's Thursday and you're basically done :D
The additional channels sound like its for dolby atmos stuff?
A 7.1.4 Dolby Atmos system is a traditional 7.1 layout with four overhead or Dolby Atmos enabled speakers.
And from the dolby website:
...adapts the cinema experience to your home theater from seven speakers to as many as 34...
I love feature completion but think supporting up to 7.1 is enough.
I'll make sure to somehow find the time to test it this evening. Linux build is no problem (especially as I'm on arch) :)
On a side-note, remember how you said:
Technically it's even possible to take Scream as a base, add 5.1 support and write the packet in SHM instead that on a socket.
Now that you've familiarized yourself with the source a little, you still think this is possible? I was thinking of making another issue for this just so it'll not get forgotten as I think this would give us the best possible performance. (I've had a few desync issues using scream with 2 channels already and the resolution was to make sure the client buffer cleared itself -> restart the client).
I didn't expect to be at this point already, but it was really a fun project and thanks to scream being well written and pulseaudio well documented it was much easier than expected.
There is still much work to do though:
To discuss about IVSHMEM or VirtIO I've opened a new issue, #36 , let's discuss there.
@martinellimarco : Great work!
@martinellimarco Way ahead of you. Check OP on reddit again ;P Given the amount of upvotes, I still don't think it's very visible though 🙄
Edit: Just wanted to say that I have a big screaming smile on my face even more so after @duncanthrax saying he's okay with merging SHM support.
Qemu audio has always been an issue and since 2015 they've had plans to work on it for gsoc without finding any adopters. Even so there's actually a dated fork adding 5.1 audio support which is likely never going to get PR'ed in the first place.
So again, @martinellimarco thank you so so so much for working on this!
@duncanthrax thank you!
I completely agree with you that the default for scream should be the networked one and the SHM / networked code should not mix.
Since it seems to me that you are willing to upstream a patch for this I'll remove the issue I've opened in my fork and reopen it here.
For the receivers, I'll follow your advice for now and I'll add support for the new headers with a warning if the user is using a non supported configuration. I'll revisit them later to add full support for multichannel if possible.
@tadly Thank you, I didn't see the updated post on Reddit.
I've been able to do some quick remote testing (no hearing only looking with pavumeter) and noticed the following.
Playing the Windows Test Sounds for 5.1:
Also, using the multichannel online test everything goes absolutely haywire (visually in pavumeter) :D
More proper testing this evening :)
Can you check if pavumeter and pavucontrol have the same channels? If you see the same output on two channels in pavumeter it means that the scream-pulse receiver has a different mapping than your soundcard profile and pulse is mixing internally to map the two. In windows, select different profiles until you see that scream-pulse has the same channels that pavumeter shows. If you can, please attach a couple of screenshots of pavumeter, pavucontrol and the windows settings.
Thanks for the multichannel online test, I'll try it tomorrow.
Aha! I did indeed mess up the windows configuration. I did configure 5.1 with side-pair speakers (because that's more or less where they physically are) and expected it to treat them like a rear pair (given that it's named 5.1)
Using the config for rear fixes the issue I was having :)
I'm soooo excited playing some Skyrim this weekend using scream and looking glass!!! :D
Glad to see it's working :) May I ask what desktop environment are you using?
Even created an isolated net so it doesn't have to go through the router, in the hope that I'm not getting random desyncs. Later tests (when I'm actually at home) will show.
Sure you can but you won't like it because it's not really a DE. It's i3 with polybar and rofi (Not on the screenshot) and some scripts / tools I wrote for myself (also not on the screenshot)
Why do you think I'll not like it? I think it's very neat, I'll try a similar setup one day or another.
Back on topic, an isolated net between host and guest is a good solution I think. That's what I'm always using in my VMs and that's what I've used during my tests.
I've started working on scream-alsa. The new protocol is now supported but channel mapping is not as easy as with pulse. I'm also experiencing underruns that get more frequent as the number of channels increases, but I have yet to investigate the cause.
Off topic: My bad. Didn't specify what you're not going to like which is... Those kind of setups have a steep learning curve (if you're new to it) and require a lot of tinkering which you're also never going to stop :D Pure WM setups are usually used by people who want to be as efficient as possible.
On topic: Audible sound tests for both windows test sounds and some web samples are on-point and clear!
Longtime test I might be able to do today/tomorrow. Sorry I couldn't do more testing :/
Wait, the alsa receiver can run through pulse? That... seems wrong. Typically, if someone says "alsa" they mean there's no layer on top so in your case you'd have to make sure the pulse daemon isn't running. In regards to the underruns, did you read the usage section of the README? Maybe that'll help?
I'm glad it's working :) Today I'll be able to test it on a real 5.1 setup, I'm really excited to see if it works well.
I get your point on the alsa receiver but I think it's worth pointing out that when we refer to alsa we refer to two different pieces of software. One is alsa as in the kernel space drivers, the other is alsa as in the user space library (libasound).
The scream-alsa receiver doesn't contain any pulseaudio code, it's linked to libasound.
What's happening here is that on my system pulseaudio provides the "pulse" virtual device for alsa.
All the streams from applications that use alsa (user space) are routed through pulseaudio to be mixed and processed, and then are routed back to alsa (user space), which sends them to alsa (kernel space).
It's explained in better terms here: PulseAudio under the hood
I'll make sure to test it extensively without pulseaudio messing around, but it should not make a difference.
Thank you for pointing me at the README. I have to admit I have not read it haha :) With `-t 100` it's perfect.
I've pushed scream-alsa with multichannel support to my fork. Please use `-v` when testing; it will output the channel map in plain text.
I've also updated scream-pulse but it's just a code cleanup with minor adjustments.
I've tested both receivers for a few hours each and the results are good. I'm experiencing problems at 192kHz 32bit 8 channels; all the other configurations seem to work, at least for me. I'll have to look into it, maybe it's just the buffer size that needs to be adjusted, but I want to finish the other receivers first.
I've patched and pushed scream-raw too.
@martinellimarco : When you're done with the code, please send a PR. I can take care of the docs if you want. Thanks for your work!
I will :) I'm looking into the windows receiver now (I've started a couple of minutes ago). I hope I can get it ready soon.
If so I'll continue testing during the weekend and I'll send the PR next week.
Thank you for taking care of the docs, my English is not always the best.
I've patched the windows receiver too to support the new header and multiple channels, but not channel mapping. I can't figure out how to do it with NAudio. In practice, if both windows machines are set with the same speaker positions this doesn't matter.
Maybe we can write it in the doc and someone with more knowledge about NAudio can patch it later?
While I was at it I've also fixed a trivial null pointer exception.
Another user on Reddit, yestaes, reported positive results with minor problems only at high rates (96kHz, 32bit, 6 channels)
I've increased the number of chunks in the driver and now the occasional hiccups are gone, except sometimes at 192kHz, 32bit, 8 channels.
That's ~50 Mbps, and from the windows task manager I can see that the speed dropped at the same time that I had high disk usage. This particular VM is on a mechanical hard drive, it's not the fastest thing in the world. I think I'm just hitting the limit of my particular system.
Anyway, I've uploaded a new version of the driver here for @tadly or others to test.
Soon I'll send the PR.
Impressive effort, thanks Marco! Greetings from Germany, /tom
Thanks to you Tom, your project was literally a game changer for those of us that use KVM. Contributing to it was the minimum that I could do. Greetings from Italy! :)
Oh good god this went faster than I was expecting. Sorry for going quiet the last few days.
As is obvious also I can only report good things so far.
@martinellimarco I just wanted to thank you again from the bottom of my heart for all you've done. This whole thing went by far better than anything I could've ever hoped for!
Thank you @tadly :) Next step, IVSHMEM support. I've started yesterday and I have the general idea of what I have to do, but I still have a lot of details to figure out. Unfortunately in the next weeks I'll be very busy with my work and I don't expect to have much time for this, but I'll work on it. I'll report any update in #36 , I hope you'll have time to test this too :)
I'm not sure if this would be possible but asking doesn't hurt I guess/hope.
I'm (again) searching for a way to transfer 5.1 audio from windows to linux (as close to real-time as possible). As this is a VM Guest <-> Host setup, I should be able to bring network latency down quite a bit fingers crossed
Given that scream just dumps raw PCM data, and PCM can hold more than 2 channels it should be possible right? It would be just a matter of the virtual device driver to support configuration of more than 2 channels.
I'm obviously just guessing though :)