Closed sippeangelo closed 3 years ago
Huh. Well that is definitely a problem. So you start scanning, wait, then stop scanning, and it freaks out after a while? Any idea how long that takes on average?
I haven't been able to time it when it starts running away, but I'm estimating it happened after an hour or so of continuous scanning with 5 second delays. However, I created this repro which on loop makes the server memory usage rise very fast from the very start, without ever freeing anything:
>IntifaceCLI.exe --version
Intiface Server, starting up with stdout output.
Intiface CLI (Rust Edition) Version 19.0.0, Commit f6e7a92, Built 2020-12-12T06:40:34.966282200+00:00
using Buttplug.Client;
using Buttplug.Client.Connectors.WebsocketConnector;
using System.Threading.Tasks;
namespace Test
{
class Program
{
static void Main(string[] args){
doit().Wait();
}
static async Task doit()
{
var client = new ButtplugClient("Example Client", new ButtplugWebsocketConnector(new System.Uri("ws://localhost:12345")));
await client.ConnectAsync();
int i = 1;
while (true) {
await client.StartScanningAsync();
System.Console.WriteLine($"Started Scanning ({i++})");
await Task.Delay(500);
await client.StopScanningAsync();
System.Console.WriteLine("Stopped Scanning");
await Task.Delay(500);
}
}
}
}
Thanks for the STR code!
I built a quick version of this in rust and it does the same thing, so this is either in buttplug-rs or btleplug. Once I can identify that I'll move the bug over to the relevant repo.
Now, I wouldn't really recommend a scanning loop like this in the first place (I need to build in "timed" scanners for things that don't advertise like Bluetooth, but that'll be post v1), but even if someone does, this shouldn't leak like this. I'll poke around and see if I can figure out what it is, my best guess is that some UWP windows structure isn't getting freed in btleplug. :|
Welp, I was wrong, it's a bug in buttplug-rs. Knew using Broadcaster would bite me at some point. See the referenced bug for more info.
This is going to be difficult to fix, but obviously needs to happen. I'll see what I can do. I'll leave this bug open until I get this fixed, but I'll be doing most of the reporting at buttplugio/buttplug-rs#226.
Ok, well, replacing this with tokio::sync::broadcaster seems to have done the trick. We go from exponential growth to fairly stable usage, when combined with checking what type of event we're getting.
The thing that seemed to be triggering this in the first place was btleplug's DeviceUpdated event, which is a vestige of rumble that fires on ANY advertisement. Due to the stream conversion and what not in buttplug to turn this async, we were generating tons of clones.
Thanks for looking into this. This sounds like good news!
Now, I wouldn't really recommend a scanning loop like this in the first place (I need to build in "timed" scanners for things that don't advertise like Bluetooth, but that'll be post v1)
Yeah I'm a bit unclear on how the scanning is supposed to be used. There's the "scan finished" event that I expected to be sent after starting a scan once the server decides it was done scanning, judging by the examples, but it never seemed to fire and scanning went on indefinitely. Since I need the connection process to be as automatic as possible, I used this to start a scan and never stop it. But this had a high risk of causing https://github.com/buttplugio/buttplug-rs/issues/215, so I switched over to briefly starting and stopping the scan in a loop. 🤷
Yeah I'm a bit unclear on how the scanning is supposed to be used. There's the "scan finished" event that I expected to be sent after starting a scan once the server decides it was done scanning, judging by the examples, but it never seemed to fire and scanning went on indefinitely.
Yeah this is unfortunately part of the complexity of being a cross platform library. :(
On desktop, "scanning finished" shouldn't really mean much. Scanning should go until you tell it to stop. Right now that's not the case in rust because I got lazy and made things like the xinput scanner only scan once then return, versus doing repeated scans like they did in the old C# library. I need to fix that.
However, there are systems like WebBluetooth where we don't get a say in scanning time. In WebBluetooth, the browser brings up a dialog to handle scanning, during which the webpage loses focus. We then fire "ScanningFinished" after that to let the developer know that scanning is done whether they want it to be or not.
So it's really a thing of usage context. I'm still figuring out exactly how to message this, will probably put something about it in the developer guide.
Ok, managed to clamp memory usage and stop spawning tons of useless tasks in our btleplug handler. Memory and CPU graphs look so much better for this specific instance. I'm gonna try to merge and land this tonight, though I've got a bit more testing to do first just to make sure I'm not gonna break the world when I do it. I think this may end up fixing some other crash issues we've seen lately though, so def excited to get this in. Will close this after I ship a new CLI.
Thanks again for the report, and I'd be interested in hearing more about your VRC work. We're getting a LOT of VRC users on the discord, mostly with XXXHaptics, so I'm curious what alternatives are out there.
Ok, v20 on buttplug-rs 0.11.3 is building now, should hit releases in the next 10 minutes or so. This ran pretty well when looping start/stop scanning in 500ms loops, so I imagine 5s should be fine. I also added extra checking so we don't try to constantly reconnect to devices we've already connected to in previous scanning loops.
Still not sure I can recommend scanning in a loop like that, as you're probably mostly interested in bluetooth, and that should scan until you tell it to stop, but even if you do keep up your current method, it should work now. Definitely feel free to file more issues if you continue to run into other problems. I've got a lot more optimization and cleanup to do in relation to what was involved in this patch, but since I'm not seeing anything quite as dire, I'm gonna do that work post v1 (since v1 is already like 4 months late).
This is wonderful news! Thanks a lot! I can confirm it now looks stable on my end too 👌
The issues I was having with indefinite scanning seemed very similar to https://github.com/buttplugio/buttplug-rs/issues/215, where devices would refuse to reconnect until the server was restarted. I'm not sure how I could have triggered that however, since I never disconnect in the first place, so maybe it's a different unrelated issue. I'll revert back and do some testing later!
Thanks again for the report, and I'd be interested in hearing more about your VRC work. We're getting a LOT of VRC users on the discord, mostly with XXXHaptics, so I'm curious what alternatives are out there.
No probs, thank YOU for this amazing suite of tools! Yeah, I wasn't very impressed with the jank of XXXHaptics so I wrote up my own version for those not afraid of modding their game. It's on Gitlab if you wanna check it out! https://gitlab.com/jacefax/vibegoesbrrr
It's pretty basic so far since I've been trying to iron out a lot of stability issues first, but it's already pretty impressive in how much immersion it adds to VR... erm... interactions. The low latency is very impressive too, compared to fiddling with a mobile app on the side or even XXXHaptics!
A lot of the stability issues I've been getting reports of seems mostly to be related to general Bluetooth and cheap Bluetooth dongle jankiness, which is a sad state of affairs. The dream would be to have an Android version of Intiface to connect to, since phone Bluetooth connections seem decades ahead of anything Windows is capable of!
This is wonderful news! Thanks a lot! I can confirm it now looks stable on my end too 👌
The issues I was having with indefinite scanning seemed very similar to buttplugio/buttplug-rs#215, where devices would refuse to reconnect until the server was restarted. I'm not sure how I could have triggered that however, since I never disconnect in the first place, so maybe it's a different unrelated issue. I'll revert back and do some testing later!
Yeah that's definitely a possibility. There's some really weird caching in scanning to make sure we don't try to reconnect to devices we don't know, or devices we're already connected to, and there's a big possibility for bugs there. Since this is sitting on top of a hardware layer I haven't built mocks for yet, it's also super hard to test, so I'm kinda relying on test-in-production reports to figure this stuff out right now. Getting sentry crash/log submission back into the CLI is high priority, that'll help a ton.
No probs, thank YOU for this amazing suite of tools! Yeah, I wasn't very impressed with the jank of XXXHaptics so I wrote up my own version for those not afraid of modding their game. It's on Gitlab if you wanna check it out! https://gitlab.com/jacefax/vibegoesbrrr
Cool, I'll check it out! Though most of our users are super worried about getting kicked off VRC which is why I think XXXHaptics has the momentum it does. God I hate VRC's locked down system. :|
A lot of the stability issues I've been getting reports of seems mostly to be related to general Bluetooth and cheap Bluetooth dongle jankiness, which is a sad state of affairs. The dream would be to have an Android version of Intiface to connect to, since phone Bluetooth connections seem decades ahead of anything Windows is capable of!
A quick tip you can give your users: Put the bluetooth dongle on a USB extension cable and put it near your body. A lot of people leave the dongle plugged into their machine, which causes line-of-sight interference that can really escalate latency. I usually recommend this dongle too, it's not anything special, I just know it works 'cause I do my tests with it: https://www.amazon.com/Plugable-Bluetooth-Adapter-Raspberry-Compatible/dp/B009ZIILLI
While I'm definitely planning on mobile support (only thing holding that up is getting Bluetooth FFI working on Android, but that is a TALL order thanks to fucking Java), Android connections aren't gonna be a ton better, it's more likely that there's just less occlusion. Radios in Android phones are extremely variable and usually not great either. Windows can usually throw may more power through the radio since it's either not battery driven or driven off much larger batteries.
Basically, no matter what the platform, we're all fighting against the shitty magic of bad radios here. :(
BTW, if you're interested, we've got a discord at https://discord.buttplug.io with a couple of other VRC mod atuhors on it.
It doesn't happen linearly, rather it looks fine with memory usage in the 10's of MB for a long time until it suddenly spirals and grows. It seems to demand a lot of CPU time as well as this is happening.
another instance
I'm not sure if this is what causes it to happen, but I run scanning in a 5 second loop like this from the C# library, and if I increase the timeout to 60 seconds instead of 5 it doesn't seem to spiral out of control. Either that or the effect is just delayed by 12 times so I haven't had a chance to experience it...