dotnet / iot

This repo includes .NET Core implementations for various IoT boards, chips, displays and PCBs.
MIT License

Library performance is significantly different between drivers/operating systems #1004

Closed pgrawehr closed 3 years ago

pgrawehr commented 4 years ago

Describe the bug

This came up while testing #914. The Windows tests appear to be generally significantly slower than the Linux tests, and some operations are a lot slower than they should be. For instance, the relatively simple PinValueReadAndWrite test takes between 1 ms and 12 ms (depending on the driver) on Linux and 23 ms on Windows. It should not take more than 300 cycles, so 1 ms is acceptable; everything else probably is not. The FastInterruptHandling test even takes 3800 ms on Windows, while it takes the expected 2100 ms on Linux (all drivers).
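To make the comparison above concrete, a per-call latency measurement could look roughly like the sketch below. It is a language-agnostic illustration written in Python, not the repo's test code: `FakePin`, `measure_latency_ms` and `within_budget` are hypothetical names, and the in-memory pin stands in for a real GPIO write followed by a read. The 1 ms budget is the threshold called acceptable in the description.

```python
import statistics
import time


def measure_latency_ms(operation, iterations=1000):
    """Call `operation` repeatedly; return (median, worst) per-call latency in ms."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        operation()
        samples.append((time.perf_counter_ns() - start) / 1e6)
    return statistics.median(samples), max(samples)


def within_budget(median_ms, budget_ms=1.0):
    """True if the median per-call latency stays within the budget (1 ms by default)."""
    return median_ms <= budget_ms


class FakePin:
    """Hypothetical in-memory stand-in for a GPIO pin; no hardware is touched."""

    def __init__(self):
        self.value = 0

    def write_then_read(self):
        # Toggle the stored value ("write") and return it ("read").
        self.value ^= 1
        return self.value


if __name__ == "__main__":
    pin = FakePin()
    median_ms, worst_ms = measure_latency_ms(pin.write_then_read)
    print(f"median={median_ms:.4f} ms, worst={worst_ms:.4f} ms, "
          f"ok={within_budget(median_ms)}")
```

Reporting the median next to the worst case helps separate a consistently slow driver from occasional scheduling spikes, which a single wall-clock test duration cannot distinguish.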

Steps to reproduce

Look at the build logs (two attached)

Expected behavior

Actual behavior

See description. Timing is greatly dependent on OS and driver.

Versions used

The comparison was done with the test results from the Helix pipeline.

Note: The attached files are XML, but GitHub doesn't allow attaching .xml files directly, hence the .txt extension.

testResults - Windows.xml.txt testResults - Linux.xml.txt

joperezr commented 4 years ago

Thanks for raising this, @pgrawehr. We should do some profiling to figure out where the perf differences are. It is worth pointing out, though, that the Windows implementation is completely different from the Linux one. In fact, this repo mainly contains only the Linux implementation; the Windows implementation pretty much P/Invokes into WinRT calls that use OS drivers to manage GPIO, SPI and I2C.

pgrawehr commented 4 years ago

I'm completely aware of this, and I know that P/Invoke calls may be expensive, but something on the order of 10 ms per call (a rough estimate) would be really bad.

If the investigation shows that this is due to some OS limitation, I'm fine with that (maybe document it), but at this time, it looks like something is fishy. I have seen quite a few questionable Sleep() calls while investigating the interrupt performance issues.

joperezr commented 4 years ago

Totally agree. I have put this in our backlog for vNext as something to profile and investigate, to see if there are any areas where we can do a better job on Windows.

Ellerbach commented 4 years ago

@pgrawehr if you are looking for something on a regular Windows (not IoT), a normal 32/64-bit version, you may want to look at using an FT4222, see https://github.com/dotnet/iot/tree/master/src/devices/Ft4222. Performance seems fine. There are a couple of limitations because it works over USB, but it really works well. Support for macOS is coming as well. There was a bug in the FTDI binaries which is fixed now; I'm just waiting for them to publish the fix before updating the documentation. On the limitation side, if you're interested in GPIO: all sensors in the repo detect the driver automatically. As the FT4222 is not detected automatically, you'll have to adjust the source code of the sensor you're interested in to pass the GPIO driver at construction. Feedback is welcome if you're interested in this approach as well.

pgrawehr commented 4 years ago

@Ellerbach I don't currently need that; I just observed this while checking the run times of the different builds that currently run on Helix. It is a fact that we have some API inconsistencies regarding the lifetime of objects (especially GpioController instances). That part is being addressed (hopefully) with #878. I'm a bit busy now, but I intend to write a draft for that (maybe I'll be forced to stay home soon anyway).

krwq commented 4 years ago

Sounds to me like the action here is to simply document the differences, correct?

pgrawehr commented 4 years ago

Possibly, yes. Since we know, for instance, that DHT sensors don't directly work on Windows (because switching the pin mode is too slow), this could be related. The question would be whether the low-level Windows GPIO driver could be improved.
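As a rough illustration of why a slow pin mode switch would break a DHT read, the sketch below estimates how many of the 40 data bits could already have gone by when the pin finally becomes an input. The timings are assumptions taken from commonly published DHT22 datasheet values, not measurements from this repo, and `bits_lost` is a hypothetical helper. The 10 ms input reuses the rough per-call estimate mentioned earlier in this thread purely as an example.

```python
import math

# Approximate DHT22 timings in microseconds (assumed datasheet values).
RESPONSE_DELAY_US = 20   # earliest sensor reaction after the host releases the line
RESPONSE_LOW_US = 80     # sensor pulls the line low
RESPONSE_HIGH_US = 80    # sensor pulls the line high
BIT_LOW_US = 50          # low phase that precedes every data bit
BIT_HIGH_ZERO_US = 26    # shortest high phase (a "0" bit)
TOTAL_BITS = 40


def bits_lost(switch_latency_us):
    """Worst-case number of data bits missed if the switch from output to
    input takes `switch_latency_us` microseconds."""
    preamble_us = RESPONSE_DELAY_US + RESPONSE_LOW_US + RESPONSE_HIGH_US
    if switch_latency_us <= preamble_us:
        return 0
    # Worst case: all bits are zeros, so each bit is as short as possible.
    shortest_bit_us = BIT_LOW_US + BIT_HIGH_ZERO_US
    elapsed_us = switch_latency_us - preamble_us
    # A bit that has only partly elapsed can no longer be decoded either.
    return min(TOTAL_BITS, math.ceil(elapsed_us / shortest_bit_us))


if __name__ == "__main__":
    for latency_us in (100, 1_000, 10_000):
        print(f"{latency_us} us switch latency: up to {bits_lost(latency_us)} bits lost")
```

Under these assumptions, a switch that takes more than a few hundred microseconds already loses data bits, and a switch in the millisecond range loses most or all of the frame.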

Ellerbach commented 3 years ago

[Triage] Closing as there is nothing we can improve on the managed code side.