NordicSemiconductor / pc-nrfconnect-ppk

Power Profiler app for nRF Connect for Desktop

Time scale way off #318

Open sheggy012 opened 1 year ago

sheggy012 commented 1 year ago

Bug #141 is still present. See https://devzone.nordicsemi.com/f/nordic-q-a/95909/ppk2-and-power-profiler-desktop-time-scale-way-off/418883 . Makes it a pain to work with.

aadnekar commented 1 year ago

Hello @sheggy012, I can see from the original post on DevZone that the problem seemed to persist when connecting directly to the computer, but it seems to work when the PPK2 is connected through a hub. Have you been able to test with a USB hub, and if so, does the issue still persist? That is very unfortunate, and I will bring it up with the team after Easter to see if there's anything we can do about it. Thanks for reporting.

sheggy012 commented 1 year ago

Hi aadnekar, yes exactly, as long as you put a USB hub in between, the time scale works. So there is at least a working workaround. Thank you for bringing it up. Maybe it's possible to solve it for direct connection.

By the time I realized that the time scale was wrong, far too much time had passed. I had already started to exchange the oscillators on my PCBs because I suspected a fault there. 😄

aadnekar commented 1 year ago

I'm sorry to hear that @sheggy012, I will definitely investigate whether we can fix this for direct connection. I will update this issue when I get more information.

RenanSP13 commented 1 year ago

I have the very same issue: the recorded time is only 32% of real time. I tried multiple USB ports (2.0 / 3.1), cables, OS (Windows 10 and Linux Pop!_OS 22.04) and a USB hub. The best combination still didn't go over 34%. This is really messing with my calculations.

My motherboard is a "MSI - AMD AM4 mATX B350M gaming pro" (chipset = B350M). I will see if there are BIOS updates available and report back.

Reporting back: I updated to the latest BIOS (2023-05-23-beta) and the problem persists.

RenanSP13 commented 1 year ago

But after tweaking many options in the BIOS (most of which I don't understand), it improved from 32% to 48% of real time being captured. That also slightly reduced the ringing that the PPK2 was displaying on both the current graph and the LA graph (the signal itself was fine, as I probed it with an oscilloscope right at the PPK2 PCB). Still, the LA was drawing the wrong duty cycle, and the ammeter was losing samples and hallucinating ringing.

Finally, I tried a live Linux USB drive ( Pop!_OS 22.04 ) on a borrowed INTEL laptop and connected PPK2 to it without disturbing the test circuit. All the problems disappeared.

As the two pictures show, apparently the only thing the PPK2 currently gets right on AMD systems is the average current. Everything else depends on luck with the chipset.

[Images: BUG_AMD_after_BIOS_update, BUG_Intel_fine]

chrpoe commented 1 year ago

Hi there, I've got the same issue: Time is 3x too slow.

I also updated my BIOS (UEFI) to newest version -> bug persists. Then updated Chipset Drivers -> bug persists.

I have a B450 Gaming-ITX/ac:

- Processor: AMD Ryzen 3 3200G with Radeon Vega Graphics, 3600 MHz, 4 Core(s), 4 Logical Processor(s)
- BIOS Version/Date: American Megatrends Inc. P5.20, 01/11/2022
- OS Name: Microsoft Windows 10 Pro
- Version: 10.0.19045 Build 19045

The following is just speculation:

To be honest, I'm unsure why this hardware would or should matter. Shouldn't the time scale be derived from the system clock of the host computer on which the Power Profiler is running?

It seems to be somehow derived from the USB bus speed - if I get the "workarounds" and comments in the forum right. That sounds a bit strange to me.

Or does the PPK2 hardware itself attach the timestamp to its measured signals, and hence the PPK2 board derives its own clock speed or something from USB and not from some onboard quartz?

wlgrd commented 1 year ago

The PPK2 always samples at 100 kHz no matter what, using its own clock sources, which are highly accurate and independent of whatever is connected to it. It also includes a counter in the payload so that the software can use this as ticks for timestamping. If timing were derived from the computer and you had jitter on your USB, then you would have a problem due to that instead (getting wrong timestamps on the data instead of "slow" timestamping).
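To illustrate why lost samples would make the time axis run "slow" rather than jittery, here is a minimal sketch. It assumes, hypothetically, that the host derives time by counting received samples at 10 µs each, and that each sample carries a small wrapping counter. The 8-bit counter width and all function names are illustrative assumptions, not the actual PPK2 payload format or the app's code.

```python
SAMPLE_RATE_HZ = 100_000   # stated in the thread: the PPK2 samples at 100 kHz
COUNTER_MODULUS = 256      # ASSUMPTION: an 8-bit wrapping counter, for illustration only


def count_missing(counters, modulus=COUNTER_MODULUS):
    """Count samples missing between consecutive received counter values.

    A wrapping counter cannot reveal gaps of a full modulus or more,
    so the result is a lower bound on the real loss.
    """
    missing = 0
    for prev, cur in zip(counters, counters[1:]):
        # 0 when cur directly follows prev, otherwise the size of the gap.
        missing += (cur - prev - 1) % modulus
    return missing


def naive_duration_s(counters):
    """Duration if every received sample is assumed to follow 10 µs after the last."""
    return len(counters) / SAMPLE_RATE_HZ


def corrected_duration_s(counters, modulus=COUNTER_MODULUS):
    """Duration after adding back the samples the counter shows were lost."""
    return (len(counters) + count_missing(counters, modulus)) / SAMPLE_RATE_HZ


if __name__ == "__main__":
    received = [0, 1, 4, 5]  # counters 2 and 3 never arrived
    print(count_missing(received))
    print(naive_duration_s(received), corrected_duration_s(received))
```

Under these assumptions, the naive duration shrinks in proportion to the fraction of samples lost, which would match the "recorded time is only a fraction of real time" symptom described in this thread.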

As this only happens to certain AMD chipsets, and not when using the same chipsets and a USB 3 hub or an Intel chipset, this is quite challenging to figure out and something we are working continuously on. The PPK2 uses bulk transfers, and it seems that AMD has implemented this differently for some chipsets than other vendors. I have also not seen this reported on any other chipsets than their B-series so that might be a factor.

Since the PPK2 uses bulk transfers (the same as USB hard drives), the assumption that the computer is "working fine with other USB devices" might not be correct, as you wouldn't notice or care about USB transfer speed stability from a USB stick. We are working on this.

If you zoom in on the chart, do you see any missing data points?

@RenanSP13 You mention "But after tweaking many options on BIOS"; which tweaks exactly?

chrpoe commented 1 year ago

Thanks for working on this! I understand it's not that straightforward.

I don't know if this helps, but I checked different USB ports and some of them work correctly.

This is the graphic from my mainboard manual (https://www.asrock.com/mb/AMD/Fatal1ty%20B450%20Gaming-ITXac/index.de.asp#Manual): [image]

Wrong time scale on connection via these ports:

- 1: Fatal1ty Mouse Port (USB_1)
- 2: USB 2.0 Port (USB_2)
- 3: USB 3.1 Gen2 Type-A Port (USB31_TA_1)

Correct time scale on connection via these ports:

- 12: USB 3.1 Gen1 Ports (USB3_12)

RenanSP13 commented 1 year ago

I appreciate very much that Nordic is searching for a fix.

@wlgrd If you zoom in on the chart, do you see any missing data points?

Yes, about 52% of the points are missing. The waveforms are completely misrepresented in a very unpredictable way: a few wave cycles are uninterruptedly captured, a few cycles are completely skipped, and most cycles fall between these two extremes. The only correct measurement is the average current over long sampling sessions (>20 min).

@wlgrd which tweaks exactly?

I changed so many things (PCI speeds, CPU power profile, etc.) that I can't point out the individual contributions, but I am sure that updating to the latest BIOS (which in my case is 7A39v2P5 BETA) brought most of the improvement (from 32% of real time to 48% of real time).

That being said, I am not convinced that PPK2's performance is being limited exclusively by the motherboard. The reason is that I measured the performances of many generic USB 2.0 devices on all USB ports of my PC, and they all appear to outperform (regarding bandwidth) the PPK2 by orders of magnitude. For example, below, I compare the input bandwidths of:

[Image 1] [Image 2]

RenanSP13 commented 1 year ago

Also, it appears that USB cable length plays no significant role. When plugged directly into the motherboard, I saw no measurable difference when moving from a 150 cm cable to a 20 cm cable.

But, just like @chrpoe , when switching USB ports, I saw significant changes in the data drop rate measured with chronometers (followed by equal changes in the bandwidth measured with the "sudo usbtop" command). Unfortunately, I could not get any result better than 48%, even on USB 3.1 Gen 1.

wlgrd commented 1 year ago

Unfortunately, I could not get any result better than 48%, even on USB 3.1 Gen 1.

Oof, that was not what I wanted to hear. @chrpoe gave me a small hope there ;D We really appreciate the effort you put into giving proper feedback, this is very valuable. If we cannot come up with a consistent error description and a working solution, we will need to add some notice about this.

Can you give me the full description of your CPU/Chipset/Motherboard @RenanSP13 ?

I have an AMD system at home and have gotten several people to test on their AMD systems; reports are that it is bad, but after a chipset update, it is working fine. Same on my own setup, and zero reports on Intel hardware. I will try to escalate this and collect more data points.

wlgrd commented 1 year ago

Also @RenanSP13 ; what did you use to load USB traffic from the PPK2 while measuring the USB speed?

RenanSP13 commented 1 year ago

I understand and appreciate the effort.

@wlgrd Can you give me the full description of your CPU/Chipset/Motherboard?

@wlgrd what did you use to load USB traffic from the PPK2 while measuring the USB speed?

While monitoring the output of the "sudo usbtop" command, I simply asked the PPK2 (in "source meter" mode) to "Start" sampling. By doing that, the measurement goes from 0 kb/s (idle) to about 283 kb/s (AMD) or about 590 kb/s (INTEL). By the way, 590 kb/s is a surprisingly low bitrate, despite the fact that the PPK2 was performing just fine. Perhaps the "sudo usbtop" command is misreading something, even though USB bulk transfers shouldn't be too challenging to measure. If you need me to test different scenarios for you, just say the word.

wlgrd commented 1 year ago

Thank you very much.

It's not surprisingly low, as the PPK2 doesn't actually transmit a lot of data. It samples at 100 kS/s, and every sample is 32 bits (4 bytes). That's ~400 kB/s plus some overhead. Btw, the creator of usbtop says the kb/s displayed is actually KiB/s.
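As a quick check of the arithmetic above, the following sketch computes the payload rate from the sample rate and sample size, and compares the two usbtop readings reported earlier in this thread (about 283 on AMD and 590 on Intel, taken here as KiB/s). The variable names are illustrative only.

```python
SAMPLE_RATE = 100_000      # samples per second (100 kS/s)
BYTES_PER_SAMPLE = 4       # 32-bit samples

# Raw payload rate, before any USB or protocol overhead.
payload_bytes_per_s = SAMPLE_RATE * BYTES_PER_SAMPLE
payload_kb_per_s = payload_bytes_per_s / 1000    # decimal kilobytes
payload_kib_per_s = payload_bytes_per_s / 1024   # binary kibibytes, as usbtop shows

# usbtop readings quoted in the thread.
amd_kib_per_s = 283
intel_kib_per_s = 590
amd_to_intel = amd_kib_per_s / intel_kib_per_s

if __name__ == "__main__":
    print(f"payload: {payload_kb_per_s:.1f} kB/s = {payload_kib_per_s:.3f} KiB/s")
    print(f"AMD / Intel throughput ratio: {amd_to_intel:.2f}")
```

The ratio comes out at roughly 0.48, which lines up with the "48% of real time" figure reported above for the AMD system.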

RenanSP13 commented 1 year ago

I didn't know that about usbtop's notation. It makes more sense now.

Manawyrm commented 2 months ago

Also affected on ASUS PRIME X470-PRO, BIOS 6223 (03/19/2024), Linux 6.10.5.

This did cause a weird (display) side effect, which I've reported over here: https://github.com/NordicSemiconductor/pc-nrfconnect-ppk/issues/476

Inserting a USB hub did fix the issue.

RenanSP13 commented 2 months ago

Inserting a USB hub did fix the issue.

@Manawyrm, could you please share the model of the USB hub? I tried one I already had to no avail.

Manawyrm commented 2 months ago

@Manawyrm, could you please share the model of the USB hub?

https://www.amazon.com/AmazonBasics-Port-2-5A-power-adapter/dp/B00DQFGH80

It reports on Linux/lsusb as:

Bus 005 Device 002: ID 2109:2811 VIA Labs, Inc. Hub

in other words: this is probably a VIA VL813-based hub.

RenanSP13 commented 2 months ago

Thanks.

I had no luck with a hub that reports on Linux/lsusb as:

Bus 003 Device 040: ID 03eb:0902 Atmel Corp. 4-Port Hub
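For anyone collecting hub reports like the two in this thread, here is a small sketch that pulls the vendor and product IDs out of `lsusb` lines so they can be compared side by side. The function name and the regular expression are illustrative, and the sketch only handles the one-line format quoted here.

```python
import re

# Matches lines such as:
#   Bus 005 Device 002: ID 2109:2811 VIA Labs, Inc. Hub
LSUSB_LINE = re.compile(
    r"Bus (?P<bus>\d+) Device (?P<device>\d+): "
    r"ID (?P<vendor>[0-9a-fA-F]{4}):(?P<product>[0-9a-fA-F]{4})\s*(?P<name>.*)"
)


def parse_lsusb_line(line):
    """Return a dict with bus, device, vendor, product and name, or None if no match."""
    match = LSUSB_LINE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["name"] = fields["name"].strip()
    return fields


if __name__ == "__main__":
    reports = [
        "Bus 005 Device 002: ID 2109:2811 VIA Labs, Inc. Hub",    # hub that helped
        "Bus 003 Device 040: ID 03eb:0902 Atmel Corp. 4-Port Hub",  # hub that did not
    ]
    for report in reports:
        info = parse_lsusb_line(report)
        print(info["vendor"], info["product"], info["name"])
```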