Closed SignalWhisperer closed 4 years ago
@TehWan many thanks for this and the PR!
@IgnasJarusevicius, could you please review and merge etc.
Agreed,
A fine catch.
Simon Brown, G4ELI
From: Andrew Back notifications@github.com Sent: 04 December 2019 09:46 To: myriadrf/LimeSuite LimeSuite@noreply.github.com Cc: Subscribed subscribed@noreply.github.com Subject: Re: [myriadrf/LimeSuite] Repeating Gateware version mismatch problem (#287)
Hi, I do not have a machine that can reproduce this problem, so I will wait for the results of your further investigation. Anyway, looking at the PR, I would prefer not to change LMS64CProtocol. If it is an issue with FX3, then the second read (retry) could be done in ConnectionFX3::Read(). I also don't see how calling ParsePacket() when the status is bad could cause problems, as the final result (e.g. the result of ReadRegisters) will still contain invalid data (as I see it, calling ParsePacket() can only make it partially valid in some cases). Also, it is strange that convertStatus() passes when garbage data is returned, as it should give an error if the status byte in the returned buffer is not 1 (maybe that data is not completely garbage/random).
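To illustrate the last point, here is a minimal sketch of a status-byte check and why stale data might still pass it. The packet layout (status at byte index 1, value 1 meaning "completed") and the name `statusByteIsOk` are assumptions made for this example, not taken from the LimeSuite sources.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed layout, for illustration only: the status lives at byte index 1
// and the value 1 means "command completed".
constexpr std::size_t kStatusIndex = 1;
constexpr std::uint8_t kStatusCompleted = 1;

// Hypothetical helper: accepts a reply only if it is long enough to hold
// a status byte and that byte reports completion.
bool statusByteIsOk(const std::uint8_t* packet, std::size_t size)
{
    return size > kStatusIndex && packet[kStatusIndex] == kStatusCompleted;
}
```

A zeroed buffer fails this check, but a buffer that still holds an earlier valid reply passes it, which would be consistent with the data not being entirely random.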
Hi, Good points. I'll make the changes you mentioned.
Interesting, I've also been experiencing this problem but haven't had the time to look into it. It's good to know that this issue may be specific to the USB host controller. Which ones appear to be problematic? I encounter this on a 32-bit ARM board, the ODROID-XU4.
I have an AMD USB 3.0 host controller, as reported by `lspci|grep USB`.
Out of curiosity, I tested it with a USB 2 port and don't get the issue at all. It seems to be isolated to USB 3, at least on my machine.
Edit: My laptop, which does not have the issue, only has USB 2.
I just tried with the latest Linux kernel (5.4.2). The issue is still there, but the garbage data is gone, zeroed out. Definitely a bug in the kernel.
@IgnasJarusevicius, do you still want to patch this for the people out there affected by the kernel bug? If so, I'll make the changes you mentioned and bring the fix only in ConnectionFX3.
@TehWan, @IgnasJarusevicius: I'm facing the same problem (on 2 AMD PCs) and I'm not sure I understand the status of the issue. Is the patch of Dec 4 enough to fix it? Could you explain? Thank you!
@OhSoGood The patch was merged in the master branch. If you try building the latest version, do you still get the same problem?
Indeed it does. Thanks a lot! @IgnasJarusevicius: could Lime Microsystems provide a daily/master build somewhere, e.g. on GitHub, myriadrf or your own website? Self-compiling is not always practical, or even doable, depending on the context.
@TehWan , Hi,
For me this issue is still present with my old i5 2500K based PC and onboard ASMedia Technology Inc. ASM1042 SuperSpeed USB Host Controller (for USB 3.0).
Is there any chance it will be fixed soon, or am I better off switching to another USB controller or PC hardware?
P.S. Version information:
Library version: v20.01.0-gc931854e
Build timestamp: 2020-04-17
Interface version: v2020.1.0
Binary interface: 20.01-1
I'm reporting this because I finally found the fix for this issue: https://discourse.myriadrf.org/t/repeating-gateware-version-mismatch-problem/1339
After a few hours of debugging and reading code, I found the culprit and the solution for this long-standing issue, which persists today. I'm still working on the patch, which I will submit as a pull request shortly.
Here is the problem some of us face: every other time you open the LimeSDR device, it will fail to obtain the FPGA information (Gateware version) and will say it has version 0. It's systematic on my Linux machine, every second invocation, yet on Windows I don't seem to have this issue.
The culprit is a bug in libusb for which I have not yet found a patch, where if a call to `libusb_bulk_transfer()` times out, the value of transferred bytes is set to the length requested, and the buffer is filled with bytes from memory (see https://github.com/libusb/libusb/issues/659). However, there are also bugs in the `ConnectionFX3` and the `LMS64CProtocol` classes.

In `ConnectionFX3::Read()` (and consequently in `ConnectionFX3::Write()`), the return value of `libusb_bulk_transfer()` is not checked. In the case of a timeout, `len` is set to `actual`, but `actual` is wrongly set to the requested `length`. The fix is to check that the return value of `libusb_bulk_transfer()` is 0 before setting the length. There is a possibility of partial transfers, but at this point it's probably better to just discard this one and try again.

In `LMS64CProtocol::TransferPacket()`, considering the issue in libusb, the call to `Read()` (here `ConnectionFX3::Read()`) causes the packet to be filled with junk, hence the Gateware version errors. Also, the call to `ParsePacket()` is always made without regard to the `status`. Even with a longer timeout, the call to read the device returns junk nonetheless (a different issue, but on the FX3 side). This is fixed by adding a second call to `Read()` when an error occurs on the first one, and proper error handling, discarding the packet if an error occurred (not calling `ParsePacket()`).
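
The two fixes described above can be sketched roughly as follows. This is not the actual patch: `checkedRead`, `readPacketWithRetry` and `BulkTransferFn` are hypothetical names, and the transfer function is a stand-in for `libusb_bulk_transfer()` so that the sketch builds without libusb. The only value borrowed from libusb is the timeout error code, -7 (`LIBUSB_ERROR_TIMEOUT`).

```cpp
#include <cassert>
#include <cstring>
#include <functional>

// Same numeric value as libusb's LIBUSB_ERROR_TIMEOUT, defined locally
// so this sketch does not need the libusb headers.
constexpr int kErrTimeout = -7;

// Stand-in for libusb_bulk_transfer(): fills `data`, reports the byte count
// through `transferred`, and returns 0 on success or a negative error code.
using BulkTransferFn =
    std::function<int(unsigned char* data, int length, int* transferred)>;

// Hypothetical parser callback, playing the role of ParsePacket().
using ParseFn = std::function<void(const unsigned char* data, int length)>;

// Checked read: the reported byte count is trusted only when the
// transfer itself returned 0. Returns the byte count, or -1 on error.
int checkedRead(const BulkTransferFn& transfer, unsigned char* buffer, int length)
{
    int actual = 0;
    const int rc = transfer(buffer, length, &actual);
    if (rc != 0)
        return -1; // ignore both `actual` and the buffer contents
    return actual;
}

// Read one packet with a single retry. Short or failed reads are discarded,
// and the parser is called only after a complete, successful read.
bool readPacketWithRetry(const BulkTransferFn& transfer, unsigned char* buffer,
                         int length, const ParseFn& parse)
{
    for (int attempt = 0; attempt < 2; ++attempt) {
        const int got = checkedRead(transfer, buffer, length);
        if (got == length) {
            parse(buffer, got);
            return true;
        }
    }
    return false; // both attempts failed; the packet is dropped unparsed
}
```

With a transfer function that times out once (while still reporting the full length and leaving junk in the buffer) and then succeeds, `readPacketWithRetry` calls the parser exactly once, on the second buffer. Treating a partial transfer as a failure follows the "discard and try again" choice above rather than trying to resume the transfer.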