xyphro / UsbGpib

Versatile, cheap and portable USB to GPIB converter (USBTMC class based)
MIT License
288 stars 51 forks source link

Reading status byte interrupts writing/reading. #36

Open mariusCZ opened 1 year ago

mariusCZ commented 1 year ago

Hi,

Sorry for opening a second issue during the same day, but I'm facing another weird issue when using Labview (this time natively on Linux). Basically, if I write data and right afterwards read the status byte, the data writing gets interrupted and does not continue.

Now this may be an issue with Labview, so I will try repeating the same with pyVISA and see if I get the same thing.

mariusCZ commented 1 year ago

The problem persists with pyvisa too. Here's an example that causes the issue:

import pyvisa as pv

rm = pv.ResourceManager()
lockAmp = rm.open_resource('USB0::0x03EB::0x2065::+00000_+00000::INSTR')

lockAmp.write("ID")
print(lockAmp.read_stb())
print(lockAmp.read())

For reference, I am interfacing the Signal Recovery 7265 DSP Lock-In Amplifier. The lock-in amp also has a communication test menu, which shows what data is received and read from, and if I run this program, only part of the data is written (like only "I" or "ID" without the CR and LF). This thing is fairly old so there may also be some sort of incompatibility with it.

EDIT: Looking at the code, I see there's a hardcoded timeout within the status byte reading. Assuming (not sure, haven't investigated the code in detail) you are using an interrupt to handle the status byte read request, the long execution time of the interrupt may be screwing with the data writing. However, this is just my guess. What is the reason for the timeout in the status byte reading?

EDIT 2: I see now that there is no delay with the status byte read. However, it confuses me then what is causing the data write fail to finish.

EDIT 3: I noticed that status byte reading writes also to GPIB. Once again, assuming it uses interrupts, could it be that the stb read replaces the data in the registers that was supposed to be written?

xyphro commented 1 year ago

Hmm, also unexpected. In case it mqnifests to be a real issue and you want to share that sequence and let me test here you can also mail a screenshot of it to xyphro at gmail.com (not sure if pictures can be shared here)

mariusCZ @.***> schrieb am Di., 25. Juli 2023, 16:25:

Hi,

Sorry for opening a second issue during the same day, but I'm facing another weird issue when using Labview (this time natively on Linux). Basically, if I write data and right afterwards read the status byte, the data writing gets interrupted and does not continue.

Now this may be an issue with Labview, so I will try repeating the same with pyVISA and see if I get the same thing.

— Reply to this email directly, view it on GitHub https://github.com/xyphro/UsbGpib/issues/36, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABFXEGAFTIFC7DDHQKXNBJ3XR7JLZANCNFSM6AAAAAA2XFBIM4 . You are receiving this because you are subscribed to this thread.Message ID: @.***>

mariusCZ commented 1 year ago

I have sent the images. Would it perhaps be a good idea to disable interrupts during gpib writing?

xyphro commented 1 year ago

Stb is if I remember correctly not using interrupts. Confidence is on that part only 80%, it is a long time ago.

mariusCZ @.***> schrieb am Di., 25. Juli 2023, 17:19:

I have sent the images. Would it perhaps be a good idea to disable interrupts during gpib writing?

— Reply to this email directly, view it on GitHub https://github.com/xyphro/UsbGpib/issues/36#issuecomment-1650045155, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABFXEGHBBIC64GWCP4IKWXLXR7PYXANCNFSM6AAAAAA2XFBIM4 . You are receiving this because you commented.Message ID: @.***>

mariusCZ commented 1 year ago

STB itself doesn't look like it uses interrupts indeed, but I am now looking at the LUFA library and it looks like maybe the events use interrupts? In which case disabling interrupts may result in some other problems with USB possibly.

EDIT: here is (I think) the code related to calling the events. The ISR at the bottom contains the USB_Device_ProcessControlRequest(), so the problem may indeed be related to interrupts.

xyphro commented 1 year ago

Short update: I confirm this bug. The issue is not directly related to the interrupts, but it is a synchronization issue.

Readstb uses control transfers while normal writes are using bulk transfers.

2 error conditions are identifed with same failure mode and 2nd level cause:

1st: it can happen as function of how things are scheduled by the usb host and visa driver that the control transfer gets executed while a gpib write is handled. As readstb executes also gpib transfer, it can happen that it disturbs the normal gpib transfer. This behaviour can even vary depending on usb host controler type.

2nd: when writing to gpib data that span across multiple bulk packets it surely hapoens that the control transfer is executed during 2 bulk transactions, ibdependent of usb host scheduling. This behaviour is independent of host controller type, but more driven by gpib device latency and speed (or shorter: timing).

Solution is under development: the readstb control transfer will get properly synchronized within the firmware.

I planned an update the firnware today, but decided to analyze it a bit deeper, which was good, cause I found this second failure mode.

Well spotted, thanks for reporting and stay tuned for an update!

mariusCZ @.***> schrieb am Di., 25. Juli 2023, 18:37:

STB itself doesn't look like it uses interrupts indeed, but I am now looking at the LUFTA library and it looks like maybe the events use interrupts? In which case disabling interrupts may result in some other problems with USB possibly.

— Reply to this email directly, view it on GitHub https://github.com/xyphro/UsbGpib/issues/36#issuecomment-1650172944, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABFXEGHCBZ4WY2GVNCFAT7LXR7Y25ANCNFSM6AAAAAA2XFBIM4 . You are receiving this because you commented.Message ID: @.***>

xyphro commented 1 year ago

Hi!

I made a lot of changes, especially on this part. Will push code this evening (berlin timezone).

Best regards,

Kai

mariusCZ commented 1 year ago

Thank you, I will test it once you push it and if it works I will close the issue!

xyphro commented 1 year ago

I pushed it! Curious what your observations will be!

mariusCZ commented 1 year ago

Unfortunately the same issue persists! I tested it with the same device and the same Python script test case and it still fails. This time around it manages to consistently write two bytes of data before getting interrupted by stb reading.

xyphro commented 1 year ago

Unfortunately the same issue persists! I tested it with the same device and the same Python script test case and it still fails. This time around it manages to consistently write two bytes of data before getting interrupted by stb reading.

I will pick this up again next after solving the hp34001 issue from Chris!

xyphro commented 1 year ago

Hi Mariusz,

I want to focus now on this one again. Do you think it would be possible to do a capture with wireshark of that issue? Ideally based on the latest firmware from 30th October?

Otherwise I will provide you this evening a debug firmware for an experiment adressing a suspicion I have. Basically I'd like to check if it is usb side or gpib related by stubbing out the gpib side read status byte code.

Best regards,

Kai

mariusCZ commented 1 year ago

Hi,

I don't mind doing both. But it'll have to be tomorrow since today is a day off and I am not in the lab.

On Wed, Nov 1, 2023, 06:43 xyphro @.***> wrote:

Hi Mariusz,

I want to focus now on this one again. Do you think it would be possible to do a capture with wireshark of that issue? Ideally based on the latest firmware from 30th October?

Otherwise I will provide you this evening a debug firmware for an experiment adressing a suspicion I have. Basically I'd like to check if it is usb side or gpib related by stubbing out the gpib side read status byte code.

Best regards,

Kai

— Reply to this email directly, view it on GitHub https://github.com/xyphro/UsbGpib/issues/36#issuecomment-1788443636, or unsubscribe https://github.com/notifications/unsubscribe-auth/ARGMQJH6WQZ3PJ4IWSQKSX3YCHORDAVCNFSM6AAAAAA2XFBIM6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTOOBYGQ2DGNRTGY . You are receiving this because you authored the thread.Message ID: @.***>

xyphro commented 1 year ago

I don't mind doing both. But it'll have to be tomorrow since today is a day off and I am not in the lab.

Great! I can also only do something this evening as I first need to work :-) [no vacation in Neterhlands today :-/]

Will then make such a special firmware image and share with u.

xyphro commented 1 year ago

TestAndMeasurement_TestA.zip TestAndMeasurement_TestB.zip

Hi Marius!

Attached are 2 firmware images (in hex and bin format, whatever you require).

TestB is NOT reading over Gpib the status byte, but always hardcoded reporting 0xAB back as status. TestA is basically the current baseline.

My expectation is:

On top it would be good if you could share your testcase, such that I can correlate Visa transactions with USB traffic. You can of course strip the testcase down, I don't need your exact file as you use it for measurement purposes.

The TestB image test will show me where to focus on in further root cause analysis. TestA image will show me the timing behaviour. From that one I can see what happens: Does the ControlIin transfer related to status reading timeout? From the timestamps of the relevant section I will be able to make also conclusions on the internal behavior of the firmware.

Thanks a lot upfront for your support Marius. Hope we nail that one down very soon too!

Best regards,

Kai

mariusCZ commented 1 year ago

Hi Kai,

Thank you for all the help! I will try to test this today, but I may be a bit busy so not sure if I can get it done. If not, I should be able to do it tomorrow.

As for the test case, right now I am just using a very simple Python script, it is like this:

import pyvisa as pv
import time

rm = pv.ResourceManager()
lockAmp = rm.open_resource('USB0::0x03EB::0x2065::+00000_+00000::INSTR')

lockAmp.write("ID")
print(lockAmp.read_stb())
print(lockAmp.read())

As for the actual program, it is not really finished right now as it does not work properly, so I can share it, but I don't think there's much use to it since it's a bit more convoluted and as far as I know, doesn't do anything special that may impact behaviour.

xyphro commented 1 year ago

Thanks a lot Marius. No need to hurry, do it when it fits and does not cause trouble for you!

mariusCZ commented 1 year ago

debug.zip

Hi Kai,

I managed to get it done today. So for Test A you said that the program stops at read. I was not sure if you meant the program stops immediately after reading or whether there should be no reading done at all, so I did both cases for both of the tests. In noread captures there was no command to read from the GPIB and in read captures there was a command to read from the GPIB.

Test A sometimes actually worked with reading. As in, some times it reads STB and manages to read the output from the device, but not always, sometimes it received a timeout error. As for Test B, reading the STB and the output of the device consistently worked. It also returned 171 STB in both cases consistently.

Please let me know if there's anything else that needs to be done or if the captures are not correct.

xyphro commented 1 year ago

Hi Marius,

thanks a lot! I downloaded, can open the logs and analyze now!

xyphro commented 1 year ago

The captures are excellent, Marius!

grafik Here we can see the following: No. 32 is the "Read status byte" request from the PC The Read status byte is correctly replied back in 2 ways:

Up to this point everything is fine.

Then the read transfer is triggered at Packet #36. It is also correctly replied back by the device in #38. In Packet #37 the read phase of the GPIB data is requested from the PC.

This transfer times out. Only on Packet #41 - 2.1seconds after the actual read was triggered it is replied back which is at a point where the stack already decided to unstall the endpoint (Request #39).

So something times out. Due to the fact that the bulk in packet happens, this will not be a USB timeout, but a GPIB timeout of the GPIB read.

Let's look at the same phases of Test B: grafik

The 2 highlighted packets match the highlighted one of the previous picture. Here we see that the bulk in reply is immediately returning. 3ms after the request was made (the returned value is 7265\r\n).

Coming to a conclusion:

For now I am very confident that it is a GPIB side issue and I will dig deeper on that side.

xyphro commented 1 year ago

TestAndMeasurement_PR36_2ndNov_ATNFix.zip

I looked and saw that the ATN timing is not relaxed here (the issue I just solved together with Chris), because there was a forced ATN_HIGH (which I had overlooked when removing those constructs during the previous fixing session).

I am not 100% confident, but it is likely to be an issue for your device, messing up the GPIB side transfers.

Can you test this firmware?

TestAndMeasurement_PR36_2ndNov_ATNFix.zip

This one is like "Test A" or the baseline firmware. So it does the actual status byte readout on GPIB side and has no hardcoded return value.

Edit: no need to capture usb again if all works :-) i cross fingers! Pps: just notice that crossing fingers means here that I lie, so better: "pressing my thumb" :-)

mariusCZ commented 1 year ago

Hi Kai,

I will test this today and let you know how it is. Thank you very much for your continued effort!

mariusCZ commented 1 year ago

testPR36_read.zip

Unfortunately the same issue occurs! I have attached another packet capture. If this is indeed more of a problem related to the device I am using, then we can just close this issue, as this is not exactly related to the adapter.

xyphro commented 1 year ago

Unfortunately the same issue occurs! I have attached another packet capture. If this is indeed more of a problem related to the device I am using, then we can just close this issue, as this is not exactly related to the adapter.

No, I want all devices to work :-)

Let me dig into this further and I will compare NI USB HS 488 waveform and compare it with usbgpib one.

mariusCZ commented 1 year ago

Thank you!

Another thing I could offer is to set up a debug platform if that would be of any help. I know that the 32u4 has a JTAG interface. I have a spare 32u4 laying around and a spare GPIB connector. The only thing is that the JTAG pins in this project are used for I/O, but I can add a 595 shift register for the data output and shift the I/O around to free up the JTAG pins. This would allow a more detailed look at how the interrupts occur, the variables, call stack and etc.

xyphro commented 1 year ago

I compared...

The NI adapter sends:

  1. SPE (serial poll enable)
  2. TF (make the instrument work in Talk mode)
  3. Then it reads
  4. SPD (serial poll disable)
  5. UNT (untalk => make all bus participants stop talking)

What I do has Step 4 and 5 reversed.

I know that the following read fails for you and not the readstb, but it might be possible that your intstrument is not happy getting Untalk and receiving SPD afterwards.

But let me look further.

xyphro commented 1 year ago

TestAndMeasurement_5thNov_reversedUNT_SPD_IN_RSTB.zip

Hi Marius,

let's give it a try. The order described in previous post seems to be pretty standard. Found it back in a few GPIB documents. So it could be, that your (and maybe other) instruments strictly support only that order.

Attached firmware has this order now implemented.

grafik

Best regards,

Kai

xyphro commented 1 year ago

Btw: no usb captures required, as the issue seems to be on gpib side.

I value btw. your idea of jtag debug, but are afraid that it won't help here a lot - I can tell you already where the timeout happens, but not why. Jtag debug to the instrument might help and seeing the state of the jtag statemachine, but that one is closed source.

mariusCZ commented 1 year ago

Hi Kai,

I am sorry to say, but it still does not work. I'm starting to think this is perhaps an issue with the device I am using rather than anything else. Which could be a possibility since the serial port on the device does not work at all, so I guess it wouldn't be surprising something with the GPIB is not working. Perhaps we should close this issue?

xyphro commented 1 year ago

:-( It might be the case, but the only way we'd get confidence is to use another adapter and try out. I'd not close this ticket now. Maybe somebody else has the same issue and we get new insights. In case you have in future any updates, let me know! But for now I am really out of ideas :-(

mariusCZ commented 1 year ago

That's okay! Thank you very much for your continued efforts! I will let you know in case something pops up.

xyphro commented 12 months ago

Hi Marius,

I just made bigger changes (transfer speed increase by factor 7 and 4 :-) with blazing 512KByte/s for writes) and test on all my gpib equipment. When testing on my cmu200 I saw that a read status byte works, but generates a query error (-400). Debugging it right now, but it might relate to what you see.

Best regards,

Kai

xyphro commented 12 months ago

Found the cause and fixed it. However it is based on an untested development firmware and I don't want to "annoy you away" by introducing new issues, so let me test further and provide you afterwards a test fw.

mariusCZ commented 12 months ago

I'm happy to test whether it's untested or tested! I may not be able always able to do it, but I'm more than happy to!

xyphro commented 11 months ago

TestAndMeasurement_Check_readstb.zip

Hi Marius,

sorry - had a busy time. Can you check this once? It fixed ReadStb on CMU200 for me.

Best regards,

Kai

Edit: It contains many other changes, e.g. SRQ is working :-) Also a general speedup is part of it.

mariusCZ commented 11 months ago

Hi,

Sorry for the late reply, was a bit busy! Unfortunately, the issue still persists. I think it's safe to say at this point that the issue is with my device rather than the adapter. Thank you for your continued efforts!

xyphro commented 10 months ago

Damn :-(

I did invest lots of time into extensive testing of readstatusbyte, also exactly mimicing the niusb waveforms. So let's indeed assume for now it is instrument related. I leave this still open for others to jump on it, as a little bit of doubt is left.