Fazecast / jSerialComm

Platform-independent serial port access for Java
GNU Lesser General Public License v3.0

Different device behavior given operating system #499

Closed: EricCraigen closed this issue 1 year ago

EricCraigen commented 1 year ago

Hello,

I am developing a cross-platform Kotlin/Java application that uses jSerialComm. We use an industrial barcode scanner that is connected to the system via a USB-to-serial cable (USB A on the system side, RJ-50 on the device side).

When our application "detects" serial devices we open the SerialPort and immediately start sending commands to verify that this device is the device we are interested in.

This works perfectly on macOS and Windows. On Linux, however, the device responds only with an error for the first 5 seconds after the port is opened, even though the command is valid and should return the expected result. If you wait 5 seconds before sending commands, they return as expected. Also notable: if I read from the port within those 5 seconds, I get data back that I do not expect and do not see on other operating systems.

The device does not require a 5-second delay on other operating systems after the port is opened for the first time after a power cycle.
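A possible workaround for the start-up window described above, instead of a fixed 5-second sleep, is to repeat the identification command until the device answers correctly or a deadline passes. The following is only a minimal sketch under assumptions: `DeviceProbe`, `Transport`, and the command and reply bytes are hypothetical names, not part of jSerialComm. In a real application, `Transport.exchange` would write to and read from the opened port's streams.

```java
import java.io.IOException;

final class DeviceProbe {
    /** Hypothetical abstraction: send one command, return the raw reply bytes. */
    interface Transport {
        byte[] exchange(byte[] command) throws IOException;
    }

    /**
     * Re-sends the command until the reply starts with expectedPrefix
     * or deadlineMillis has elapsed. Returns true on a matching reply.
     */
    static boolean probe(Transport port, byte[] command, byte[] expectedPrefix,
                         long deadlineMillis, long retryDelayMillis)
            throws IOException, InterruptedException {
        long deadline = System.nanoTime() + deadlineMillis * 1_000_000L;
        while (true) {
            byte[] reply = port.exchange(command);
            if (startsWith(reply, expectedPrefix)) {
                return true; // device identified itself correctly
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // still erroring after the deadline: give up
            }
            Thread.sleep(retryDelayMillis); // back off before the next attempt
        }
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

With a deadline slightly above 5 seconds and a short retry delay, devices that answer immediately (as on macOS and Windows) are identified on the first attempt, while the slow case on Linux is retried rather than rejected.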

The device manufacturer does not supply a device driver for Linux so we are simply using the usbserial generic Linux driver.

Other devices we support do not exhibit this behavior, so I understand that this is likely not a bug in the jSerialComm library but rather a device-related issue or an incorrect configuration option. It could even stem from the interplay between the device's USB/RS-232 converter and the Linux usbserial driver.

Just wanted to see if anyone else out there has experienced anything similar and may be able to shed some light for me.

Thanks in advance!

hedgecrw commented 1 year ago

Thanks for the bug report @EricCraigen. There was a new library version released today (2.10.1) that has some significant configuration simplifications under the hood. Would you mind testing the latest release and letting me know if you are still seeing this issue on Linux? Thanks!

EricCraigen commented 1 year ago

Awesome, always good to see releases get out the door!

Unfortunately, the new version does not fix this issue.

hedgecrw commented 1 year ago

Ah well, worth a try! I don't think this is related to this library (particularly because it's timing-based and there's nothing in the native library that would have anything to do with startup timing), but I'll leave the issue open for a while in case somebody else has any ideas or wants to comment.

hedgecrw commented 1 year ago

One more thing to try: there is a function on the SerialPort object called flushIOBuffers() that you can call before opening it, and this will attempt to ensure that any data that may exist in your device's data buffer is cleared before returning to your application code after a call to openPort(). Might be worth trying to add this function and see if that helps anything...

EricCraigen commented 1 year ago

Thanks for the additional advice. However, we are already calling that in the SerialDevice's open method. When opening, we first set the port's configuration (parameters/timeouts), then call flushIOBuffers(), and finally open the SerialPort.

I had gone down that road first and thought that maybe I was falling victim to this issue with USB to Serial connections. However, we have other devices we develop for that use USB to Serial connections and they do not display this behavior.

I agree that this is not likely related to the library but rather an issue with the device and/or the interplay of the Linux usbserial driver and the device. It could be that all other devices we use have a USB serial converter that doesn't cause this issue, but the barcode scanner's converter does.

I have also reached out to the manufacturer for further explanation so hopefully they will have some useful information for me when they get back. I had posted here just to see if this was maybe a known common issue with some devices.

Your help has been much appreciated!

Cheers!

hedgecrw commented 1 year ago

Very interesting...it was worth a shot! Please report back here when you hear back from the manufacturer. Even if it's not directly library-related, whatever you find might help other users with similar issues. (Although hearing about this issue is a first for me...you've managed to come up with a truly novel bug!)

metacodez commented 1 year ago

Hello, the following may be a related issue, which I noticed after trying to upgrade from version 2.9.3 to version 2.10.1: the upgrade runs fine on Windows 11, but I get problems on Arch Linux.

When running the tests of the org.refcodes:refcodes-serial-alt artifact on Arch Linux with jSerialComm 2.10.1 (as well as 2.10.0), the build fails. So far I have not been able to pin down the malfunction; sometimes not all sent data is received. The line is used bidirectionally quite heavily, with Ready-to-Send and Acknowledge ping-pongs. Building (and running the tests) on Windows 11 with jSerialComm 2.10.1 works fine. Running the same tests on Arch Linux with jSerialComm 2.9.3 works fine, too.

I use the InputStream and OutputStream usage scenario with the timeout mode set to SerialPort.TIMEOUT_READ_BLOCKING. For the (hardware) tests, I use two TTL232RG-VSW5V0 serial ports on the same machine, connected as a loop via null modem wiring (on Arch Linux as well as on Windows 11). The tests open two ports in the same JVM. As the logic I test does some handshaking and CRC stuff and involves some abstractions, I also run the same tests over a virtual loopback device to verify the logic (which runs just fine).
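To illustrate the kind of virtual-loopback check mentioned above, here is a minimal sketch that sends a CRC-protected frame through JDK piped streams and verifies it on the receiving side. It is not the refcodes-serial protocol: the class name `LoopbackFrame` and the frame layout (2-byte length, payload, 4-byte CRC-32) are assumptions made purely for illustration.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.util.zip.CRC32;

final class LoopbackFrame {
    private static final int PIPE_SIZE = 4096;

    /** Writes one frame: 2-byte length, payload, 4-byte CRC-32 of the payload. */
    static void writeFrame(OutputStream out, byte[] payload) throws IOException {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        DataOutputStream data = new DataOutputStream(out);
        data.writeShort(payload.length);
        data.write(payload);
        data.writeInt((int) crc.getValue());
        data.flush();
    }

    /** Reads one frame and throws IOException if the CRC does not match. */
    static byte[] readFrame(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int length = data.readUnsignedShort();
        byte[] payload = new byte[length];
        data.readFully(payload);
        int expected = data.readInt();
        CRC32 crc = new CRC32();
        crc.update(payload, 0, length);
        if ((int) crc.getValue() != expected) {
            throw new IOException("CRC mismatch");
        }
        return payload;
    }

    /** Sends a payload through an in-memory pipe and reads it back. */
    static byte[] roundTrip(byte[] payload) throws IOException {
        // Single-threaded use: the whole frame must fit in the pipe buffer,
        // otherwise the write would block waiting for a reader.
        if (payload.length + 6 > PIPE_SIZE) {
            throw new IllegalArgumentException("payload too large for pipe buffer");
        }
        PipedOutputStream tx = new PipedOutputStream();
        PipedInputStream rx = new PipedInputStream(tx, PIPE_SIZE);
        writeFrame(tx, payload);
        return readFrame(rx);
    }
}
```

Because `writeFrame` and `readFrame` only depend on `OutputStream` and `InputStream`, the same methods could in principle be pointed at a real serial port's streams, which makes it easier to tell protocol-logic failures apart from transport failures.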

Any help is appreciated very much! :-)

hedgecrw commented 1 year ago

@metacodez, I don't think your issue has anything to do with the original issue in this thread, but it might be related to Issue #502 (related to baud rate issues). I'm investigating that issue now (I think I see a potential problem), and will post back when a fix has been proposed.

hedgecrw commented 1 year ago

@metacodez, could you please test the following SNAPSHOT version of the library and see if your issue persists:

SNAPSHOT Version: 2.10.2-SNAPSHOT (SNAPSHOT Direct Download Link, SNAPSHOT Instructions)

metacodez commented 1 year ago

@hedgecrw Thanks for the fast feedback, I will try it out!

Thanks a lot for your effort and this great library!

metacodez commented 1 year ago

@hedgecrw The current snapshot 2.10.2-SNAPSHOT (2023-07-09?) has fixed the issue, thank you very much! Reading issue #505 (thanks to @markmaker) I think my issue actually boiled down to the mentioned deadlock :-) I successfully tested the current 2.10.2-SNAPSHOT on Arch Linux as well as on Windows 11 :-)

hedgecrw commented 1 year ago

Resolved with release v2.10.2.