As your second test confirmed, it is not JNA (it almost never is, especially with the direct-mapped calls).
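By "direct mapped" I mean JNA's direct mapping, where native functions are bound to static native methods via Native.register() instead of going through an interface proxy, so per-call overhead stays low. A minimal sketch just to illustrate the mechanism, nothing to do with PJC's actual binding code:

```java
import com.sun.jna.Native;

// Illustration of JNA direct mapping (not PJC's real bindings): the declared
// static native methods of this class are bound to libc symbols at class-load
// time, avoiding the reflective proxy used by JNA interface mapping.
public class DirectMappingSketch {
    static {
        Native.register("c"); // bind this class's native methods to libc
    }

    // POSIX getpid(2), chosen only because it is a trivially safe call
    public static native int getpid();

    public static void main(String[] args) {
        System.out.println("pid = " + getpid());
    }
}
```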
Without knowledge of your application internals it is difficult to suggest where to look. Some years back I spent considerable time optimising the Linux backend for just such a case (Raspberry Pi, IIRC) and the result was that PJC was as fast as RXTX, and pretty fast at that.
How are you using PJC, with the event system or blocking on read/write on your own threads?
@nyholku thanks for the quick response. Our application uses OSGi (Felix), Netty, and the PJC Netty adapter found here (https://github.com/steveturner/netty-transport-purejavacomm/tree/develop).
We actually found PJC and the Netty adapter to outperform RXTX in our application, and we clearly prefer the portability of PJC over RXTX. The wrapper implements an 'OIO' blocking transport. I am not familiar enough with the Netty internals to know whether swapping that for an NIO transport would buy much, but there might be some performance to be squeezed out of it there.
Hmm, so Netty simply does (I presume) basically a blocking read() or write() on the InputStream/OutputStream that PJC provides?
That should be about as efficient as it gets, as those are very thinly layered calls to the native blocking read/write calls; there must be something that has a higher priority than the Netty internal thread (I assume there is something like that) and consumes CPU.
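For reference, the blocking pattern being discussed amounts to something like the sketch below; the port name and parameters are made up for illustration, and this is not the thread's actual application code.

```java
import java.io.InputStream;
import purejavacomm.CommPortIdentifier;
import purejavacomm.SerialPort;

// Minimal sketch of the "blocking OIO" pattern: a dedicated thread sits in a
// blocking read() on the InputStream that PJC exposes. Port name and serial
// parameters below are illustrative placeholders.
public class BlockingReadSketch {
    public static void main(String[] args) throws Exception {
        SerialPort port = (SerialPort) CommPortIdentifier
                .getPortIdentifier("/dev/ttyS0")
                .open("BlockingReadSketch", 2000);
        port.setSerialPortParams(115200, SerialPort.DATABITS_8,
                SerialPort.STOPBITS_1, SerialPort.PARITY_NONE);

        InputStream in = port.getInputStream();
        byte[] buf = new byte[256];
        int n;
        // Each read() blocks in a thin wrapper over the native read() until at
        // least one byte arrives, so the thread consumes no CPU while waiting.
        while ((n = in.read(buf)) > 0) {
            System.out.println("received " + n + " bytes");
        }
        port.close();
    }
}
```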
You could set a breakpoint to stop everything and see all the threads that are running, and perhaps create a piece of code to dump them and their priorities, along the lines of the sketch below. A profiler like VisualVM might also help.
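A minimal sketch of such a dump, using only standard JDK calls:

```java
import java.util.Map;

// Dump every live thread with its priority, daemon flag and state, e.g. from a
// debug hook or a periodic timer.
public class ThreadDumpSketch {
    public static void dumpThreads() {
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            System.out.printf("%-40s prio=%d daemon=%b state=%s%n",
                    t.getName(), t.getPriority(), t.isDaemon(), t.getState());
        }
    }

    public static void main(String[] args) {
        dumpThreads();
    }
}
```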
Hmm, so Netty simply does (I presume) basically a blocking read() or write() on the InputStream/OutputStream that PJC provides?
Yes, looking at the source, the moving parts do appear to be the Input/Output streams provided by PJC.
@nyholku thanks for the input. I will bark up that tree.
For the sake of debugging why our application worked well when running on a Mac and not so well on our custom Linux SBC, I updated the Linux implementation of jtermios as such..
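The actual jtermios change isn't shown here; purely as an illustration of this kind of timing instrumentation, measured at the stream layer rather than inside the Linux backend, a wrapper like the following could log how long each blocking read takes:

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Illustrative only, not the jtermios patch from this thread. Wraps the
// InputStream obtained from PJC and logs how long each blocking
// read(byte[], int, int) call takes.
public class TimedInputStream extends FilterInputStream {
    public TimedInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        long t0 = System.nanoTime();
        int n = super.read(b, off, len);
        long ms = (System.nanoTime() - t0) / 1_000_000;
        System.out.println("read " + n + " bytes in " + ms + " ms");
        return n;
    }
}
```

A matching java.io.FilterOutputStream wrapper would time write() calls the same way.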
Comparing read and write times on the target hardware and on the dev machine:

      write    read
SBC   100 ms   134 ms
dev     1 ms     8 ms
Further, the write (complete) to full-response-received time:

SBC   191 ms
dev    49 ms
I have tinkered with all of the purejavacomm.X system properties and not seen a significant increase in performance.
Lastly, I modified the JTermiosDemo to talk to our attached serial device such that it outputs tx time, rx time, and total interaction time.
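The modified demo isn't reproduced here; a rough sketch of that kind of measurement, assuming a simple command/response device and streams obtained from a PJC SerialPort, could look like this:

```java
import java.io.InputStream;
import java.io.OutputStream;

// Illustrative sketch of timing one command/response exchange: tx time,
// rx time and total time. The streams would come from a PJC SerialPort;
// the command bytes and expected response length are placeholders.
public class RoundTripTiming {
    public static void measure(InputStream in, OutputStream out,
                               byte[] command, int expectedResponseLength) throws Exception {
        long start = System.nanoTime();

        out.write(command);
        out.flush();
        long txDone = System.nanoTime();

        byte[] response = new byte[expectedResponseLength];
        int got = 0;
        while (got < expectedResponseLength) {
            int n = in.read(response, got, expectedResponseLength - got);
            if (n < 0) break; // stream closed before the full response arrived
            got += n;
        }
        long rxDone = System.nanoTime();

        System.out.printf("tx %d ms, rx %d ms, total %d ms%n",
                (txDone - start) / 1_000_000,
                (rxDone - txDone) / 1_000_000,
                (rxDone - start) / 1_000_000);
    }
}
```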
On the dev machine a typical timing would be:
And typical timings on the SBC are as such:
The latter test shows that I can get near-workstation performance out of our SBC, so it likely isn't a pure lack of horsepower (with regard to JNA overhead) that is causing the slow performance. At this point my theory is that context-switching overhead is causing the performance drop on the target SBC, as our application is highly threaded. Any suggestions for things to look at?
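One Linux-specific way to test the context-switching theory (a suggestion for investigation, not something tried above) is to read the per-thread context-switch counters the kernel keeps under /proc; a sketch:

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;

// Linux-only sketch: print the voluntary/nonvoluntary context-switch counters
// for every thread of this JVM from /proc/self/task/<tid>/status. Rapidly
// growing nonvoluntary counts on the hot threads would support the
// context-switching theory.
public class CtxSwitchDump {
    public static void main(String[] args) throws Exception {
        File[] tasks = new File("/proc/self/task").listFiles();
        if (tasks == null) return; // not on Linux, or /proc unavailable
        for (File task : tasks) {
            Path status = task.toPath().resolve("status");
            String name = "?";
            StringBuilder ctxt = new StringBuilder();
            for (String line : Files.readAllLines(status)) {
                if (line.startsWith("Name:")) name = line.substring(5).trim();
                if (line.contains("ctxt_switches")) ctxt.append(' ').append(line.replace("\t", " "));
            }
            System.out.println(task.getName() + " (" + name + "):" + ctxt);
        }
    }
}
```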