Koheron / koheron-sdk

SDK for FPGA / Linux Instruments
https://www.koheron.com/software-development-kit/

Never receive the response of a command (or with huge delay) #593

Open jefedufeu opened 8 months ago

jefedufeu commented 8 months ago

Hello,

I've written a PyQt program that freezes every so often (~20-40 min) when I retrieve the contents of a buffer with the command: data = self.client.recv_vector(dtype='uint32')

The response takes a long time to arrive, so my program freezes. How can I overcome this kind of problem?

I've thought of using multithreading so that the program isn't blocked, but if I send two Python commands to the Koheron OS at the same time, how will it handle them? I could implement a queue that sends the requests one after the other.

Or a try/except block with a timeout exception (raised from a signal.SIGALRM handler, for example)?

Do you have any ideas? Especially as I don't see anything in the journalctl log.

Thanks,

tvanderbruggen commented 8 months ago

Hi,

If I understand correctly, you don't see any error message when calling journalctl -koheron-server -u 200 on the board?

Do you see anything when running dmesg?

Which version of the OS are you using? It could be the same issue as https://github.com/Koheron/koheron-sdk/issues/528.

The issue might also be on the client side.

Regarding multithreading, you can open several connections with the server, so you can for example have one socket open per thread. There is a lock at the level of each driver, so that only one request from one connection can be executed at a time on a given driver, but you can call different drivers simultaneously. However, I don't think this will fix your problem...

jefedufeu commented 8 months ago

Hello Mr. @tvanderbruggen and thanks again for your answer,

Yes, there is no error message with the command journalctl -u koheron-server.service -f (there seems to be a mistake in the one you sent). Nothing more in dmesg either. I tried to reproduce the error today, without success; it's quite random.

As for the OS, we were previously on 0.19 and are now on the latest one, 0.23.

Yes, it seems to be the same error, because it also happens when we receive large amounts of data.

Ok thanks a lot !

Yeah i know ...

To protect myself against random commands that fail, I run my @command Python function in a detached thread and set a timeout. I'll let you know later whether this solution is acceptable. Do you have any opinion on this?

fThread = threading.Thread(target=self.getF)
fThread.start()
fThread.join(timeout=1)
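
A minimal sketch of the timeout idea above, using only the Python standard library. Here slow_recv is a hypothetical stand-in for a blocking call such as client.recv_vector(), and call_with_timeout is an illustrative helper, not part of the koheron package. Thread.join(timeout=1) returns after at most one second whether or not the target has finished, so the caller still has to check the outcome; a future makes that check explicit.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def slow_recv(delay):
    """Hypothetical stand-in for a blocking call such as client.recv_vector()."""
    time.sleep(delay)
    return [1, 2, 3]


def call_with_timeout(executor, func, timeout, *args):
    """Run func in a worker thread; return (ok, result), with ok=False on timeout."""
    future = executor.submit(func, *args)
    try:
        return True, future.result(timeout=timeout)
    except FutureTimeout:
        # The worker thread is NOT interrupted: it keeps running until func returns.
        return False, None


executor = ThreadPoolExecutor(max_workers=2)
fast = call_with_timeout(executor, slow_recv, 1.0, 0.01)  # finishes in time
late = call_with_timeout(executor, slow_recv, 0.05, 0.3)  # exceeds the timeout
executor.shutdown(wait=True)  # waits for the abandoned call to finish
```

A timed-out call is only abandoned, not cancelled: the worker stays blocked until the underlying call returns. Whether the same client can safely be reused after that is not established in this thread.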

Thanks again Mr. @tvanderbruggen

kth7316 commented 7 months ago

Mr. @tvanderbruggen,

You say :

you can open several connections with the server, so you can for example have one socket open per thread

I tried this in a Python file by setting a common TCP port:

client = custom_connect(host, 35359, env.INSTRUMENT_NAME, restart="true")

But I got this error: requests.exceptions.ConnectionError: Failed to connect to 192.168.1.103:35359: [WinError 10061] No connection could be established because the target computer expressly refused it

After a quick check, I see that the tcp_port is set in the server configuration file: server/core/config.hpp

I will try to edit some files to allow multiple sockets (one per thread). But from your answer I understand that there may be an easier solution.

Thanks,

@jefedufeu, if you have a solution, I'm also interested.

tvanderbruggen commented 7 months ago

I don't know what your custom_connect does, nor what you're trying to achieve with the TCP port.

Something like this should work:

from koheron import connect

client1 = connect(host, 'my_instrument', restart=True)
client2 = connect(host, 'my_instrument', restart=False)

Now, if you call client1 and client2 sequentially, you don't really benefit from having two connections, but it might be beneficial to open one client per thread.

Obviously, you don't want to restart the instrument when opening the 2nd connection since it will close the first one.
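
A minimal sketch of the one-client-per-thread idea, assuming each thread lazily creates its own client through threading.local. FakeClient, make_client and get_client are hypothetical names used so the sketch runs without a board; on real hardware make_client would call connect(host, 'my_instrument', restart=False) instead.

```python
import threading


class FakeClient:
    """Hypothetical stand-in for the object returned by koheron.connect()."""

    def __init__(self, host):
        self.host = host


def make_client(host):
    # On real hardware this would be something like:
    #   connect(host, 'my_instrument', restart=False)
    return FakeClient(host)


_local = threading.local()  # holds one client attribute per thread


def get_client(host):
    """Return the calling thread's own client, creating it on first use."""
    if not hasattr(_local, "client"):
        _local.client = make_client(host)
    return _local.client


def worker(host, seen, lock):
    client = get_client(host)
    with lock:
        seen.append(client)  # keep a reference so the clients can be compared


seen = []
seen_lock = threading.Lock()
threads = [
    threading.Thread(target=worker, args=("192.168.1.103", seen, seen_lock))
    for _ in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With this layout, any restart=True call would be made once, before the worker threads open their own connections.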

kth7316 commented 7 months ago

Thanks Mr. @tvanderbruggen,

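Oh sorry, custom_connect is basically a new custom implementation that adds the port parameter.

# Imports assumed to come from the koheron Python package
from koheron import KoheronClient, run_instrument

def custom_connect(host, port, *args, **kwargs):
    # Start the instrument, then open a client on the given TCP port
    run_instrument(host, *args, **kwargs)
    client = KoheronClient(host, port)
    return client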

I tried your solution and yes, the problem, as you mentioned, was the restart=True argument, but I'm not sure I understand what this argument does.

Thanks again !

EDIT: Sorry if my question is not clear, but I want to understand how @command calls that are sent in parallel on the same socket can be distributed. Which mechanism do you use? A queue? Thanks,

tvanderbruggen commented 7 months ago

restart=True closes the currently running instrument (in particular all the connections) and starts it again. In practice, it kills the server process and starts it over. This acts as a reset, so that the instrument is in a known state after the connect call.

Each client returned by connect opens its own socket, so if you are sending commands in parallel via different clients you are using different sockets. On the server side, a separate thread is opened for each socket, so the server can receive commands in parallel. As I said earlier, there is a lock at each driver, so if you send commands in parallel to the same driver they will effectively be queued (albeit with no guarantee of execution order). Commands sent in parallel to different drivers can be executed in parallel by the server.
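
The locking behaviour described above can be illustrated with a toy Python model. This is only a sketch of the idea, not the server's actual code: the driver names "adc" and "dac" and the execute function are hypothetical, and each thread stands in for one connection sending a command.

```python
import threading
import time

# One lock per driver (hypothetical driver names)
driver_locks = {"adc": threading.Lock(), "dac": threading.Lock()}

active = {"adc": 0, "dac": 0}      # commands currently running per driver
max_active = {"adc": 0, "dac": 0}  # highest concurrency observed per driver
stats_lock = threading.Lock()      # protects the two counters above


def execute(driver, duration=0.02):
    """Simulate one command: it must hold the driver's lock while it runs."""
    with driver_locks[driver]:
        with stats_lock:
            active[driver] += 1
            max_active[driver] = max(max_active[driver], active[driver])
        time.sleep(duration)  # pretend to do the work
        with stats_lock:
            active[driver] -= 1


# Four "connections" per driver, all sending a command at about the same time
threads = [
    threading.Thread(target=execute, args=(name,))
    for name in ["adc"] * 4 + ["dac"] * 4
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

In this model, commands for the same driver never overlap, while a command on one driver does not wait for the other driver's lock. The order in which waiting threads acquire a lock is not guaranteed, which matches the remark about execution order.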