epics-base / pvaPy

pvaPy provides Python bindings for EPICS pvAccess
https://epics.anl.gov/extensions/pvaPy/production/index.html

EPICS_PVA_NAME_SERVERS: handling reconnection #95

Open swelborn opened 1 month ago

swelborn commented 1 month ago

I am having an intermittent issue with reconnection.

Here is my order of operations:

  1. Start consumer process:

    EPICS_PVA_SERVER_PORT=11111 pvapy-hpc-consumer --input-channel pvapy:image --output-channel pvapy:image1 --processor-file /path/to/hpcDataProcessorExample.py --processor-class HpcDataProcessor --report-period 10 --log-level debug
  2. Start pvapy-mirror-server:

    EPICS_PVA_NAME_SERVERS=<IP of Consumer>:11111 EPICS_PVA_AUTO_ADDR_LIST='OFF' pvapy-mirror-server --channel-map "(pvapy:image2,pvapy:image1,pva,100)" --report-period 5
  3. Start sim server:

    pvapy-ad-sim-server -cn pvapy:image -nx 2560 -ny 2160 -dt uint16 --disable-curses -rt 13000 -fps 50

This works just fine. I am getting updates as reported by mirror server.

Now I shut down the consumer process and bring it back up with the same command. After that, the mirror server receives no new updates. Thinking this is an issue with reconnection, I shut down the mirror server and bring it back up. Still no updates (0 received).

The only way I can solve this is by shutting down both the consumer AND the mirror server, then starting them again.

BTW sim server and consumer are on the same machine, mirror server is on a different machine.

sveseli commented 1 month ago

I assume you are using the latest version (5.4.0)? This problem can sometimes happen if, after your consumer was restarted, the original port 11111 had not yet been freed by the OS (perhaps the old process lingers or something like that), and hence the consumer simply starts its server on another port. I think you can verify this by enabling EPICS debugging (PVAPY_EPICS_LOG_LEVEL=0), or by using netstat to look for ports.

The solution for this is to make sure that the consumer processes are indeed stopped before you restart them. I believe that enabling EPICS logging will also allow you to see the mirror server attempting to reconnect to port 11111 (as expected), and the connection not succeeding.
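
The "make sure the port is free before restarting" check can also be scripted. Below is a minimal sketch using only the Python standard library. The helper names `port_is_free` and `wait_for_port` are hypothetical and not part of pvapy. It assumes a plain bind attempt without SO_REUSEADDR is a good enough proxy for what the PVA server sees, which may be more conservative than the server's own socket options.

```python
import socket
import time


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to (host, port) right now.

    Hypothetical helper, not a pvapy API. The bind is done without
    SO_REUSEADDR, so lingering sockets may also make it report False.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def wait_for_port(port, host="0.0.0.0", timeout=30.0, poll=0.5):
    """Poll until the port can be bound or the timeout (seconds) expires.

    Returns True if the port became free, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if port_is_free(port, host):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

Calling `wait_for_port(11111)` on the consumer machine before relaunching pvapy-hpc-consumer, and starting the consumer only if it returns True, would avoid the silent fallback to a dynamically assigned port described above.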

swelborn commented 1 month ago

That's it, thanks for the tip on PVAPY_EPICS_LOG_LEVEL=0.

Is there a way to get at connection information through the CLI, e.g., by somehow extracting that information from the class that calls the processor class (Controller classes, I think)?

Having connection information available somewhere would be super useful for controlling processes remotely.

swelborn commented 1 month ago

Further, it would be great if any process shut down in pvapy would wait for graceful shutdown of ports, but I am not sure how hard this would be...

sveseli commented 1 month ago

If I remember correctly, at the moment there is no EPICS API that would allow one to ask a PVA server for connection information programmatically. If this is indeed the case, it would have to be added first. I think you could get server information printed on the screen if you enable pvapy logging (--log-level debug in the consumer code, or PVAPY_LOG_LEVEL=255).

As far as shutdown goes, consumer processes already wait for their PVA servers to stop, but if an existing TCP connection lingers for whatever reason, the OS will think the port is not free and you will get this issue.

Also, keep in mind that trying to assign a server port when running multiple consumer processes with "--n-consumers N" will inevitably result in some of them not getting the port you wanted. In that case, you would have to run a separate command for each consumer, and provide multiple addresses to the mirror server.
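
For the one-command-per-consumer setup, the mirror server's address list can be assembled with a few lines of Python. This is only a sketch: the helper name `name_servers_value`, the host address, and the second port are hypothetical, and it assumes EPICS_PVA_NAME_SERVERS accepts a space-separated list of host:port entries, which should be checked against the pvAccess documentation for your version.

```python
def name_servers_value(servers):
    """Join (host, port) pairs into a candidate EPICS_PVA_NAME_SERVERS value.

    Assumption: entries are separated by spaces, as with other EPICS
    address-list variables.
    """
    entries = []
    for host, port in servers:
        port = int(port)
        if not 0 < port < 65536:
            raise ValueError(f"invalid TCP port: {port}")
        entries.append(f"{host}:{port}")
    return " ".join(entries)


# Hypothetical example: two consumers on one machine, each started by its
# own command with a distinct EPICS_PVA_SERVER_PORT.
consumers = [("192.0.2.10", 11111), ("192.0.2.10", 11112)]
print(f"EPICS_PVA_NAME_SERVERS='{name_servers_value(consumers)}'")
```

The printed assignment can then be placed in front of the pvapy-mirror-server command, in the same way as the single-address form used earlier in the thread.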

swelborn commented 1 month ago

ok. thanks for your response!

anjohnson commented 1 month ago

There is an API in pvAccessCPP, it's used here to set the PVAS_SERVER_PORT variable for the PVAServerRegister.

sveseli commented 1 month ago

@anjohnson Thanks... Yes, we can use environment variables to set the port, but I was referring to an API for getting information from the PVA server, such as what port it is actually running on (basically getting the server configuration information that gets printed here: https://github.com/epics-base/pvAccessCPP/blob/f1268adb8ecbacbd74bb66c172d02d9d427bedfd/src/ioc/PVAServerRegister.cpp#L108). The problem here was that the port was set via an environment variable, but the server actually ended up running on a dynamically assigned port because the requested one was not free. In any case, if this API does not exist, it would not be hard to add, and it may be useful, if for nothing else, to print some sort of warning that the server is not running on the assigned port.

anjohnson commented 1 month ago

Apparently my link went to the wrong place, it should have pointed here but here is the commit when it was added showing just that change. This gets the dynamically assigned port from the running server and makes it available as an environment variable for the IOC's st.cmd script to make use of.