paulscherrerinstitute / pcaspy

Portable Channel Access Server in Python
BSD 3-Clause "New" or "Revised" License

PV value to 0 #40

Closed gtortone closed 7 years ago

gtortone commented 7 years ago

Hi, I developed this simple (soft) IOC with pcaspy: https://gist.github.com/gtortone/514302df79439bc59a9c905c48e8543f

It simulates 96 temperature and 32 humidity values within given ranges. Everything works at first, but after 15-20 minutes the script starts sending 0 for every PV (I check them with camonitor/caget in a command shell).

I also debugged my code, and the generated values are always valid; the problem seems to be in the CA interface, where the values become 0 after some minutes.
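
For readers without access to the gist, here is a minimal sketch of what such a PV database and value generator could look like. It is not the author's script: only the names S1F:TEMP:TEMP01 and S5B:TEMP:TEMP02 appear in this thread, so the sector list, the HUM names, the units, and the value ranges are all assumptions chosen to add up to 96 + 32 PVs.

```python
import random

# Assumed layout: 16 sectors (S1F, S1B, ..., S8F, S8B), each with
# 6 temperature and 2 humidity PVs, giving 96 + 32 = 128 PVs in total.
SECTORS = ["S%d%s" % (n, side) for n in range(1, 9) for side in "FB"]


def build_pvdb():
    """Return a pcaspy-style PV database: PV name -> property dict."""
    pvdb = {}
    for sector in SECTORS:
        for i in range(1, 7):
            pvdb["%s:TEMP:TEMP%02d" % (sector, i)] = {
                "type": "float", "prec": 4, "unit": "C"}
        for i in range(1, 3):
            pvdb["%s:HUM:HUM%02d" % (sector, i)] = {
                "type": "float", "prec": 4, "unit": "%"}
    return pvdb


def simulate(pvdb, rng=random):
    """Draw one random reading per PV; the ranges are illustrative only."""
    values = {}
    for name in pvdb:
        low, high = (20.0, 30.0) if ":TEMP:" in name else (30.0, 60.0)
        values[name] = rng.uniform(low, high)
    return values
```

With pcaspy installed, a dictionary like this would typically be handed to `SimpleServer().createPV(prefix, pvdb)`; an empty prefix is assumed here because the names are already complete.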

Thanks

xiaoqiangwang commented 7 years ago

Which pcaspy and EPICS base versions are you running?

I want to reproduce the problem. Should I put camonitor on all the PVs, or just one is enough to show the bug?

xiaoqiangwang commented 7 years ago

@gtortone I ran your test script overnight with camonitor on S1F:TEMP:TEMP01, and the bug did not appear. My test ran on 64-bit Scientific Linux 6 with Python 2.6.

Could you specify your environment?

gtortone commented 7 years ago

Hi, thanks a lot for your support on this issue.

After some tests I realized that the Python script was running low on memory, because I'm running it on a virtual machine alongside other processes...

I will get back to you if I have other issues.

Thanks again

gtortone commented 7 years ago

Hi, I'm now running the Python IOC on a virtual machine with Debian 8, Python 2.7, and pcaspy 0.6.4 installed with pip.

The same virtual machine also runs an EPICS Archiver Appliance and has 16 GB of RAM (~8 GB free).

I ran the IOC I developed and after 15~20 minutes I hit the 'PV = 0' problem. This is a fragment of the camonitor output:

...
S5B:TEMP:TEMP02 2017-04-26 15:52:27.612569 24.5879
S5B:TEMP:TEMP02 2017-04-26 15:52:29.619755 0
S5B:TEMP:TEMP02 2017-04-26 15:52:31.626389 0
S5B:TEMP:TEMP02 2017-04-26 15:52:33.632743 0
S5B:TEMP:TEMP02 2017-04-26 15:52:35.638917 0
...

I run the Python IOC inside a detached 'screen' session. The strange thing is that if I run 'camonitor' several times in sequence, the first value returned each time is not 0:

(first execution)
S5B:TEMP:TEMP02 2017-04-26 15:54:19.975847 25.1536
S5B:TEMP:TEMP02 2017-04-26 15:54:21.983033 0
S5B:TEMP:TEMP02 2017-04-26 15:54:23.989749 0
^C
(second execution)
S5B:TEMP:TEMP02 2017-04-26 15:54:23.988889 25.1237
S5B:TEMP:TEMP02 2017-04-26 15:54:25.996329 0
S5B:TEMP:TEMP02 2017-04-26 15:54:28.002633 0
^C
(third execution)
S5B:TEMP:TEMP02 2017-04-26 15:54:30.008165 25.1909
S5B:TEMP:TEMP02 2017-04-26 15:54:32.015343 0
S5B:TEMP:TEMP02 2017-04-26 15:54:34.021718 0
^C

xiaoqiangwang commented 7 years ago

The first value camonitor prints is effectively a caget. The values that follow are delivered when the server application calls updatePVs.

I will run this program in a Docker container and report back.
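
A minimal sketch of the server-side path that feeds monitors, assuming the usual pcaspy pattern in which the driver stores each value with `setParam` and then calls `updatePVs` to post the changes to subscribed clients. `RecordingDriver` and `publish` are hypothetical names used only for illustration; `RecordingDriver` stands in for a `pcaspy.Driver` subclass so the sketch runs without EPICS installed.

```python
def publish(driver, values):
    """Store new readings, then post them to monitoring clients."""
    for name, value in values.items():
        # Stores the value that a later read (e.g. caget) would return.
        driver.setParam(name, value)
    # Posts the stored changes to subscribers such as camonitor.
    driver.updatePVs()


class RecordingDriver(object):
    """Stand-in exposing the two pcaspy.Driver methods used above."""

    def __init__(self):
        self.params = {}
        self.posted = []

    def setParam(self, name, value):
        self.params[name] = value

    def updatePVs(self):
        # Record a snapshot of what would be sent to monitors.
        self.posted.append(dict(self.params))


driver = RecordingDriver()
publish(driver, {"S5B:TEMP:TEMP02": 25.1536})
print(driver.posted)
```

This is why a fresh camonitor can print a valid first value while the following updates are 0: the two values reach the client through different paths.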

gtortone commented 7 years ago

Hi, I can add some details on this issue.

I run an EPICS Archiver Appliance that consumes 128 PVs from this Python IOC. I just realized that when the 'PV = 0' problem appears, stopping the archiver makes it go away and the IOC starts publishing correct PV values again.

It seems that some connection threshold is reached after 15~20 minutes of the archiver consuming PVs, and that this threshold (in pcaspy) is responsible for the instability...

gtortone commented 7 years ago

Additional details: while the problem is visible in 'camonitor', if I run the following loop:

for i in $(seq 1 100); do caget S1F:TEMP:TEMP01; done

the 'PV = 0' problem does not appear, so it seems to be related only to camonitor.

xiaoqiangwang commented 7 years ago

When camonitor shows zero, what value does the archiver get?

gtortone commented 7 years ago

The archiver gets zero.

gtortone commented 7 years ago

I also enabled the logging module inside my Python script, and the values printed on the console are always valid (not zero).

xiaoqiangwang commented 7 years ago

If you turn off the archiver application, does the problem still appear after ~20 minutes?

gtortone commented 7 years ago

I will check

gtortone commented 7 years ago

I can confirm that after ~40 minutes with the archiver stopped, the problem does not appear.

gtortone commented 7 years ago

I started the archiver again (without restarting the IOC) and the problem appeared after ~2 minutes...

xiaoqiangwang commented 7 years ago

That sounds like a resource problem. I would replace the archiver with a shell script that monitors these 128 PVs and see whether the problem still appears.
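
A rough sketch of such a stand-in monitor, written in Python rather than shell and not the script anyone in this thread actually used. It assumes `camonitor` is on the PATH and prints lines in the `NAME DATE TIME VALUE` layout shown earlier in this thread; `parse_camonitor_line` and `watch` are hypothetical helper names.

```python
import subprocess


def parse_camonitor_line(line):
    """Split 'NAME DATE TIME VALUE' into (name, timestamp, value).

    Returns None for lines that do not match that layout, such as
    connection messages.
    """
    fields = line.split()
    if len(fields) < 4:
        return None
    try:
        value = float(fields[3])
    except ValueError:
        return None
    return fields[0], fields[1] + " " + fields[2], value


def watch(pv_names):
    """Run one camonitor process on all PVs and report zero readings."""
    proc = subprocess.Popen(["camonitor"] + list(pv_names),
                            stdout=subprocess.PIPE,
                            universal_newlines=True)
    try:
        for line in proc.stdout:
            parsed = parse_camonitor_line(line)
            if parsed is not None and parsed[2] == 0.0:
                print("ZERO %s at %s" % (parsed[0], parsed[1]))
    finally:
        proc.terminate()
```

Calling `watch(["S1F:TEMP:TEMP01", "S5B:TEMP:TEMP02"])` with the full PV list keeps the subscriptions open the way an archiver would, and prints a line whenever a monitored value arrives as 0.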

gtortone commented 7 years ago

Hi, I realized I'm using EPICS 3.15.2 as part of my deployment. I have now upgraded to EPICS 3.15.5 and am testing the Python IOC with the EPICS archiver again.

I will let you know the results.

gtortone commented 7 years ago

OK, with EPICS 3.15.2 and 3.15.4 the problem is present; with EPICS 3.15.5 the Python IOC runs fine without any problem!

I will run a test overnight and get back to you if there is a problem.

Thanks for the support!

xiaoqiangwang commented 7 years ago

I thought you had installed pcaspy from PyPI. The eggs there include EPICS 3.14.12.6.

EPICS 3.15.5 and 3.14.12.6 contain important bug fixes related to PCAS.

In the meantime I have created a script that simulates an archiver by monitoring the PVs: https://gist.github.com/xiaoqiangwang/874d7d87c60a6ab68c0f303460828885 It has been running for 10 minutes now.

gtortone commented 7 years ago

I installed pcaspy with pip:

pip install pcaspy

and compiled and installed EPICS manually.

xiaoqiangwang commented 7 years ago

I see. Then it is linked against your local EPICS libraries.

To use the eggs instead:

easy_install pcaspy

xiaoqiangwang commented 7 years ago

@gtortone Can I close this issue now? Also be aware of issue #43, which requires patching EPICS base if you compile it from source.

gtortone commented 7 years ago

Yes, please close this issue. Thanks.