paulscherrerinstitute / StreamDevice

EPICS Driver for message based I/O
GNU General Public License v3.0
28 stars 42 forks source link

StreamDevice resets asynTraceMask during init #23

Closed chrschroeder closed 5 years ago

chrschroeder commented 5 years ago

Commit 8679dd543bfcf4000833b1c47019320f23c96476 introduced a reset of the asynTraceMask to 0 to avoid flooding. I don't think StreamDevice should alter the options set for another module (asyn). Normally the TraceMask should be 0 anyway if not set explicitly. In my case I set it directly after setting up the asyn port for development and debugging purpose. You can re-enable it after iocInit as stated in the comment, but this prevents messages send and received with @init to be printed.

Another similar thing is the newly introduced streamError variable which can be used to avoid message flooding. I really appreciate the possibility to disable these messages, but during normal operation you want to see them and they are very useful for analysis. My suggestion would be, to the the default value for streamError to 1, which equals to the behavior of older versions.

dirk-zimoch commented 5 years ago

I have introduced the streamError variable because people had complained about message flooding, in particular when the device gets powered down while the IOC is running. Nevertheless all stream errors during startup are always printed, so that protocol errors and initialization errors are not hidden. On the other hand, there is message throttling now in the asyn interface, which should silence repeating errors. I am not sure about the default value of asynTraceMask, but if I have disabled it, there was a reason. So it seems that at least some asyn versions had it enabled at startup. I will check...

dirk-zimoch commented 5 years ago

The default trace mask is 0x01 (at least for asyn IP ports). Asyn version 4.27 printed "autoConnect could not connect" messages every 20 sec when the device was offline. Asyn version 4.34 does not do that any more.

chrschroeder commented 5 years ago

One of the complaints came from our site. Benjamin has commited 2 patches as you know and also talked with the asyn developers about it. Ideally error messages should not be suppressed in general, but printed only once e.g. after a device turns offline and after it turns online again. Benjamin also told me, that this might not be easy to implement for all type of messages. Nevertheless the error messages are important for operation. The option to disable them is still useful, but I still think the default should be to print them resp. streamError should be 1 by default.

As for the trace mask, you are right, traceError is always true. For development I often set traceIODriver additionally (bit mask 0x9) to print send and received messages. With StreamDevice resetting this, some messages aren't printed anymore as stated above. In my opinion the asyn traceMask isn't something StreamDevice should be concerned with. Users who want to disable all errors are still free to set it explicitly to 0x0.

dirk-zimoch commented 5 years ago

Agreed. StreamDevice should not manipulate the debug settings of other drivers as this may interfere with the user's intention. Can you please change this and the default value of streamError locally in your copy and tell me your experience. Is there any message flooding when the device does not respond?

chrschroeder commented 5 years ago

I compared version 2.7.7 and 2.8.7. The worst part got already fixed by Benjamin and errors are printed only once until recovery. He told me, that the messages which are not handled this way are most likely protocol errors, but I think that's no problem for normal operation since developers have to fix these anyway.

Here a comparison StreamDevice 2.7.7:

epics> 2018/12/05 08:43:55.519071 cbLow DIODETEST2:rdCur lockRequest: pasynManager->queueRequest() failed: port DIODETEST2 not connected 2018/12/05 08:43:55.519160 cbLow DIODETEST2:rdStb lockRequest: pasynManager->queueRequest() failed: port DIODETEST2 not connected 2018/12/05 08:43:55.519169 cbLow DIODETEST2:rdEsr lockRequest: pasynManager->queueRequest() failed: port DIODETEST2 not connected 2018/12/05 08:43:56.519053 cbLow DIODETEST2:rdCur lockRequest: pasynManager->queueRequest() failed: port DIODETEST2 not connected 2018/12/05 08:43:56.519081 cbLow DIODETEST2:rdStb lockRequest: pasynManager->queueRequest() failed: port DIODETEST2 not connected 2018/12/05 08:43:56.519092 cbLow DIODETEST2:rdEsr lockRequest: pasynManager->queueRequest() failed: port DIODETEST2 not connected 2018/12/05 08:43:57.519080 cbLow DIODETEST2:rdCur lockRequest: pasynManager->queueRequest() failed: port DIODETEST2 not connected

and so on...

And StreamDevice 2.8.7:

epics> 2018/12/05 08:45:34.581503 cbLow DIODETEST2:rdCur lockRequest: port DIODETEST2 not connected 2018/12/05 08:45:34.581690 cbLow DIODETEST2:rdStb lockRequest: port DIODETEST2 not connected 2018/12/05 08:45:34.581704 cbLow DIODETEST2:rdEsr lockRequest: port DIODETEST2 not connected 2018/12/05 08:45:37.581486 cbLow DIODETEST2:rdCur lockRequest: status returned to normal 2018/12/05 08:45:37.581512 cbLow DIODETEST2:rdStb lockRequest: status returned to normal 2018/12/05 08:45:37.581519 cbLow DIODETEST2:rdEsr lockRequest: status returned to normal

You get them once per record, but this is OK too. Previously in StreamDevice 2.7.7 you got them once per record which lead to a lot of logged error messages, if the device was offline for a longer period. So I think setting streamError to 1 as default will be no problem.

dirk-zimoch commented 5 years ago

Fixed in eb9f565.