Open John-Holt-Tessella opened 5 years ago
Would it be possible to catch this type of issue via Nagios before we fill up the disk and trigger the main disk space warning?
Since the logs rotate daily, a check on whether any file in the ioc logs directory exceeds a certain size (e.g. 100MB) would be sufficient...
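A minimal sketch of such a check, assuming it is deployed as a Nagios plugin script: it relies only on the standard plugin convention of exit codes 0 (OK), 2 (CRITICAL) and 3 (UNKNOWN) plus a one-line status message. The function names, the script name, and the recursive walk of the directory are illustrative choices, not part of any existing setup, and the log directory path has to be supplied by whoever configures the check.

```python
"""Nagios-style check: flag any file in a log directory above a size limit."""
import os

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def oversized_files(log_dir, limit_bytes):
    """Return sorted (path, size) pairs for files larger than limit_bytes."""
    found = []
    for root, _dirs, files in os.walk(log_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file was rotated away while we were scanning
            if size > limit_bytes:
                found.append((path, size))
    return sorted(found)


def check(log_dir, limit_mb=100):
    """Return (exit_code, status_line) in the usual Nagios plugin shape."""
    if not os.path.isdir(log_dir):
        return UNKNOWN, "UNKNOWN - %s is not a directory" % log_dir
    mb = 1024 * 1024
    big = oversized_files(log_dir, limit_mb * mb)
    if big:
        detail = ", ".join("%s (%d MB)" % (p, s // mb) for p, s in big)
        return CRITICAL, "CRITICAL - %d file(s) over %d MB: %s" % (
            len(big), limit_mb, detail)
    return OK, "OK - no file over %d MB in %s" % (limit_mb, log_dir)


def main(argv):
    """Hypothetical entry point: argv is [LOG_DIR] or [LOG_DIR, LIMIT_MB]."""
    if not argv:
        print("UNKNOWN - usage: check_ioc_log_size.py LOG_DIR [LIMIT_MB]")
        return UNKNOWN
    limit_mb = int(argv[1]) if len(argv) > 1 else 100
    code, message = check(argv[0], limit_mb)
    print(message)
    return code
```

To use it as a plugin, the script would end with `import sys; sys.exit(main(sys.argv[1:]))` so that Nagios sees the exit code. Because the logs rotate daily, a single size threshold is enough here and no growth-rate tracking is attempted.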
I can see no reason why the Separator IOC needs to log a connection failure 30 times a second. We should get it fixed so that we don't have to configure Nagios to work around the problem.
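One way to picture the fix is to throttle repeated messages at the point where they are logged. The sketch below is in Python purely for illustration. The IOC itself is not written in Python, and `RepeatThrottle`, its parameters, and the choice of keying on the failing call name are all hypothetical. It lets a given error through once per interval and reports how many repeats were dropped in between.

```python
import time


class RepeatThrottle:
    """Let a message through at most once per interval; count the rest."""

    def __init__(self, interval_s=60.0, clock=time.monotonic):
        self.interval_s = interval_s
        self.clock = clock          # injectable so the logic can be tested
        self._last = {}             # key -> time the message was last emitted
        self._suppressed = {}       # key -> repeats dropped since then

    def filter(self, key, message):
        """Return the line to log, or None if this repeat should be dropped."""
        now = self.clock()
        last = self._last.get(key)
        if last is not None and now - last < self.interval_s:
            self._suppressed[key] = self._suppressed.get(key, 0) + 1
            return None
        dropped = self._suppressed.pop(key, 0)
        self._last[key] = now
        if dropped:
            return "%s (repeated %d more times)" % (message, dropped)
        return message
```

With a 60 second interval, an error that fires 30 times a second would produce one log line a minute instead of roughly 1800, while the repeat count still shows that the fault is ongoing.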
Here is the transcript from disconnecting the DAQ in the office (pulling out the Ethernet cable) while the separator IOC is running:
2018/11/07 10:21:08.497 ### DAQmx ERROR (ReadAnalogF64): Some or all of the samples requested have not yet been acquired.
To wait for the samples to become available use a longer read timeout or read later in your program. To make the samples available sooner, increase the sample rate. If your task uses a start trigger, make sure that your start trigger is configured correctly. It is also possible that you configured the task for external timing, and no clock was supplied. If this is the case, supply an external clock.
Property: DAQmx_Read_RelativeTo
Correspon
2018/11/07 10:21:21.251 ### DAQmx ERROR (StopTask): Some or all of the samples requested have not yet been acquired.
To wait for the samples to become available use a longer read timeout or read later in your program. To make the samples available sooner, increase the sample rate. If your task uses a start trigger, make sure that your start trigger is configured correctly. It is also possible that you configured the task for external timing, and no clock was supplied. If this is the case, supply an external clock.
Property: DAQmx_Read_RelativeTo
Correspon
No more error messages were created.
Here is the transcript from starting an IOC while the DAQ in the office is disconnected:
epics>
epics>
epics> 2018/11/07 10:24:52.637 ### DAQmx ERROR (StartTask): Retrieving properties from the network device failed. Make sure the device is connected.
Device Specified: cDAQ9185-R3G39
Property: DAQmx_Dev_TCPIP_EthernetIP
Corresponding Value: 130.246.50.212
Property: DAQmx_Dev_TCPIP_Hostname
Corresponding Value: cDAQ9185-R3G39
Device: cDAQ9185-R3G39
Task Name: R0
Status Code: -201401
The Muon separator IOC can generate 5GB of logs a day in an error condition. The logged error message is repeated more than 30 times a second.
-I am not sure whether this is because it was disconnected or whether it is because there was an error in the module configuration.- The error was caused by a module configuration problem in which the name was misspelt. The repeated logging is likely happening at the DAQmx layer.