archiver-appliance / epicsarchiverap

This is an implementation of an archiver for EPICS control systems that aims to archive millions of PVs.

Misrepresentation of data when using the Optimized PostProcessor #111

Open rjwills28 opened 3 years ago

rjwills28 commented 3 years ago

We have seen an issue with the AA Optimized post processor when it returns values for bins that contain no samples. When a bin contains samples there is no problem: the Optimized post processor returns the mean, standard deviation, min, max and number of samples. However, when a bin contains no samples, the current implementation inherits the values from the previous bin that did contain samples. This misrepresents the data, which we have observed when plotting the results. For example, say a bin that contains samples has a mean of 20, and the last sample in that bin shows the PV going to 5. The next 10 bins have no samples (i.e. no change in the PV), so each of them inherits the mean of 20. This does not reflect the fact that the PV actually had the value 5 for that whole period.
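To make the effect concrete, here is a minimal Python sketch of the behaviour described above. It is a simplified model, not the AA's actual Java code: the function name `bin_means_inheriting`, the fixed-width binning and the sample data are all assumptions made for illustration.

```python
from statistics import mean


def bin_means_inheriting(samples, start, bin_width, n_bins):
    """Return one mean per bin; an empty bin copies the previous bin's mean.

    `samples` is a list of (timestamp, value) pairs. This is a hypothetical
    model of the behaviour described in this issue, not the AA implementation.
    """
    bins = [[] for _ in range(n_bins)]
    for t, v in samples:
        idx = int((t - start) // bin_width)
        if 0 <= idx < n_bins:
            bins[idx].append(v)

    result = []
    previous = None  # mean of the most recent non-empty bin
    for values in bins:
        if values:
            previous = mean(values)
        result.append(previous)
    return result


# The PV reads 30, then 25, then drops to 5 and stays there (no more updates).
samples = [(0, 30), (2, 25), (9, 5)]
means = bin_means_inheriting(samples, start=0, bin_width=10, n_bins=4)
print(means)
```

In this model every bin reports 20, even though the PV sat at 5 for the last three bins.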

We have a proposed and tested fix for this issue: a new post processor named 'OptimizedWithLastValue'. This post processor always keeps track of the last sample added to each bin. If a bin contains no samples, its mean, min and max are set from the last sample added to the most recent bin that did contain samples, and its number of samples is set to 0.

For example:

This means that in periods where the PV does not change, the last value it reported is the one that is returned.
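A rough Python sketch of the last-value fill described above, under stated assumptions: the function name `bin_stats_with_last_value`, the dict layout and the sample data are hypothetical, samples are assumed to arrive sorted by time, and standard deviation is left out because its value for an empty bin is not specified here. This is not the proposed Java code itself.

```python
from statistics import mean


def bin_stats_with_last_value(samples, start, bin_width, n_bins):
    """Per-bin stats; an empty bin is filled from the last sample seen so far.

    `samples` is a time-ordered list of (timestamp, value) pairs.
    Hypothetical model for illustration only.
    """
    bins = [[] for _ in range(n_bins)]
    for t, v in samples:
        idx = int((t - start) // bin_width)
        if 0 <= idx < n_bins:
            bins[idx].append(v)

    result = []
    last_value = None  # last sample of the most recent non-empty bin
    for values in bins:
        if values:
            result.append({"mean": mean(values), "min": min(values),
                           "max": max(values), "count": len(values)})
            last_value = values[-1]
        elif last_value is not None:
            # Empty bin: report the last known value and a sample count of 0.
            result.append({"mean": last_value, "min": last_value,
                           "max": last_value, "count": 0})
        else:
            result.append(None)  # nothing seen yet, nothing to report
    return result


# Same scenario as in the issue: mean 20 in the first bin, PV ends at 5.
samples = [(0, 30), (2, 25), (9, 5)]
stats = bin_stats_with_last_value(samples, start=0, bin_width=10, n_bins=3)
for s in stats:
    print(s)
```

With this fill the first bin still reports a mean of 20 over 3 samples, while the following empty bins report 5 with a count of 0, so a client can tell a filled bin from a measured one.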

The proposed changes that would be required to the AA include:

willrogers commented 3 years ago

We think this would solve some of the user problems we've seen when using the AA and CS-Studio at DLS, and adding another PostProcessor shouldn't be a problem I guess.

What do you think @slacmshankar ?