Closed CarolineHaslebacher closed 2 years ago
Hi, I think I had the same problem when using the keyword checkStats
in sm.taylor_diagram(sdev, crmsd, ccoef, checkStats='on')
. For example, when I give some random values to sdev, crmsd, ccoef like:
sdev = np.random.rand(4)
crmsd = np.random.rand(4)
ccoef = np.random.rand(4)
and then plot the taylor_diagram:
sm.taylor_diagram(sdev, crmsd, ccoef, checkStats='on')
The sentence seems to work fine but apparently the sdev, crmsd and ccoef does not fulfill the equation RMSs(i) = sqrt(STDs(i).^2 + STDs(1)^2 - 2*STDs(i)*STDs(1).*CORs(i))
.
This suggests that the program does not check the quality of sdev, crmsd and ccoef
despite we set the keyword checkStats='on'
.
After changing the sentence if not index
to if len(index[0]) != 0:
, the keywrod checkStats works indeed and it comes up with an error You must have: CRMSDs - sqrt(STDs.^2 + STDs[0]^2 - 2*STDs*STDs[0].*CORs) = 0 !
if we give some random values to sdev, crmsd, ccoef.
Thanks for your contribution.
You are correct. This is a bug that I've now fixed by using the condition if np.any(index):
I think I found an issue in 'check_taylor_stats.py' and I will try to describe it in the following.
I am using numpy version 1.18.5 and python 3.6.9 64-bit. The version of 'check_taylor_stats' I used is copied at the bottom.
I want to raise the error of incompatible data, so I define data that is clearly not fulfilling
RMSs(i) = sqrt(STDs(i).^2 + STDs(1)^2 - 2*STDs(i)*STDs(1).*CORs(i)).
So if I define
STDs = [1,5,3,2]
CRMSDs = [1,7,6,4]
CORs = [6,8,3,1]
, which are some random values in a list, then my 'difference' list is
array([103., 44., 15])
So we get differences much above our threshold of 0.01. But as far as I understand it, we want the difference to be close to zero, so that we can be sure that
RMSs(i) = sqrt(STDs(i).^2 + STDs(1)^2 - 2*STDs(i)*STDs(1).*CORs(i))
is true.Next, the 'index' for these differences is
(array([0, 1, 2]),)
, which is clearly not empty.
But the function will only raise an error 'if not index', so if the 'index' is an empty list. So it doesn't raise an error for my random data. Am I overlooking something?
In the other direction, if my differences are very small, so that I can be sure that the data is correct, for example with:
STDs = [1.0, 1.4325136021271947, 1.0617690033008276, 1.4678603104077521, 1.1292483168466108, 1.9391992365319153, 0.7424689187296046, 1.1904070997742
92]CRMSDs = [0.25735399231579914, 0.4455591033096046, 0.5688032547358672, 0.8937130204486814, 0.6372916529557612, 1.3136761418162093, 0.6242831508114802, 0.800117230131112]
CORs = [1.0, 0.9960018185865716, 0.8494391285508488, 0.8024915283990488, 0.8275686942766974, 0.7824746978593649, 0.7822082605692278, 0.7463335364755743]
I get the differences
array([3.63507139e-15, 8.57879111e-16, 1.66799715e-15, 2.32355360e-15, 7.71995646e-16, 4.27305282e-16, 8.67107592e-16])
which are all well below the threshold.
But now I think this is part of the problem; the variable 'index' looks like this
(array([], dtype=int64),)
which also doesn't count as empty, because of its shape, so 'if not index' also is not the case. So I am not able to raise the error so far. I can only raise it if I define
index = []
I would change
if not index
toif len(index[0]) != 0:
, but I see that this has also shortcomings, if somehow 'index' even so is a list and not an array.But for now, it will raise an error for me if one of the differences is above the threshold.
The function 'check_taylor_stats.py' I used: