yossi-cohen / preferential-attachment


Accuracy of prediction as function of focus point. #16

Open yossigil opened 3 years ago

yossigil commented 3 years ago

Repeat the experiment in #15, but this time compute the relative error of the prediction against the Cramér-Rao bound.

yossi-cohen commented 3 years ago

issue-16

yossigil commented 3 years ago

At low values of theta, very close to zero, there is a singularity: the bound goes to zero as theta approaches zero.

I presume that focusing on this area will improve the results. But, of course, there is a limit on how much we can improve them; we cannot drive the error arbitrarily close to zero. For example, if the machine accuracy is 10E-23, it should be clear that even with floating point, very close to zero we will fail.

So the conjecture is that once we focus on, say, the interval [0.1, 0.3], we will get better results and better predictions for these values.

It may be a good time to continue with the more general setting, in which the entire range of theta is mapped to the [-1, 1] interval with the arctan transformation.
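
For concreteness, a minimal sketch of one possible such mapping (only an assumption, not an agreed-on choice; the exact transformation is discussed further below):

import numpy as np

# One possible mapping (an assumption): theta in [0, inf) is sent to
# (4/pi)*arctan(theta) - 1, which lies in [-1, 1).
def theta_to_unit(theta):
    return 4.0 / np.pi * np.arctan(theta) - 1.0

def unit_to_theta(u):
    # inverse of the mapping above
    return np.tan((u + 1.0) * np.pi / 4.0)

theta = np.array([0.01, 0.1, 0.3, 1.0, 10.0])
u = theta_to_unit(theta)
print(u)                 # values in [-1, 1)
print(unit_to_theta(u))  # recovers the original thetas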

yossi-cohen commented 3 years ago

For [0.0, 1.0] I get: issue-16-2

For [0.1, 1.0] I get: issue-16-3

For [0.0, 0.3] I get: issue-16-4

For [0.01, 0.3] (10,000 training samples) I get: issue-16-5

For [0.01, 0.3] (now with 100,000 training samples) I get more or less the same: issue-16-5-1

yossigil commented 3 years ago
  • First, here are the corrected graphs (as you asked).

Two corrections to make the graphs look clearer:

  1. In the ordinary, non-log graph, the Y-scale is so large that you cannot see the horizontal line. Trim Ymax, perhaps manually, or use two scales for the Y axis.

  2. In the logarithmic graph, the scale is OK (naturally), but the origin is not: the Y axis does not cross the X axis at x = 0.

(Try to avoid Python's automatic selection of min, max, and divisions. Two scales for the Y axis are possible; see the sketch below.)
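
A minimal matplotlib sketch of both suggestions, with made-up curves purely for illustration (the CR-bound formula is the one used later in this issue):

import numpy as np
import matplotlib.pyplot as plt

theta = np.linspace(0.01, 1.0, 200)
abs_err = 0.05 + 0.5 * theta**2            # made-up error curve, for illustration only
crb = np.sqrt(2 * theta**2 / 256)          # the CR bound as computed later in this issue

fig, ax = plt.subplots()
ax.plot(theta, abs_err, color="blue", label="abs(error)")
ax.set_ylim(0, 0.6)                        # trim Ymax manually instead of auto-scaling
ax.set_xlim(left=0)                        # make the Y axis cross the X axis at x = 0
ax.set_xlabel("theta")
ax.set_ylabel("abs(error)")

ax2 = ax.twinx()                           # a second Y scale for the (much smaller) CR bound
ax2.plot(theta, crb, color="red", label="CR bound")
ax2.set_ylim(0, 0.1)
ax2.set_ylabel("CR bound")

fig.legend(loc="upper left")
plt.show()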

For [0.0, 1.0] I get:

issue-16-2

Due to scale, only the logarithmic graphs are useful in what follows; I ignore the non-log graphs. Furthermore, a graph with 10,000 points is a subgraph of the graph with 100,000 points, so there is no need to examine the one with 10,000 points if you have one with 100,000 points.

For [0.1, 1.0] I get:

issue-16-3

For [0.0, 0.3] I get:

issue-16-4

For [0.01, 0.3] (10,000 training samples) I get:

issue-16-5

For [0.01, 0.3] (now with 100,000 training samples) I get more or less the same:

issue-16-5-1

  • Regarding the range of theta mapped to the [-1, 1] interval with the arctan transformation: I would first like to talk with you to understand how exactly to perform the experiment (the domain of the lognormal is [0, ∞)).

yossi-cohen commented 3 years ago

After the corrections:

For [0.01, 1.0] (1,000 training samples): issue-16_0 01-1

For [0.01, 0.3] (1,000 training samples): issue-16_0 01-0 3

yossigil commented 3 years ago

I observe some points:

So, to better answer the question of this issue:

yossigil commented 3 years ago

https://www.wisdom.weizmann.ac.il/~yonina/YoninaEldar/Info/sing-fim.pdf

yossigil commented 3 years ago

Try this in Wolfram Alpha: there is a singularity of log-normal at the point theta=0; this is to be expected; don't worry about it.

p[t_, x_] := E^(-(Log[x]/t)^2/2) / (x t Sqrt[2 Pi])
Plot[{p[0.24, x], p[0.25, x], p[0.2, x], p[0.3, x], p[0.4, x], p[0.5, x], p[0.92, x]},
 {x, 0.1, 2}, Filling -> None, PlotRange -> Full,
 PlotLabels -> Placed[{0.24, 0.25, 0.2}, Right]]

The singularity means that as theta approaches zero, the distribution is so peaked that all values are near 1. In particular, if you make theta as small as 0.1, the values it computes are very, very close to 1. How close? Probably closer to 1 than the machine accuracy can distinguish.

This means that when theta is 0.1, we have no chance of learning, not because our algorithm is incorrect, but because the accuracy of the underlying machine may fail us.
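
A quick way to probe how close the samples get to 1 as theta shrinks (a sketch with arbitrarily chosen theta values; scipy's shape parameter s plays the role of theta):

import numpy as np
from scipy.stats import lognorm

eps = np.finfo(float).eps                  # double-precision machine accuracy, ~2.2e-16
for theta in (0.1, 1e-8, 1e-17):           # arbitrary probe values, not from the experiment
    x = lognorm.rvs(s=theta, size=1000, random_state=0)
    dev = np.abs(x - 1).max()
    print(f"theta={theta:g}: max |x - 1| = {dev:.3g}  (below eps: {dev < eps})")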

yossi-cohen commented 3 years ago

You wrote:

Notwithstanding, the results of focusing on the smaller range [0, 0.3] look buggy to me; as I read it, there is a range in [0.2, 0.3] where we consistently beat the CR bound.

I do not see it. Looking at the log scale (right plot), the blue curve is above the CRB.

issue-16_0 01-0 3

yossi-cohen commented 3 years ago

You wrote:

Double-check that the CR bound is computed correctly; shouldn't it be proportional to theta^-2? Or perhaps to theta^-1? The red curve suggests something else.

The CR bound is computed as follows:

import numpy as np

n = 256  # number of observations per training sample
# true_params holds the theta values selected in the range (see below)
CR_bound = np.sqrt(2 * np.square(true_params) / n)

where true_params are the 1000 theta parameters selected in the range.

for theta = 0.2: CR_bound = sqrt(2*0.2^2/n) = 0.0176
for theta = 0.3: CR_bound = sqrt(2*0.3^2/n) = 0.0265
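
The same arithmetic, as a quick check:

import numpy as np

n = 256
for theta in (0.2, 0.3):
    print(theta, np.sqrt(2 * theta**2 / n))  # ~0.0177 and ~0.0265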

Looking at the red line of the left plot, it seems correct.

issue-16_0 01-0 3

yossi-cohen commented 3 years ago

You wrote:

Double check that you are using the MSE, and not the MAE; MAE seems to show on the legend?

It is neither. It is, as the legend says, abs(error), meaning np.abs(pred_params - true_params), a vector of length 1000 (MAE and MSE each produce a single number).
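
For illustration, with made-up vectors of length 3 instead of 1000:

import numpy as np

true_params = np.array([0.10, 0.20, 0.30])         # made-up thetas, for illustration
pred_params = np.array([0.12, 0.19, 0.33])         # made-up predictions

abs_error = np.abs(pred_params - true_params)      # per-theta vector: what the plot shows
mae = abs_error.mean()                             # mean absolute error: a single number
mse = np.mean((pred_params - true_params) ** 2)    # mean squared error: a single number

print(abs_error)   # [0.02 0.01 0.03]
print(mae, mse)    # ~0.02 and ~0.00047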

yossi-cohen commented 3 years ago

Try this in Wolfram Alpha: there is a singularity of log-normal at the point theta=0; this is to be expected; don't worry about it.

p[t_, x_] := E^(-(Log[x]/t)^2/2) / (x t Sqrt[2 Pi]) ...

I'd appreciate your help here (I'm not so familiar with Wolfram Alpha).

That aside, you say:

The singularity means that as theta approaches zero, the distribution is so peaked that all values are near 1. In particular, if you make theta as small as 0.1, the values it computes are very, very close to 1. How close? Probably closer to 1 than the machine accuracy can distinguish.

This means that when theta is 0.1, we have no chance of learning, not because our algorithm is incorrect, but because the accuracy of the underlying machine may fail us.

Here is the output of log-normal for 0.1 (100 observations):

Am I missing something? It doesn't seem to be very close to 1.0

from scipy.stats import lognorm
print( lognorm.rvs(s=0.1, size=100) )
[0.9484 0.9406 1.0933 0.9608 1.0525 1.1123 0.9732 0.8586 1.0642 1.0095
 0.9354 0.8667 1.041  0.9297 0.8484 1.0583 0.8145 1.053  0.9049 0.9288
 1.0114 0.9745 1.2416 0.94   1.0392 0.945  1.0864 1.0106 0.8829 1.191
 0.8916 1.0301 0.9635 0.9984 0.883  1.0546 1.0995 1.0282 0.9179 1.1257
 1.0623 1.0214 0.9586 0.8509 1.109  1.045  0.9269 0.9331 0.9369 0.9452
 1.0244 1.0966 1.0511 1.1618 1.0519 1.0157 0.8823 1.1352 0.9891 1.1091
 0.8304 0.9855 1.1583 0.8517 1.12   1.0481 0.9968 1.0782 1.0912 0.9106
 0.9245 1.0486 0.8928 0.9985 0.9715 0.9756 0.9114 0.9802 0.8717 1.073
 1.0991 1.0253 0.9546 0.9529 1.1074 0.9471 0.8943 1.1018 1.0033 0.8268
 0.9466 0.978  0.9371 0.9665 0.9393 0.9242 1.0208 1.009  1.2042 1.0049]

and here is a histogram of those values: lognormal-0 1

Here is the histogram with theta=0.01: lognormal-0 01

and with theta=0.001: lognormal-0 001
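
For reference, a sketch that reproduces histograms like these (bin count chosen arbitrarily; 100 observations, as in the sample above):

import matplotlib.pyplot as plt
from scipy.stats import lognorm

for theta in (0.1, 0.01, 0.001):        # the three shape values shown above
    x = lognorm.rvs(s=theta, size=100)
    plt.figure()
    plt.hist(x, bins=20)                # bin count chosen arbitrarily
    plt.title(f"lognormal samples, s = {theta}")
    plt.xlabel("x")
    plt.ylabel("count")
plt.show()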

yossi-cohen commented 3 years ago

Limiting y to between 0 and 5: issue-16_0 01-1-limit-5

yossigil commented 3 years ago

To conclude this issue: