yossi-cohen / preferential-attachment


Build the 2D universal estimator. #10

Open yossigil opened 3 years ago

yossigil commented 3 years ago

Input: a two- (or one-) dimensional function that gives the PMF of a univariate distribution, x = f(t1, t2). In Python, this comes as a simple function from Real*Real to Real; a case in point is the log-normal density.
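For concreteness, here is a minimal Python sketch of such an input, assuming the convention that the density is evaluated at an observation x with the two parameters being mu and sigma of a log-normal; the name lognormal_pdf is illustrative only:

import math

def lognormal_pdf(x, mu, sigma):
    """Log-normal density at x > 0, with parameters (t1, t2) = (mu, sigma)."""
    if x <= 0.0 or sigma <= 0.0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-z * z / 2) / (x * sigma * math.sqrt(2 * math.pi))

The estimator itself, in C-like pseudocode: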

void universal(double (*f)(double, double), data) {
  // Define a closure (an auxiliary function based on f);
  // g is only defined on the cube (-pi/2, pi/2)^2.
  double g(double d1, double d2) {
    return f(tan(d1), tan(d2));
  }
  Universal1(g) { // Learn d1 and d2 of g from the data.
    // -1. Set U = the cube (-pi/2, pi/2)^2.
    //  0. Pick the number of sample points; should not be too large... 256, maybe.
    //  1. Generate a dense coverage of the cube.
    //  2. Generate the data for each point in the sample of the parameter space.
    //  3. Apply a DNN to find d1 and d2 for the input data; this may be the final result.
    //  4. Shrink the cube to a cube of half the volume.
    //  5. Forget all that was learned.
    //  6. Repeat from step 1 if the volume of the cube is greater than 1/128 of the original cube.
    // Return the estimate found in the last pass of step 3 above.
  }
}
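A minimal Python sketch of the same zoom-in loop, under stated assumptions: the synthetic data for each parameter point is summarized by its sorted sample, scikit-learn's MLPRegressor stands in for the DNN, and the sampler signature, constants, and helper names are illustrative rather than anything fixed by this issue.

import numpy as np
from sklearn.neural_network import MLPRegressor

def universal(sampler, data, n_points=256, seed=0):
    """Zoom-in estimator for (d1, d2) on the cube (-pi/2, pi/2)^2.

    Assumed (hypothetical) interface: sampler(t1, t2, size, rng) draws `size`
    observations from the distribution with parameters (t1, t2) = (tan(d1), tan(d2));
    `data` is the observed sample whose parameters we want to estimate.
    """
    rng = np.random.default_rng(seed)
    lo = np.array([-np.pi / 2 + 1e-3, -np.pi / 2 + 1e-3])   # keep tan() finite
    hi = np.array([np.pi / 2 - 1e-3, np.pi / 2 - 1e-3])
    side = int(np.sqrt(n_points))
    n_obs = len(data)
    target = np.sort(np.asarray(data)).reshape(1, -1)
    original_volume = np.prod(hi - lo)
    estimate = (lo + hi) / 2
    # Step 6: keep halving the cube until its volume drops to 1/128 of the original.
    while np.prod(hi - lo) > original_volume / 128:
        # Step 1: dense coverage of the current cube.
        grid = np.array([(a, b)
                         for a in np.linspace(lo[0], hi[0], side)
                         for b in np.linspace(lo[1], hi[1], side)])
        # Step 2: synthetic data for each parameter point, summarized as a sorted sample.
        X = np.array([np.sort(sampler(np.tan(a), np.tan(b), n_obs, rng)) for a, b in grid])
        # Step 3: a small network maps summaries back to parameters; apply it to the real data.
        net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
        net.fit(X, grid)
        estimate = np.clip(net.predict(target)[0], lo, hi)
        # Step 4: shrink to a cube of half the volume, centered at the estimate.
        half = (hi - lo) / (2 * np.sqrt(2))
        lo, hi = estimate - half, estimate + half
        # Step 5: the network is discarded ("forget all that was learned") on the next pass.
    return np.tan(estimate)  # back from (d1, d2) to (t1, t2)

With the log-normal example above, sampler could be as simple as lambda mu, sigma, size, rng: rng.lognormal(mu, sigma, size); note that tan maps onto all reals, so a parameter constrained to be positive (such as sigma) would need a different transform, e.g., exp.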

Algorithm:

  1. Transform the parameter space to the unit cube of dimension n = 1, 2, e.g., via 1/2 + arctan(t)/pi, or a sigmoid.
  2. Learn Sigmoid(t1) and Sigmoid(t2) (optionally: learn t1, t2 themselves rather than their sigmoids), using the usual statistical method of training on synthetic data.
  3. Compute the variance of the learning error; it is typically a function of t1 and t2, i.e., the error is not the same everywhere.
  4. Compute the Fisher information: take the log of the density, differentiate with respect to the parameters, square (analytically), and integrate (numerically) over x, ranging over all reals (see the sketch after this list).
  5. Multiply by n (the number of sample points).
  6. Compute the inverse; this is the Cramér–Rao lower bound on the estimator's variance.
  7. Compare it to the actual error variance from step 3.
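A minimal numerical sketch of steps 4-7 for the log-normal running example, assuming the empirical error variances from step 3 are already at hand; the variables err_var, mu, sigma, n and the grid bounds below are illustrative placeholders, and the score is differentiated by finite differences rather than analytically.

import numpy as np

def lognormal_pdf(x, mu, sigma):
    # Vectorized log-normal density for x > 0.
    return np.exp(-(np.log(x) - mu) ** 2 / (2 * sigma ** 2)) / (x * sigma * np.sqrt(2 * np.pi))

def fisher_information(pdf, theta, x_grid, eps=1e-5):
    """2x2 Fisher information of pdf(x, t1, t2) at theta = (t1, t2).

    Step 4: log, differentiate (central differences), square (outer product),
    and integrate over x weighted by the density.
    """
    theta = np.asarray(theta, dtype=float)
    score = np.empty((len(x_grid), 2))
    for j in range(2):
        step = np.zeros(2)
        step[j] = eps
        hi = np.log(pdf(x_grid, *(theta + step)))
        lo = np.log(pdf(x_grid, *(theta - step)))
        score[:, j] = (hi - lo) / (2 * eps)
    weight = pdf(x_grid, theta[0], theta[1])
    integrand = score[:, :, None] * score[:, None, :] * weight[:, None, None]
    dx = x_grid[1] - x_grid[0]
    return integrand.sum(axis=0) * dx            # Riemann sum over the grid

# Illustrative comparison (steps 5-7).
mu, sigma, n = 0.0, 1.0, 256                     # placeholder parameter point and sample size
x_grid = np.linspace(1e-4, 100.0, 200_000)       # "all reals" truncated where the density is negligible
fisher = fisher_information(lognormal_pdf, (mu, sigma), x_grid)
cramer_rao = np.linalg.inv(n * fisher)           # steps 5 and 6
err_var = np.array([0.005, 0.003])               # placeholder: error variances from step 3
print(np.diag(cramer_rao), err_var)              # step 7: bound vs. actual error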