jmbejara / comp-econ-sp19

Main Course Repository for Computational Methods in Economics (Econ 21410, Spring 2019)
48 stars 26 forks source link

Hw 5, Question 10 and 11 #42

Closed erineidschun closed 5 years ago

erineidschun commented 5 years ago

For #10, I am getting a compile error, " invalid callable given" pointing to the integrate.quad lines. My code is:

f = pdf(x, mu_MLE, sig_MLE) integrate.quad(f, 100000, np.inf) integrate.quad(f, -np.inf, 75000)

where x= np.linspace(.001, 150_000, 100) (Question 3) mu_MLE and sig_MLE are generated from optimize.minimize pdf is given by:

def pdf(xvals, mu, sigma): pdf_vals= (1/(xvals sigma np.sqrt(2 * np.pi))

for #11, I am also getting a compile error, "too many values to unpack (expected 2)" pointing to the results = opt.minimize function:

initial guesses

beta_0 = 12
beta_age = 0.8 # sig_2 beta_children = 1 beta_temp = 2 sigma = 0.5 params_init2 = np.array([beta_0, beta_age, beta_children, beta_temp, sigma]) mle_args2 = df2[df2.columns[0]] #gets the sick column results = opt.minimize(crit, params_init2, args=(mle_args2)) beta_0_MLE, beta_age_MLE, beta_children_MLE, beta_temp_MLE, sigma_MLE = results.x

Thanks for your help!

jmbejara commented 5 years ago

For your first question, the problem is that f is not a "callable". This means that it you're not providing the integration function with a function. Try using lambda to create quickly make a function to pass to the integration function. https://www.w3schools.com/python/python_lambda.asp

jmbejara commented 5 years ago

For the second one, print out results.x on its own to see what comes out. It should be an array. Apparently, it's an array with only two elements. Though, as your code suggests, it should be outputing values for 5 parameters: beta_0_MLE, beta_age_MLE, beta_children_MLE, beta_temp_MLE, sigma_MLE My guess is that the crit function must be incorrectly specified.

Lemme know if that helps!

erineidschun commented 5 years ago

I am still not understanding #10:

f = lambda x, mu_MLE, sig_MLE: pdf(x, mu_MLE, sig_MLE) result = integrate.quad(f, 100000, np.inf,args=(x, mu_MLE, sig_MLE)) result

The first line I wrote seems pretty redundant, I'm not sure exactly what to write here. I get a compile error pointing to the integrate.quad line saying that "pdf() takes 3 positional arguments but 4 were given", which I don't see, because I only specified the arguments (x, mu_MLE, sig_MLE). When I remove x from args, I get results = (0.22986684130033777, 2.8432054363305173e-10), which i'm not even sure how to interpret, as I have two values and not one.

erineidschun commented 5 years ago

For #11, I can't output results.x because this is exactly the line where I get an error. My code for crit is exactly like yours you wrote in class, except that I've taken out args because we no longer have the cutoff variable.

def crit(params, xvals): mu, sigma = params log_lik_val = log_lik_norm(xvals, mu, sigma) neg_log_lik_val = -log_lik_val return neg_log_lik_val

However, I think I have to create a different crit function because we now have beta 0 - beta 3 and sigma, not just mu and sigma. I create crit2, log_lik_norm2, and pdf2. I have no idea how to create the pdf2 function, as we no longer have a mu. This makes me think my approach is completely wrong, and I'm not sure what to do:

def crit2(params, xvals): b0, b1, b2, b3, sigma = params log_lik_val = log_lik_norm2(xvals, b0, b1, b2, b3, sigma) neg_log_lik_val = -log_lik_val return neg_log_lik_val

def log_lik_norm2(xvals, b0, b1, b2, b3, sigma): pdf_vals = pdf2(xvals, b0, b1, b2, b3, sigma) ln_pdf_vals = np.log(pdf_vals) log_lik_val = ln_pdf_vals.sum() return log_lik_val

No idea what to do for pdf2, because I don't have a mu

def pdf2(xvals, b0, b1, b2, b3, sigma): pdf_vals= (1/(xvals sigma np.sqrt(2 * np.pi))

jmbejara commented 5 years ago

In the question, you'll need to integrate the PDF. f should be the PDF.

When I remove x from args, I get results = (0.22986684130033777, 2.8432054363305173e-10), which i'm not even sure how to interpret, as I have two values and not one.

The two results are the integral and an estimate of the error. From the documentation (https://docs.scipy.org/doc/scipy/reference/generated/scipy.integrate.quad.html), we see that the function returns this the integral and

abserr : float An estimate of the absolute error in the result.

jmbejara commented 5 years ago

However, I think I have to create a different crit function because we now have beta 0 - beta 3 and sigma, not just mu and sigma.

Yep! That's right. The key is that the epsilon in the regression equation is distributed with a normal distribution and its mean is assumed to be zero.

Also, make sure to use a normal distribution instead of a log-normal distribution.

erineidschun commented 5 years ago

For #10, I get a value for result = integrate.quad(pdf, 100000, np.inf,args=(mu_MLE, sig_MLE))

but nan values when I do integrate.quad(pdf, -np.inf, 75000,args=(mu_MLE, sig_MLE)). It seems to be "The occurrence of roundoff error is detected, which prevents the requested tolerance from being achieved. The error may be underestimated. warnings.warn(msg, IntegrationWarning)"

Why is this the case? Also I still don't understand why I do not need to specify "x" in the args argument.

jmbejara commented 5 years ago

Regarding Q10:

With computational problems like this, you don't want to use np.inf. The computer won't be able to handle it correctly. Use a really small number instead. In this problem, however, the log-normal distribution has support on the open interval from 0 to infinity. So, the integral should start from zero.

erineidschun commented 5 years ago

Thank you!

henryli78 commented 5 years ago

If I can jump in and ask another question about 11:

I think I got a form of the PDF and wrote suitable crit functions. However, I'm having trouble with my initialization because I keep running into divide-by-zero errors (because np.exp(-epsilon**2)) seems to go very quickly to 0.

I'm trying to use beta0 = beta1 = beta2 = beta3 = 0 and sigma = 1, and my convergence is at beta0_MLE= 0.25164632214643556 beta1_MLE= 0.012933354316487875 beta2_MLE= 0.4005019928700137 beta3_MLE= -0.009991674836210234 sig_MLE= 0.004267630534263941, but I always get this error,

Users/HenryLaptop/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:10: >RuntimeWarning: divide by zero encountered in log # Remove the CWD from sys.path while we load stuff. /Users/HenryLaptop/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:10: >RuntimeWarning: divide by zero encountered in log # Remove the CWD from sys.path while we load stuff.

and when I call the optimization object, results, I see that

message: 'Desired error not necessarily achieved due to precision loss.' ... success: False

Any ideas?

jmbejara commented 5 years ago

@henryli78 You can try different optimization algorithms. You can see the available methods here: https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html Some of these methods require you to input a function for the Jacobian (for which I recommend https://github.com/HIPS/autograd to automatically compute it).)

However, the result that you're getting is fine (precision error notwithstanding). You can verify this when you run OLS.