jmbejara / comp-econ-sp18

Main Course Repository for Computational Methods in Economics (Econ 21410, Spring 2018)

Typos in HW 4 #39

Closed. jmbejara closed this issue 6 years ago.

jmbejara commented 6 years ago

Here is a list of known typos and other improvements to make in HW 4. I haven't corrected these particular ones yet. If anyone notices any other typos, please let me know here.

jmbejara commented 6 years ago

This is how I am subsetting based on GQ. I don't include GQ = 0 because the codebook says something about 0 being NIU (not in universe). My comment in the HW should reflect this.

# GQ = 0 for vacant units, 1 for Households, 2 for group quarters
df = df[df.GQ == 1]

afgong commented 6 years ago

In Q15, do you mean compute three correlations? So the correlation between ave_wages and median_wages, ave_wages and employment, and median_wages and employment?

jmbejara commented 6 years ago

There will be a matrix of correlations. The entries of the matrix are the correlations between each pair of variables. There is a single command for this.
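
For reference, a minimal sketch of that single command, assuming the relevant frame has columns named ave_wages, median_wages, and employment as in the question above (the numbers below are made up):

```python
import pandas as pd

# Toy aggregated data with the three columns mentioned above
# (column names taken from the question, values invented).
agg = pd.DataFrame({
    "ave_wages": [52000.0, 55000.0, 58000.0, 61000.0],
    "median_wages": [41000.0, 43000.0, 45500.0, 48000.0],
    "employment": [0.61, 0.63, 0.64, 0.66],
})

# One command: DataFrame.corr() returns the full pairwise correlation matrix.
corr_matrix = agg.corr()
print(corr_matrix)
```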

afgong commented 6 years ago

In Q20, are we looking to space the bins like [25, 30, 35, 40, 45, 50, 55]? Also, is educ_bins supposed to correspond to the codebook values? So educ_bins should be a list of 5 elements?

jmbejara commented 6 years ago
afgong commented 6 years ago

[screenshot]

Is the graph from Q16 supposed to look something like this?

afgong commented 6 years ago

Also, for Q22, how do you remove the average_wage level that sits above Bachelors_Degree in the columns, so that the heatmap in the following question doesn't look like this:

[screenshot]

[screenshot]

jmbejara commented 6 years ago

This is what mine looks like:

[screenshot: q16]

This might help. Here I have run df.describe() at various points.

At Q7: [screenshot: q7]

Before Q11: [screenshot: beforeq11]
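
In case it helps with replicating these checkpoints, here is a self-contained sketch of the idea; the data and the filtering step are invented, not the actual HW code:

```python
import pandas as pd

# Made-up stand-in for the HW data.
df = pd.DataFrame({"real_wage": [12.0, 35.5, 72.0, 900.0],
                   "age": [25, 40, 33, 61]})

# Checkpoint: summary statistics before any further cleaning.
print("checkpoint 1")
print(df.describe())

# An illustrative cleaning step, then another checkpoint, so that
# intermediate results can be compared with someone else's output.
df = df[df["real_wage"] < 500]
print("checkpoint 2")
print(df.describe())
```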

jmbejara commented 6 years ago

With respect to multiindexing, you can do this:

[screenshot: q21_1]

[screenshot: q21_2]
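
The screenshots above didn't survive, so here is only a rough, hypothetical sketch of what a pivot with multi-indexed columns can look like; the column and category names are invented, and this is not necessarily the approach shown in the lost images:

```python
import pandas as pd

# Invented example data: one row per (year, education) cell.
data = pd.DataFrame({
    "year": [2000, 2000, 2010, 2010],
    "educ": ["High_School", "Bachelors_Degree", "High_School", "Bachelors_Degree"],
    "average_wage": [31000.0, 52000.0, 33000.0, 58000.0],
})

# pivot_table with a list of values creates a column MultiIndex:
# level 0 is the statistic name, level 1 is the education category.
wide = data.pivot_table(index="year", columns="educ", values=["average_wage"])
print(wide.columns)

# Dropping the outer level leaves plain education columns, which is
# handy before plotting (e.g., the heatmap discussed above).
flat = wide.droplevel(0, axis=1)
print(flat)
```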

jmbejara commented 6 years ago

To change the order of the columns, I am doing it manually like this:

[screenshot]
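
The image above isn't preserved; a generic sketch of the manual approach, with placeholder column names:

```python
import pandas as pd

df = pd.DataFrame({"b": [1, 2], "c": [3, 4], "a": [5, 6]})

# Reorder manually by indexing with an explicit list of column names.
df = df[["a", "b", "c"]]
print(df.columns.tolist())  # ['a', 'b', 'c']
```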

afgong commented 6 years ago

Yeah, our summary statistics are diverging very mildly at Q7... I'll go back and double-check what's going on. This is what I have right now from Q4 and Q5, respectively:

[screenshot]

[screenshot]

jmbejara commented 6 years ago

Everything here looks good to me. Hmm, I don't know. What are you getting for df.describe() at the point where it diverges?

afgong commented 6 years ago

Q7:

[screenshot]

jmbejara commented 6 years ago

I think you need to rerun your code from the beginning. My Q7 real_wage max is much larger. It looks like you have already dropped the observations described in Q10 at this point.

Jacob-Bishop commented 6 years ago

What is the employment variable supposed to measure? My assumption was LABFORCE, but we dropped that variable earlier on in the code.

jmbejara commented 6 years ago

employment was created from the variable in_labor_force. This is because LABFORCE was a variable equal to 0, 1, or 2, while the variable we created is True or False (1 or 0).
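
A sketch of that recode with toy data, assuming the usual IPUMS convention that LABFORCE == 2 means "yes, in the labor force" (the HW's exact recode may differ):

```python
import pandas as pd

# Toy stand-in for the HW data; LABFORCE coded 0, 1, or 2 as in the thread.
df = pd.DataFrame({"LABFORCE": [0, 1, 2, 2, 1]})

# Recode to a True/False indicator, assuming 2 = "yes, in the labor force".
df["in_labor_force"] = df["LABFORCE"] == 2

# employment as a 0/1 version of the same indicator.
df["employment"] = df["in_labor_force"].astype(int)
print(df)
```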

Jacob-Bishop commented 6 years ago

Okay. Do you want it to be the average (fraction employed) or the sum (total employed)?

afgong commented 6 years ago

@jmbejara Hmm, I don't know what's going on. At which question(s) did you drop the missing values? I dropped them (df.dropna(axis=0, how='any')) at the end of my code at Q4, or at the beginning of my code at Q5.

jmbejara commented 6 years ago

@Jacob-Bishop I was looking for the fraction. Also, be sure to take a weighted average.
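
A minimal sketch of a weighted fraction by group, assuming the weight column is the IPUMS person weight PERWT (the HW's weight variable may have a different name) and that employment is the 0/1 indicator discussed above:

```python
import pandas as pd

# Toy data: employment is the 0/1 indicator, PERWT is a person weight.
df = pd.DataFrame({
    "year": [2000, 2000, 2010, 2010],
    "employment": [1, 0, 1, 1],
    "PERWT": [100.0, 50.0, 80.0, 120.0],
})

# Weighted fraction employed by year:
# sum(indicator * weight) / sum(weight) within each group.
df["wt_emp"] = df["employment"] * df["PERWT"]
totals = df.groupby("year")[["wt_emp", "PERWT"]].sum()
weighted_fraction = totals["wt_emp"] / totals["PERWT"]
print(weighted_fraction)
```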

jmbejara commented 6 years ago

@afgong Sorry for this confusion. I have updated my code so that it only drops missing values at the specific points where I say to drop them in the problem descriptions. This changes my Q7 df.describe() output to the following:

[screenshot: q7_df_describe]

At this point, I only drop rows at the end of Q6, calling df = df.dropna().

Sorry about this. If your answers look reasonably close, I wouldn't worry too much about it. I've instructed Philip to be generous with the grading in this regard. (Also, it's been interesting to me how little things like this can make replication so challenging.)

afgong commented 6 years ago

Thank you so much!!!