Closed jstac closed 1 year ago
@maanasee , I have updated this issue to give a bit more detail.
Would you like to collaborate with @Smit-create and @HumphreyYang on this lecture? It's a little tricky to figure out what to do. Or you can start it yourself and then show it to them?
@maanasee , thanks for your nice work on this problem. Please put together a pull request with what you have so far and state in the PR that it fixes this issue.
We motivate as follows:
The task is to estimate the revenue raised by a wealth tax. The proposed wealth tax is $h(w)$, where $h$ is some function and $w$ is wealth.
With $n$ as the population size, total revenue is
$$ T = \sum_{i=1}^n h(w_i) $$
The problem is that wealth is not observed for all individuals. We only have a sample $w_1, \ldots, w_m$ from $m$ individuals.
One idea for calculating revenue is as follows.
We assume that the wealth of each individual is a draw from a distribution with density $f$.
We estimate $f$ and then approximate $T$ via
$$ T = \sum_{i=1}^n h(w_i) = n \frac{1}{n} \sum_{i=1}^n h(w_i) \approx n \int_0^\infty h(w) f(w) dw $$
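As a rough illustration of this approximation, here is a minimal sketch in Python. Everything specific in it is an assumption made for the example, not something fixed by this issue: the sample is synthetic, the population size `n` is made up, the tax schedule `h` is a hypothetical flat rate above a threshold, and $f$ is taken to be lognormal so that the maximum likelihood estimates are just the mean and standard deviation of log wealth.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the observed sample w_1, ..., w_m (not real data)
m = 1_000
sample = rng.lognormal(mean=0.0, sigma=1.0, size=m)

n = 1_000_000  # hypothetical population size


def h(w, rate=0.02, threshold=1.0):
    """Hypothetical tax schedule: flat rate on wealth above a threshold."""
    return rate * np.maximum(w - threshold, 0.0)


# Maximum likelihood estimates for a lognormal density (an assumed choice of f)
log_w = np.log(sample)
mu_hat = log_w.mean()
sigma_hat = log_w.std()  # ddof=0 gives the MLE


def f_hat(w):
    """Fitted lognormal density."""
    z = (np.log(w) - mu_hat) / sigma_hat
    return np.exp(-0.5 * z**2) / (w * sigma_hat * np.sqrt(2 * np.pi))


# Approximate the integral of h(w) f(w) with the trapezoid rule
# on a log-spaced grid that reaches far into the right tail
grid = np.geomspace(1e-6, 1e6, 20_001)
y = h(grid) * f_hat(grid)
integral = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(grid))

T_hat = n * integral
print(f"mu_hat = {mu_hat:.3f}, sigma_hat = {sigma_hat:.3f}, T_hat = {T_hat:,.0f}")
```

The grid bounds and the number of grid points are arbitrary choices; for a heavier-tailed $f$ the upper bound would need to be checked more carefully.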
If we work with log wealth $\hat w_i = \ln w_i$, with $\hat f$ being the density of $\hat w_i$, then we could calculate as
$$ T = \sum_{i=1}^n h(\exp(\ln w_i)) = n \frac{1}{n} \sum_{i=1}^n h(\exp(\hat w_i)) \approx n \int_{-\infty}^\infty h(\exp(\hat w)) \hat f(\hat w) d\hat w $$
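For reference, the two integrals agree by a change of variables. If $w$ has density $f$ on $(0, \infty)$, then $\hat w = \ln w$ has density $\hat f(\hat w) = f(\exp(\hat w)) \exp(\hat w)$ on the whole real line, and substituting $w = \exp(\hat w)$, $dw = \exp(\hat w) d\hat w$ gives

$$ \int_0^\infty h(w) f(w) dw = \int_{-\infty}^\infty h(\exp(\hat w)) f(\exp(\hat w)) \exp(\hat w) d\hat w = \int_{-\infty}^\infty h(\exp(\hat w)) \hat f(\hat w) d\hat w $$

Note that the log-wealth integral runs over $(-\infty, \infty)$, since log wealth can be negative.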
However, it's probably best to start with the first case (estimate $f$) rather than the second (estimate $\hat f$), since it's easier for students to understand.
The difficulty is: how to choose and estimate $f$?
Now we step through a sequence of density classes, in each case estimating $f$ via maximum likelihood.
I suggest
In both cases, the Wikipedia page gives the maximum likelihood estimates of the parameters.
We will emphasize that different choices of $f$ lead to different values of $T$.
A key message is that we need to model wealth as heavy tailed, since light-tailed and heavy-tailed densities give very different revenue estimates.
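To make the point about tails concrete, here is a small sketch that fits one light-tailed and one heavy-tailed density to the same sample and compares the implied revenue. The choice of exponential versus Pareto is my own assumption for illustration, as are the synthetic data, the population size, and the hypothetical tax schedule $h(w) = \tau \max(w - k, 0)$. For this particular $h$, the integral $\int h(w) f(w) dw$ equals $\tau \int_k^\infty P(W > w) dw$, which has a closed form in both cases.

```python
import numpy as np

rng = np.random.default_rng(1234)

# Synthetic sample from a classical Pareto with x_m = 1, alpha = 1.5
# (rng.pareto draws from the Lomax form, so we shift by 1)
m = 5_000
sample = 1.0 + rng.pareto(1.5, size=m)

n = 1_000_000        # hypothetical population size
tau, k = 0.02, 50.0  # hypothetical tax rate and exemption threshold

# Light-tailed candidate: exponential with rate lam, MLE is 1 / sample mean
lam_hat = 1.0 / sample.mean()
# E[max(W - k, 0)] = exp(-lam * k) / lam
T_exp = n * tau * np.exp(-lam_hat * k) / lam_hat

# Heavy-tailed candidate: Pareto, MLE is x_m = min(w), alpha = m / sum(ln(w / x_m))
xm_hat = sample.min()
alpha_hat = m / np.log(sample / xm_hat).sum()
# E[max(W - k, 0)] = x_m^alpha * k^(1 - alpha) / (alpha - 1), for alpha > 1, k >= x_m
T_pareto = n * tau * xm_hat**alpha_hat * k ** (1 - alpha_hat) / (alpha_hat - 1)

print(f"exponential: lam_hat = {lam_hat:.3f}, T = {T_exp:,.4f}")
print(f"Pareto:      alpha_hat = {alpha_hat:.3f}, T = {T_pareto:,.1f}")
```

Because the tax here only applies above a high threshold, almost all of the revenue comes from the right tail, which is exactly where the two fitted densities disagree most.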
Optionally, we can add in nonparametric kernel density estimation.
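A minimal sketch of what the kernel density option could look like, using only NumPy. The Gaussian kernel, the Silverman rule-of-thumb bandwidth, the synthetic sample, the population size, and the tax schedule `h` are all assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the observed wealth sample (not real data)
m = 500
sample = rng.lognormal(mean=0.0, sigma=1.0, size=m)

n = 1_000_000  # hypothetical population size


def h(w, rate=0.02, threshold=1.0):
    """Hypothetical tax schedule: flat rate on wealth above a threshold."""
    return rate * np.maximum(w - threshold, 0.0)


def trapezoid(y, x):
    """Trapezoid rule for samples y on grid x."""
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))


# Silverman's rule-of-thumb bandwidth
q75, q25 = np.percentile(sample, [75, 25])
bw = 0.9 * min(sample.std(ddof=1), (q75 - q25) / 1.34) * m ** (-1 / 5)

# Evaluate the Gaussian KDE on a grid: an average of normal bumps at each data point
grid = np.linspace(sample.min() - 5 * bw, sample.max() + 5 * bw, 4_001)
z = (grid[:, None] - sample[None, :]) / bw
f_kde = np.exp(-0.5 * z**2).sum(axis=1) / (m * bw * np.sqrt(2 * np.pi))

mass = trapezoid(f_kde, grid)  # should be close to 1
T_kde = n * trapezoid(h(grid) * f_kde, grid)

print(f"bandwidth = {bw:.3f}, total mass = {mass:.4f}, T_kde = {T_kde:,.0f}")
```

One thing worth flagging to students: a Gaussian KDE puts essentially no mass beyond the largest observation, so it cannot extrapolate the right tail the way a fitted heavy-tailed parametric density can.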