PSLmodels / taxdata

The TaxData project prepares microdata for use with the Tax-Calculator microsimulation project.
http://pslmodels.github.io/taxdata/

Asset, wealth and capital imputations #322

Open codykallen opened 5 years ago

codykallen commented 5 years ago

Given a serious focus by candidates on the taxation of various types of wealth, I believe we need to impute detailed information on assets, liabilities, and capital transactions.

I believe we should first consider how much information we need to impute, in order to accurately or reasonably model policy proposals, at least on a static basis. Such proposals could include a tax on net wealth, a property tax on land and real estate, changing the classification of short-term and long-term capital gains, indexing capital gains to inflation, subsidies or additional tax breaks based on student loan debt, changes in mortgage interest deductibility, and changes to tax-preferred savings accounts.

To actually model these, I believe we would need to impute data based on two sources. The first source, for actual wealth, asset and liability information, should come from the Survey of Consumer Finances, which has detailed information on different types of assets and liabilities. I believe we would need to impute at least a dozen variables, and possibly more.

However, the SCF has some unusual statistical properties. The dataset covers only about 6,000 households, with 5 observations (implicates) per household. The multiple observations arise because the SCF uses multiple imputation to fill in missing information, so each household appears once per imputed draw. And, not surprisingly, reported incomes differ between the SCF and the IRS.
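To illustrate how the multiple observations per household might be handled, here is a minimal sketch that computes a weighted mean separately within each set of imputed observations and then averages the results. The record layout and the field names (`implicate`, `weight`, `net_worth`) are assumptions for illustration, not actual SCF variable names, and the toy data uses two observations per household rather than five.

```python
from collections import defaultdict


def implicate_weighted_means(records, value_key,
                             weight_key="weight", implicate_key="implicate"):
    """Return {implicate id: weighted mean of value_key} for each implicate.

    Field names are hypothetical; real SCF files use different variable names.
    """
    totals = defaultdict(float)
    weights = defaultdict(float)
    for rec in records:
        imp = rec[implicate_key]
        totals[imp] += rec[weight_key] * rec[value_key]
        weights[imp] += rec[weight_key]
    return {imp: totals[imp] / weights[imp] for imp in totals}


def combined_estimate(records, value_key):
    """Average the per-implicate weighted means into one point estimate."""
    per_implicate = implicate_weighted_means(records, value_key)
    return sum(per_implicate.values()) / len(per_implicate)


# Toy data (made up): two households, two imputed observations each.
records = [
    {"household": 1, "implicate": 1, "weight": 100.0, "net_worth": 50_000.0},
    {"household": 1, "implicate": 2, "weight": 100.0, "net_worth": 70_000.0},
    {"household": 2, "implicate": 1, "weight": 300.0, "net_worth": 10_000.0},
    {"household": 2, "implicate": 2, "weight": 300.0, "net_worth": 10_000.0},
]

estimate = combined_estimate(records, "net_worth")
print(estimate)
```

Treating each set of imputed observations as its own complete dataset avoids counting a household five times in a single estimate. This sketch covers point estimates only and says nothing about standard errors.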

Regarding imputation from the SCF, I think the following publications may be helpful:

For proposals related to capital gains, I don't think we have any usable microdata. Instead, we may have to impute the basis and holding time for capital assets sold (for those with capital gain or loss), using the IRS tables for the Sales of Capital Assets.
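One possible shape for such an imputation, sketched under strong assumptions: draw a holding-period category for each record from aggregate shares, and back out basis from the reported gain using an assumed basis-to-sale-price ratio. The shares and the ratio below are placeholders, not figures from the IRS tables, and the function names are hypothetical.

```python
import random

# PLACEHOLDER shares of transactions by holding period; not IRS figures.
HOLDING_SHARES = {"short_term": 0.3, "long_term": 0.7}


def impute_holding_period(n_units, shares, seed=0):
    """Draw one holding-period category per unit, proportional to shares."""
    rng = random.Random(seed)  # seeded so the imputation is reproducible
    categories = list(shares)
    weights = [shares[c] for c in categories]
    return rng.choices(categories, weights=weights, k=n_units)


def impute_basis(gain, basis_ratio):
    """Back out basis from a reported gain (or loss).

    Assumes gain = sale_price - basis and basis_ratio = basis / sale_price,
    so sale_price = gain / (1 - basis_ratio).
    """
    if basis_ratio == 1.0:
        raise ValueError("basis_ratio of 1 implies zero gain; basis is undetermined")
    sale_price = gain / (1.0 - basis_ratio)
    return sale_price * basis_ratio


holding = impute_holding_period(5, HOLDING_SHARES, seed=42)
basis = impute_basis(gain=25.0, basis_ratio=0.75)
print(holding, basis)
```

In practice the shares and ratios would presumably vary by asset type and income group, so the draw would be conditioned on whatever characteristics the aggregate tables report.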

Before doing the imputations of potentially many variables, I think it would be useful to discuss potential approaches to it, figure out which variables we need to impute, and establish a mechanism for imputing any further variables from these datasets in the future.

@MaxGhenis @donboyd5 @MattHJensen @martinholmer @andersonfrailey @hdoupe

andersonfrailey commented 5 years ago

Thanks for bringing this up, @codykallen. I agree that this would be a valuable addition to TaxData.

@codykallen said:

Before doing the imputations of potentially many variables, I think it would be useful to discuss potential approaches to it, figure out which variables we need to impute, and establish a mechanism for imputing any further variables from these datasets in the future.

These all seem like reasonable first steps. I'll have our intern put together a summary of what the candidates have proposed so far with regard to a wealth tax, to start the process of figuring out what we might want to focus on imputing first.

codykallen commented 5 years ago

@andersonfrailey said:

I'll have our intern put together a summary of what the candidates have proposed so far with regards to a wealth tax

For distributing the corporate income tax, I would like several important variables and one potentially useful one (the last on the list).

All of these are either in the SCF or can be computed from SCF variables. It may also be useful to get some data on households' ownership of businesses (in particular, pass-throughs), but this is less important.