Improve sales validation heuristics

dfsnow commented 1 year ago

Outline

So far, work in this repository has focused on functionalizing the sales validation code and building infrastructure to support it. Now we need to revisit and improve the sales validation heuristics themselves. This will involve a lot of EDA to discover sales that are obviously non-arms-length, as well as the heuristic or statistical methods to flag them. We can also confer with Valuation analysts to develop heuristics.

Suggestions

I would start by looking at the count by outlier type of the current output. For outlier types that have a very low count, we should investigate why. It could be that there are genuinely very few non-arms-length sales of that type, but it is much more likely that we simply need to adjust the thresholds associated with the heuristic. Altering the regex for family/institutional flagging would be another easy place to start.
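As a starting point, the count by outlier type could be tallied along these lines. This is only a sketch: the sample labels are hypothetical stand-ins for the values of `sv_outlier_type` in the current output, and `rare_outlier_types` and its `min_count` cutoff are illustrative names, not part of the existing code.

```python
from collections import Counter

# Hypothetical sample of labels; in practice these would be the
# sv_outlier_type values from the current flagging output.
outlier_types = [
    "Non-person sale (high)",
    "Non-person sale (high)",
    "High price swing",
    "Family sale (low)",
    "Non-person sale (low)",
    "Non-person sale (high)",
]


def rare_outlier_types(values, min_count):
    """Return (type, count) pairs seen fewer than min_count times, rarest first."""
    counts = Counter(values)
    rare = [(t, n) for t, n in counts.items() if n < min_count]
    # Sort by count, then by name so the order is deterministic.
    return sorted(rare, key=lambda pair: (pair[1], pair[0]))


counts = Counter(outlier_types)
for outlier_type, n in counts.most_common():
    print(f"{outlier_type}: {n}")

# Types below the cutoff are candidates for a closer look at their thresholds.
rare = rare_outlier_types(outlier_types, min_count=2)
```

Types that land in `rare` are the ones worth investigating first, either because such sales are genuinely uncommon or because the threshold is too strict.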

Some other suggestions:

Lower thresholds for price swings (they seem very high to me currently)
Revisit regex for family/institutional sales
Try combining some heuristics with the PTAX203 flags
Featurize and use other inputs (e.g. recent char update)

We can also add brand new heuristics if we find any that are both appropriate and powerful enough.
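For the regex suggestion, a rough sketch of name-based flagging might look like the following. Everything here is an assumption for illustration: the keyword list, the function names, and the naive "last token is the surname" rule are not taken from the current code, and the real pattern may differ.

```python
import re

# Hypothetical keyword list for institutional (non-person) names.
NON_PERSON_PATTERN = re.compile(
    r"\b(LLC|INC|CORP|TRUST|BANK|ASSOCIATION|PROPERTIES|HOLDINGS)\b",
    re.IGNORECASE,
)


def is_non_person(name):
    """True if the name contains one of the institutional keywords."""
    return bool(NON_PERSON_PATTERN.search(name))


def last_name(name):
    """Naive surname guess: the final whitespace-separated token, upper-cased."""
    tokens = name.strip().upper().split()
    return tokens[-1] if tokens else ""


def is_possible_family_sale(seller, buyer):
    """True if both parties look like people and share a surname."""
    if is_non_person(seller) or is_non_person(buyer):
        return False
    seller_last, buyer_last = last_name(seller), last_name(buyer)
    return seller_last != "" and seller_last == buyer_last


print(is_non_person("Acme Holdings LLC"))
print(is_possible_family_sale("John Smith", "Jane Smith"))
```

The word boundaries (`\b`) keep a keyword like `TRUST` from matching inside a surname such as "Trusty". A real version would also need to handle suffixes, joint owners, and name order.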

Damonamajor commented 1 year ago

@dfsnow @wagnerlmichael @ccao-jardine

Chart & Descriptions

Flag SD Chart.xlsx

Attached is an Excel file with two charts (tables). It demonstrates the impact that reducing the standard deviation thresholds would have on our sales validation process. In particular, it sheds light on the hypothesis that the current heuristics are too conservative.

The first table counts the flags for each type under different heuristic settings. The second table, which I recommend reviewing, documents the difference from the original.

The columns are grouped in four categories:

Base Model with an upper bound of 3 and a lower bound of 2 for our internal flagging, and an upper bound of 1 and a lower bound of 1 for p-tax flags.
Changes to the upper bound for our internal flagging.
Changes to the lower bound for our internal flagging.
Changes to both the upper and lower bound for the p-tax flags.

Rows are ordered by decreasing count for our current heuristics.
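To make the upper and lower bounds concrete, here is a minimal sketch of asymmetric standard deviation flagging. It assumes each sale has already been reduced to a single price-like value and that deviations are measured as z-scores over one group of sales; which quantity is actually standardized, and how groups are formed, is not specified here. The function names are illustrative.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize values using the sample mean and sample standard deviation."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def classify(z, upper_sd=3.0, lower_sd=2.0):
    """Flag 'high' above upper_sd, 'low' below -lower_sd, otherwise 'none'."""
    if z > upper_sd:
        return "high"
    if z < -lower_sd:
        return "low"
    return "none"


def count_flags(zs, upper_sd, lower_sd):
    """Count high and low flags for a list of z-scores under the given bounds."""
    labels = [classify(z, upper_sd, lower_sd) for z in zs]
    return {"high": labels.count("high"), "low": labels.count("low")}


# Made-up z-scores, used only to show how the bounds interact.
example_z = [3.5, 2.5, 0.0, -1.5, -2.5]
base = count_flags(example_z, upper_sd=3, lower_sd=2)
tighter = count_flags(example_z, upper_sd=2, lower_sd=2)
print(base, tighter)
```

Lowering only the upper bound adds high flags and leaves the low flags untouched, which is the behavior the column groups in the table are meant to isolate.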

Takeaways

The changes in heuristics operate as expected. When standard deviations are decreased, there is an increase in flagged observations for the desired outputs. For example, decreasing the upper bound from 3 standard deviations to 2 standard deviations increases the non-person sale (high) from 799 -> 2068. This has no impact on non-person sale (low).
Decreasing the upper bound from 3 to 2 (equal to our current lower bound) increases the total flagged observations by 171% (3074 -> 8316). -- This resonates with what Michael said about the skew being shifted to the right. -- I would be interested in mapping these observations next week, to see if they are in particular neighborhoods.
Our flags for high value sales drastically outnumber our low value sales (2134 to 940). This exists even in our current setup where the high SD is 3 and the low SD is 2 (representing an existing skew).
Non-person sales are our most successful internal heuristic. It would be interesting to subset this into the possible categories of: -- non-person -> person -- person -> non-person -- non-person -> non-person.
There is very little overlap between our flagging and the p-tax flagging. Increasing the standard deviations to 30 for p-tax (and thus eliminating all observations) only increases our flagged observations by 313. This means that the heuristic is successfully identifying a new type of non-arms-length sale, rather than expanding on an existing heuristic. The same pattern is seen when the SD is reduced to 0.1.
Our family sale heuristic does not appear to be particularly useful. -- This may relate to very low value sales already being excluded from the dataset. -- Or it could be because it is last/first in the heuristic process. This is not urgent, but it would be interesting to see how this chart changes when all relevant flags are counted for a single property. At the moment, flags are simply categorical, based on sv_outlier_type. Thus, if the family sale flag is being overwritten by other flags, its impact is not fully seen.
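The overwriting concern in the last takeaway can be illustrated with a small sketch. The priority order and the sample sales below are hypothetical, chosen only to show how a single categorical label can hide a flag that still fires; the actual order in the heuristic process may differ.

```python
from collections import Counter

# Hypothetical priority order: earlier entries win when several flags apply.
PRIORITY = ["PTAX-203 flag", "Non-person sale", "Family sale"]


def single_label(flags):
    """Collapse a set of flags to one label, as a categorical column would."""
    for name in PRIORITY:
        if name in flags:
            return name
    return "Not outlier"


# Made-up sales, each with the full set of flags that fired.
sales = [
    {"Non-person sale"},
    {"PTAX-203 flag", "Family sale"},
    {"Non-person sale", "Family sale"},
    {"Family sale"},
    set(),
]

# Count using one label per sale (the current categorical view).
categorical_counts = Counter(single_label(flags) for flags in sales)

# Count every flag that fired, regardless of priority.
all_flag_counts = Counter(name for flags in sales for name in flags)

print(categorical_counts["Family sale"], all_flag_counts["Family sale"])
```

In this toy data the family sale flag fires three times but shows up only once in the categorical count, so comparing the two tallies would show how much of the heuristic's impact is being masked.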

dfsnow commented 1 year ago

This is excellent, thanks @Damonamajor. Please attach any other findings to this issue.

We're putting this on hold for the time being in order to get an export ready for sending to iasWorld. It's possible we will revisit this issue before the end of the year or early next year.

Damonamajor commented 1 year ago

Below is a link to a OneDrive folder with two maps and one Excel file. The included README provides a brief description and sums up my takeaways.

Concluding Thoughts

@wagnerlmichael @ccao-jardine @dfsnow

dfsnow commented 1 year ago

Adding that we should include Question 9 from the PTAX-203 output to the possible PTAX flags.

dfsnow commented 7 months ago

@Damonamajor @wagnerlmichael Backburner for now, but I'd like to revisit this later in the year before modeling for 2025.

ccao-data / model-sales-val