Final Report Review by Liyun Wang

Our project is very similar to yours. However, your dataset is messier and has a limited number of features for each house.

The group has put a lot of effort into cleaning the dataset and dealing with erroneous data points. I like that you used histograms to show the problems within the dataset. These problems are hard to detect just by looking at the data table, whereas a histogram highlights them directly. I really appreciate that you did research before setting a standard for data cleaning. It is very professional of your group to back up your decisions with research and evidence.

During the model fitting phase, you plotted a graph of predicted price/actual price for each model. I really like the way you present the result of each model. It not only tells me how well the model fits the data in general but also shows whether the model tends to overestimate or underestimate housing prices. From the graphs shown, it is clear that Huber loss gives the most accurate and balanced predictions of all the models.
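
To complement the plots, the same over/under-estimation check could be condensed into a few numbers. The following is only a minimal sketch, assuming the plotted quantity is the ratio of predicted to actual price; the function name `ratio_summary` and the prices are made up for illustration and are not taken from the report.

```python
from statistics import median

def ratio_summary(predicted, actual):
    """Summarize predicted/actual price ratios (hypothetical helper)."""
    ratios = [p / a for p, a in zip(predicted, actual)]
    return {
        # A median ratio above 1 suggests a tendency to overestimate.
        "median_ratio": median(ratios),
        # Share of houses priced too high and too low by the model.
        "share_over": sum(r > 1 for r in ratios) / len(ratios),
        "share_under": sum(r < 1 for r in ratios) / len(ratios),
    }

# Illustrative values only, not results from the project.
actual = [200_000, 350_000, 500_000, 275_000]
predicted = [220_000, 332_500, 500_000, 302_500]
summary = ratio_summary(predicted, actual)
print(summary)
```

Comparing these summaries across models would give a quick numeric counterpart to the visual comparison.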

There is only one thing I want to point out. The results show that the model predicts well, but I am not sure how well the model generalizes. Maybe you can add a cross-validation step in the future.
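
The cross-validation step suggested above could look roughly like the sketch below. It is not your pipeline: it assumes a single hypothetical feature (living area), synthetic prices, and a plain least-squares line standing in for your models, and it only illustrates how k-fold splitting gives an error estimate on data the model has not seen.

```python
import random
from statistics import fmean

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def k_fold_rmse(xs, ys, k=5, seed=0):
    """Return the held-out RMSE for each of k folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal parts
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        # Fit only on the training folds, then score on the held-out fold.
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        squared = [(slope * xs[i] + intercept - ys[i]) ** 2 for i in held_out]
        scores.append((sum(squared) / len(squared)) ** 0.5)
    return scores

# Synthetic data for illustration only (area in square meters, price in dollars).
rng = random.Random(42)
areas = [rng.uniform(50, 250) for _ in range(100)]
prices = [2000 * a + 50_000 + rng.gauss(0, 10_000) for a in areas]
fold_rmse = k_fold_rmse(areas, prices, k=5)
print([round(s) for s in fold_rmse])
```

If the held-out errors are close to the training error and similar across folds, that would be some evidence that the model generalizes.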
