Final Peer Review - arn39

This project aims to predict the value of the VIX, a benchmark for stock market volatility. It uses market indices, macroeconomic data, Google Trends data, and sentiment index data. The project relies on time series data (which we were cautioned against), and the authors make an effort to build predictions from it.

The abstract gives a succinct summary, but it left me wanting a fuller treatment of why market volatility matters and how predicting it actually improves a trading strategy. That would motivate the reader much more; for instance, an example of where the VIX has been used in practice would have helped flesh out their intentions. Overall, though, it is a good project, with a sensible use of models, a real-world application described through a trading strategy, and good data visualization throughout the report.

Strengths: The data visualization is good, and I like the use of graphs to explain the data. Because the authors draw on multiple sources, it was helpful to see what each dataset represents, and the 2008 graph showing the VIX peaking motivates the project well. The appendix adds further depth and was worth reading, since it draws on good sources. The models are well chosen and suitably varied. The feature description and feature engineering section is detailed and well suited to the data, and the data cleanup is a good application of what we learned in class about leveraging messy data.

Weaknesses: I know that the S&P 500 and the Dow Jones Industrial Average are used as indicators of the economy, but they cover only large firms (roughly 500 and 30 companies, respectively). Do they really capture the volatility of the entire United States stock market? I would have liked to see broader coverage, such as the rest of the Nasdaq, or a hypothesis revised to account for this; as it stands, it felt like a slight disconnect. I would also have liked more depth in the evaluation of the results. The preprocessing and processing stages are covered in depth, but the results section contains very little analysis. What do the results of each model mean, and why do you hypothesize you got these scores? The techniques used are all useful, but it would have been nice to see more detail on which one had the most impact. The report should also explain what was done to combat overfitting. I’m also a bit uncomfortable with the use of Random Forest for time series prediction, since a tree ensemble does not model temporal order on its own and cannot predict values outside the range seen in training. There is some oversimplification for something as complex as the VIX. Finally, the report could be formatted better, and the grammatical errors should be fixed.
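To make the overfitting and time series concerns concrete, here is a minimal sketch of one common safeguard: walk-forward (expanding window) validation, where each fold trains only on observations that precede its test window. This is not taken from the authors' code. The function name `walk_forward_splits` and its parameters are hypothetical, and the numbers are purely illustrative.

```python
def walk_forward_splits(n_samples, n_folds, min_train):
    """Yield (train_indices, test_indices) pairs in chronological order.

    Each fold trains on every observation before its test window, so no
    future data leaks into training. All names here are illustrative.
    """
    if min_train >= n_samples:
        raise ValueError("min_train must be smaller than n_samples")
    test_size = (n_samples - min_train) // n_folds
    if test_size < 1:
        raise ValueError("not enough samples for the requested number of folds")
    for k in range(n_folds):
        test_start = min_train + k * test_size
        test_end = test_start + test_size
        # Training set grows with each fold; the test window slides forward.
        yield list(range(test_start)), list(range(test_start, test_end))


# Hypothetical example: 10 daily observations, 3 folds, at least 4 training days.
splits = list(walk_forward_splits(n_samples=10, n_folds=3, min_train=4))
for train_idx, test_idx in splits:
    print(f"train 0..{train_idx[-1]}  test {test_idx[0]}..{test_idx[-1]}")
```

Reporting scores per fold under a scheme like this, instead of a single shuffled split, would show whether a model's performance holds up on data that comes strictly after its training period.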
