Final Report Peer Review - Siyao Gu (sg2238)

By: Siyao Gu, sg2238

The project studies how news sentiment affects market prices. The group first used the Loughran and McDonald dictionary to process text scraped from Thomson Reuters' historical real-time news with natural language processing. They then calculated sentiment scores for different companies from the NLP output using linear regression, n-fold cross validation, Lasso regression and Ridge regression, with data from the Fama French database. Finally, they built trading strategies on asset prices from the WRDS TAQ and WRDS CRSP databases based on the different predicted sentiment scores, hedged the trades with S&P 500 futures, and compared the revenue growth over 8 months. The results suggest that sentiment-based trading works, although the size of the returns remains doubtful because it depends on the assumptions made.

Three things I like about the report:

• The group's topic is worth studying. News-driven strategies have been popular in the finance industry in recent years, and one reason they stay popular is that many issues with them have not yet been properly solved. The group addressed the time the strategy needs to generate a solid result, which has a large influence on its efficiency. According to the report, the group was well prepared for what they planned to implement: they used a relatively "quick" news source, Thomson Reuters, they used a well-built word dictionary, and they kept the time window of their algorithm within 1 to 5 minutes. In other words, the group kept the goal of a fast strategy in mind throughout and obtained a pretty good result in the end.

• The group implemented the CAPM model to capture market beta as well as abnormal returns before creating the trading strategy. This is a common approach in the financial literature, and it is a helpful and necessary step before one moves on to the actual trading algorithm. It is reassuring to see that the distribution of abnormal returns is bell-shaped with mean and median close to 0, which supports the robustness of the market prices in the data and could help confirm the reliability of the final results.

• The group recognized the large number of features involved and implemented Lasso and Ridge regression to improve the result. They mentioned earlier in the report that using the 2690 words in the dictionary as features could cause overfitting, even though they did not have to worry about overfitting in this project. At the end of the report, they penalized large coefficients and, to some extent, reduced the number of features using Lasso and Ridge regression, which gave them a better return. The logic and the connection between these steps were smooth and made sense.
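
To make the regularization point above concrete, here is a minimal sketch of how a Ridge penalty shrinks coefficients when there are many features. It is not the group's code: the data are synthetic, the sizes `n_obs` and `n_words` are hypothetical and far smaller than the 2690-word dictionary, and `ridge_fit` is an illustrative helper written with plain NumPy.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge without intercept: solve (X'X + alpha*I) w = X'y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

# Synthetic stand-in for a word-count matrix (hypothetical sizes).
rng = np.random.default_rng(0)
n_obs, n_words = 200, 50
X = rng.normal(size=(n_obs, n_words))

# Assume only the first 5 "words" truly matter; the rest are noise features.
true_w = np.zeros(n_words)
true_w[:5] = 1.0
y = X @ true_w + rng.normal(scale=0.5, size=n_obs)

# Norm of the fitted coefficient vector for increasing penalties.
# alpha = 0.0 is ordinary least squares.
coef_norms = {
    alpha: float(np.linalg.norm(ridge_fit(X, y, alpha)))
    for alpha in (0.0, 10.0, 1000.0)
}
print(coef_norms)
```

The coefficient norm falls as `alpha` grows. Note that Ridge only shrinks coefficients toward zero, whereas Lasso can set some of them exactly to zero, so only Lasso actually reduces the number of features.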

Three things that may need improvement:

• The choice of benchmark. The group mentioned that they would use cross validation as the criterion for selecting the proper model, but the cross-validation results were not shown in the report. It seems that the group instead compared the returns of the different algorithms and treated the ones with higher returns as better models. However, as stated at the end of the report, the reliability of those returns remains unclear, which means we cannot choose a model simply by how high or low its returns are.

• The returns might not be realistic. Among all the issues the group mentioned that could affect the returns, transaction costs could be the largest. Although the algorithm produces unbalanced buy and sell signals, it is still a high-frequency trading strategy. In addition, the group weighted the assets equally, trading 1 share per company when they tested the algorithm, which could incur a large amount of transaction costs. As a result, returns of 787% or 500% might not be achievable in real settings.

• The unlimited wealth assumption. The group did not mention this in the final report, but from what I saw, they seem to trade under the assumption that they hold unlimited wealth. I believe that is why there can be unbalanced buy and sell signals even though only 1 share of a company's stock is traded in each trade. Moreover, an investor holding unlimited wealth would affect market prices as well. Specifically, the amount of money they hold would change market expectations, and the result would differ from what the report showed.
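
The transaction-cost concern raised in the second point above can be illustrated with simple arithmetic. The sketch below is only an illustration: the 787% gross return is taken from the report, while the number of trades, the flat cost per trade and the initial capital are hypothetical values chosen for the example, and `net_return` is an illustrative helper.

```python
def net_return(gross_return, n_trades, cost_per_trade, initial_capital):
    """Return net of a flat per-trade cost, as a fraction of initial capital."""
    gross_profit = gross_return * initial_capital
    total_cost = n_trades * cost_per_trade
    return (gross_profit - total_cost) / initial_capital

# 787% gross return from the report; the other inputs are hypothetical.
example = net_return(
    gross_return=7.87,
    n_trades=20_000,          # assumed number of one-share trades
    cost_per_trade=1.0,       # assumed flat cost per trade, in dollars
    initial_capital=10_000.0, # assumed starting capital, in dollars
)
print(f"net return: {example:.2%}")
```

Under these assumed inputs, costs of 20,000 dollars are subtracted from a gross profit of 78,700 dollars, so the net return drops from 787% to 587%. A higher trade count or cost per trade would reduce it further.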
