UBC-MDS / data-analysis-review-2021

Submission: GROUP 30: Stock Search Trend & Return Volatility Association Analysis #27

Open BooleanJulien opened 2 years ago

BooleanJulien commented 2 years ago

Submitting authors: Amir Shojakhani, Helin Wang, Julien Gordon

Repository: https://github.com/UBC-MDS/Stock-Price-Trend-Volatility-Analysis

Report link: https://github.com/UBC-MDS/Stock-Price-Trend-Volatility-Analysis/blob/main/doc/Stock_Price_Trend_Volatility_Analysis_report.md

Abstract/executive summary:

Investment firms are increasingly looking to data science and unusual data sources to provide informational advantages to bolster their portfolio strategies. In this project, we are investigating whether Google Trends data on stock ticker names can provide insight into return volatility**. Investors are often interested in understanding the volatility of stock returns. Some financial derivative trading strategies try to take advantage of changes in a stock's volatility, as certain options are sensitive to changes in implied volatility. See a primer on option vega if you are interested: https://www.investopedia.com/terms/v/vega.asp

Consider this project a screening exercise for whether Google Trends could be useful in volatility-based trading strategies.

In order to assess the association between stock return volatility and search trend volatility, we analyse the standard deviation of weekly search trends and weekly returns for over 300 stocks in the S&P 500 over a one-year period from July 2020 to July 2021. We conduct a simple linear regression with a confidence level of 0.95 with the return volatility as the dependent variable and search trends volatility as the independent variable. Our null hypothesis is that there is no association between the two volatilities, with the alternative being that there is an association.

Ultimately, we find a significant coefficient of trend volatility and reject the null hypothesis in favour of the alternative. The R^2 value indicates that our simple model is explaining very little of the variation in return volatility. Moreover, the effect size seems to be fairly small in relation to the range of return volatility that we observe in the data. These caveats are to be expected considering we are using a very simple model to understand markets which contain lots of complexity. Nonetheless, this positive result is exciting and warrants future investigation into the use of Google Trends for Financial Analysis.

**Note that in statistical terms, the volatility is simply the standard deviation of returns. https://www.investopedia.com/terms/v/volatility.asp
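The procedure described in the abstract can be sketched roughly as follows, using only the Python standard library. This is a minimal illustration and not the project's actual code: the ticker names and weekly numbers are invented, the function names `volatility` and `simple_ols` are hypothetical, and the 1.96 cutoff is the large-sample normal approximation for a two-sided test at the 0.95 confidence level.

```python
import math
import statistics


def volatility(values):
    """Volatility as the sample standard deviation of a weekly series."""
    return statistics.stdev(values)


def simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns the slope, intercept, R^2, and the t-statistic of the slope.
    """
    n = len(x)
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1 - sse / syy
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    return slope, intercept, r_squared, slope / se_slope


# Invented weekly series for five hypothetical tickers (not project data).
weekly_search = {
    "AAA": [40, 42, 41, 43, 40],
    "BBB": [10, 60, 25, 80, 15],
    "CCC": [55, 50, 65, 45, 60],
    "DDD": [30, 31, 29, 30, 32],
    "EEE": [20, 70, 35, 90, 10],
}
weekly_return = {
    "AAA": [0.010, -0.005, 0.004, 0.002, -0.003],
    "BBB": [0.050, -0.040, 0.030, -0.060, 0.020],
    "CCC": [0.020, -0.015, 0.010, -0.020, 0.015],
    "DDD": [0.004, 0.006, -0.002, 0.001, 0.003],
    "EEE": [0.030, -0.070, 0.060, -0.050, 0.040],
}

tickers = sorted(weekly_search)
x = [volatility(weekly_search[t]) for t in tickers]  # search trend volatility
y = [volatility(weekly_return[t]) for t in tickers]  # return volatility

slope, intercept, r_squared, t_stat = simple_ols(x, y)
print(f"slope={slope:.5f} intercept={intercept:.5f} R^2={r_squared:.3f} t={t_stat:.2f}")
# Assumption: 1.96 is only a reasonable cutoff for large samples such as the
# 300+ tickers in the report; with five points it is purely illustrative.
print("reject H0" if abs(t_stat) > 1.96 else "fail to reject H0")
```

With a real sample, a significant slope alongside a low R^2 would match the pattern the abstract reports: an association exists, but the simple model explains little of the variation.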

Editor: @flor14

Reviewers: Steven Lio, Chaoron Wang, Wenjia Zhu, & Nico Van den Hooff

nicovandenhooff commented 2 years ago

Data analysis review checklist

Reviewer: @nicovandenhooff

Conflict of interest

Code of Conduct

General checks

Documentation

Code quality

Reproducibility

Analysis report

Estimated hours spent reviewing:

45 minutes

Review Comments:

Overall I really liked your project, here are some things I specifically liked:

A couple of minor suggestions below:

Attribution

This was derived from the JOSE review checklist and the ROpenSci review checklist.

showcy commented 2 years ago

Data analysis review checklist

Reviewer: @showcy

Conflict of interest

Code of Conduct

General checks

Documentation

Code quality

Reproducibility

Analysis report

Estimated hours spent reviewing:

30 minutes

Review Comments:

stevenlio88 commented 2 years ago

Reviewer: @stevenlio88

Conflict of interest

Code of Conduct

General checks

Documentation

Code quality

Reproducibility

Analysis report

Estimated hours spent reviewing:

30 minutes

Review Comments:

Overall, the report is very well written, clear, and follows a good logical flow. Explanations are very thorough. Regarding the repository, the data folder should contain the raw and processed data for the analysis, but result tables were also included in that folder when they belong in the results folder. The instructions for running the scripts use generic placeholder variables instead of relative paths and actual data file names. The visualizations in the final report could be improved by increasing the font size, margins, and axis ranges.

Recommendations on model:

It would be interesting to explore more in-depth models and time series methods, such as correlated time series analysis. A plot of the time series data together with the predicted values could be useful for assessing model performance. Also, the two series may be offset by a lag: searches may come first and the price then becomes volatile, or the price may become volatile because of some news and searches follow. Such a lag, or any seasonal effect, may violate the linearity assumption of the model used.
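One simple way to probe the lag effect mentioned above is to correlate search interest at week t with return magnitude at week t + lag for a few lags. The sketch below uses only the Python standard library; the series are invented and the function names `pearson` and `lagged_correlation` are hypothetical, so treat it as an illustration rather than a recommended method.

```python
import statistics


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def lagged_correlation(search, returns, lag):
    """Correlate search[t] with returns[t + lag].

    lag > 0 means searches lead returns; lag < 0 means returns lead searches.
    """
    if lag > 0:
        return pearson(search[:-lag], returns[lag:])
    if lag < 0:
        return pearson(search[-lag:], returns[:lag])
    return pearson(search, returns)


# Invented weekly data: absolute returns echo search interest one week later.
search = [1, 3, 2, 5, 4, 6, 3, 7]
abs_returns = [0, 1, 3, 2, 5, 4, 6, 3]

for lag in (-2, -1, 0, 1, 2):
    print(lag, round(lagged_correlation(search, abs_returns, lag), 3))
```

A peak at a positive lag would suggest searches precede volatility, and a peak at a negative lag would suggest the reverse. Neither establishes causation.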

PANDASANG1231 commented 2 years ago

Data analysis review checklist

Reviewer: PANDASANG1231 

Conflict of interest

Code of Conduct

General checks

Documentation

Code quality

Reproducibility

 

Analysis report

Estimated hours spent reviewing:  45 minutes

Review Comments: 

- It is really interesting that you chose this topic. I think volatility analysis is extremely useful because the price of options and some stock alpha strategies are related to volatility.
- The EDA report and plots are clear and easy to follow. The whole report is well structured.

- The scripts could be combined so that people can reproduce the analysis more easily.
- I think it would be better to do more time series analysis. Although weekly is a good time period, it would be better to see the trend. For example, how does the relationship look when the time window is an hour, 1 day, 5 days, or a month? It would be even better to put time on the x-axis and show the trends in a single plot.
- You could also explore how Google Trends at time $t$ is related to the return at time $t+1$, $t+2$, that is, whether Google Trends is a leading or a lagging signal.
- Although Google Trends is a good angle on this topic, logically it might not be a leading signal for investment.
- I think return volatility is not the final thing people want to know. Bridging return volatility to option prices might make the conclusion and insight more compelling. The theory relating return volatility to option prices already supports this; it is just better to show it to non-technical readers. It is a simple and low-risk step that would make the conclusion even better.
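The question about different time windows could be explored with something like the sketch below, which computes volatility over consecutive non-overlapping windows of a chosen length. It uses only the Python standard library; the daily returns are invented and the helper name `window_volatility` is hypothetical.

```python
import statistics


def window_volatility(returns, window):
    """Sample standard deviation of returns in consecutive, non-overlapping
    windows of the given length. Trailing observations that do not fill a
    complete window are dropped."""
    return [
        statistics.stdev(returns[start:start + window])
        for start in range(0, len(returns) - window + 1, window)
    ]


# Invented daily returns for 20 trading days (not project data).
daily_returns = [
    0.004, -0.012, 0.007, 0.001, -0.003,
    0.015, -0.020, 0.011, -0.006, 0.002,
    0.003, 0.001, -0.002, 0.004, -0.001,
    -0.018, 0.022, -0.009, 0.013, -0.007,
]

for window in (5, 10, 20):
    vols = window_volatility(daily_returns, window)
    print(window, [round(v, 4) for v in vols])
```

Repeating the regression at each window length would show whether the association is specific to the weekly horizon.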

BooleanJulien commented 2 years ago

Thank you for all of your comments, review team! We appreciated, agreed with, and implemented many of your comments, but we will highlight a few examples of implementation for the purposes of the assignment deliverables.

From @nicovandenhooff

- I would suggest adding a section for License in your README, it could say something like "The source code for the site is licensed under the MIT license"
- In the data folder, you could further separate these files by /processed and /raw to help with the overall project structure
- Minor, but there are a few files in the repo called to_be_deleted.txt; I assume you can delete these now

Our implementation:

From @stevenlio88

Result tables were also included in the data folder when they belong in the results folder.

Our implementation

Moving regression results to the results folder https://github.com/UBC-MDS/Stock-Price-Trend-Volatility-Analysis/commit/0daad5659a758184d77978ff412f6ef4c7595897

From Eric, our TA

Suggested adding/fixing figure captions

Our implementation

We addressed this in a few commits

https://github.com/UBC-MDS/Stock-Price-Trend-Volatility-Analysis/commit/13df4e70d3eb432d0faac68dc5cb6fe22b3ea353

https://github.com/UBC-MDS/Stock-Price-Trend-Volatility-Analysis/commit/829c7ca0093369b037a950eca0175a98967de4c0