This project originally set out to study companies that had recently been hacked or otherwise made vulnerable. However, as the course progressed, the analysis shifted to the general portfolio. Furthermore, the data spans less than 1.5 years, a constraint inherited from the original data and the original project aspiration:
Specifically, the above data was merged with historical stock prices. To reduce development time, the corresponding stock data was collected and stored locally. An untested feature was also coded that allows stock prices to be collected from Quandl.
The dashboard shows the overall variance for selected companies within the portfolio. Since variance is a measure of risk, the smallest overall variance is preferred by more risk-averse investors:
However, if the general time series displays a pattern of seasonality, and a model can be trained with good predictive abilities, then high volatility provides an investment opportunity.
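As a minimal sketch of how such a variance ranking could be produced, the snippet below computes the sample variance of simple daily returns for each company and sorts the companies from least to most volatile. The tickers `AAA` and `BBB`, the price lists, and the choice of daily returns (rather than raw prices) are assumptions made for illustration, not the dashboard's actual computation.

```python
from statistics import variance


def daily_returns(prices):
    """Simple day-over-day returns from a list of closing prices."""
    return [(curr - prev) / prev for prev, curr in zip(prices, prices[1:])]


def return_variance(prices):
    """Sample variance of the daily returns, used here as the risk measure."""
    return variance(daily_returns(prices))


# Hypothetical closing prices for two made-up tickers.
closing_prices = {
    "AAA": [100.0, 101.0, 100.0, 101.0, 100.0],
    "BBB": [100.0, 110.0, 95.0, 112.0, 90.0],
}

# Rank companies from lowest to highest variance (lowest risk first).
ranked = sorted(closing_prices, key=lambda t: return_variance(closing_prices[t]))
for ticker in ranked:
    print(ticker, round(return_variance(closing_prices[ticker]), 6))
```

A risk-averse investor would read this list from the top, while an investor relying on a predictive model might look at the bottom of it for opportunities.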
Some exploratory analysis was conducted on individual company stocks. Specifically, time series plots were made, as well as autocorrelation function (ACF) and partial autocorrelation function (PACF) plots. However, later analysis focused on the collective portfolio rather than individual time series. An overall decomposed time series was generated:
The decomposition consists of the following components:
With more time, an overall ACF and PACF would be computed for the portfolio and used to determine the autoregressive (AR) and moving average (MA) orders of the ARIMA model below.
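To illustrate what an overall ACF would measure, here is a small standard-library sketch of the sample autocorrelation function. It is not the project's code; the function name `sample_acf` and the toy series are assumptions, and the PACF (which requires solving for partial correlations) is left out for brevity.

```python
def sample_acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    denom = sum(d * d for d in deviations)
    return [
        sum(deviations[t] * deviations[t - lag] for t in range(lag, n)) / denom
        for lag in range(max_lag + 1)
    ]


# Toy series that alternates every step, i.e. a period of 2.
toy_series = [1.0, -1.0] * 10
acf_values = sample_acf(toy_series, max_lag=2)
print(acf_values)
```

For the alternating toy series, the lag-1 autocorrelation is strongly negative and the lag-2 autocorrelation strongly positive, which is the kind of pattern that hints at seasonality. In an ACF plot, slowly decaying or sharply cut-off lags are what typically guide the choice of MA and AR orders.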
A generalized Pareto distribution (GPD) was fitted as a risk measure for the overall portfolio. Though some components were visually minimized, the GPD was computed for the overall opening, closing, and general volume. Moreover, the value at risk (VaR) is a measure of potential loss for a given portfolio, while the expected shortfall (ES) is the average of all losses greater than the VaR. Both measures are provided with the GPD distributions below:
Since this project made some significant simplifications, the portfolio was distributed equally (one share each) among the selected stocks. As a result, the corresponding risk measures are very small.
Note: the user interface allows different segments to be toggled. Additionally, content on the above VaR was borrowed from Professor Damodaran of the Stern School of Business at New York University.
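The definitions of VaR and ES can be made concrete with a simple empirical (historical) estimate. Note that this sketch deliberately swaps the GPD fit for plain order statistics, so it illustrates the two measures rather than the project's tail model; the function name, the loss values, and the 80% level are all hypothetical.

```python
def historical_var_es(losses, level=0.95):
    """Empirical VaR and ES from a list of losses (positive = money lost).

    VaR is the empirical quantile at `level`; ES is the mean of all
    losses strictly greater than the VaR.
    """
    ordered = sorted(losses)
    # Index of the empirical quantile, clamped to the last observation.
    idx = min(int(level * len(ordered)), len(ordered) - 1)
    var = ordered[idx]
    tail = [x for x in ordered if x > var]
    # If no loss exceeds the VaR, fall back to the VaR itself.
    es = sum(tail) / len(tail) if tail else var
    return var, es


# Hypothetical daily portfolio losses (negative values are gains).
sample_losses = [-1.2, 0.4, 2.5, -0.3, 1.1, 3.8, 0.9, -2.0, 0.2, 5.0]
var_80, es_80 = historical_var_es(sample_losses, level=0.8)
print(var_80, es_80)
```

Because ES averages only the losses beyond the VaR, it is never smaller than the VaR, which is why it is often reported alongside it as a measure of tail severity.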
A general efficient frontier was created, along with the Markowitz tangency portfolio to mark the most efficient portfolio. Individual stocks were also plotted:
A general ARIMA model was computed for the overall portfolio:
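For reference, the tangency portfolio has weights proportional to the inverse covariance matrix applied to the excess returns. The sketch below computes those weights with a small Gaussian elimination solver so that it stays within the standard library. The expected returns, covariance matrix, and zero risk-free rate are made-up inputs, short positions are allowed, and the normalization assumes the unnormalized weights sum to a positive number.

```python
def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x


def tangency_weights(mean_returns, cov, risk_free=0.0):
    """Weights proportional to inverse(cov) @ (mean - risk_free), summing to 1."""
    excess = [mu - risk_free for mu in mean_returns]
    raw = solve_linear(cov, excess)
    total = sum(raw)
    return [w / total for w in raw]


# Hypothetical two-asset inputs with uncorrelated returns.
expected_returns = [0.10, 0.05]
covariance = [[0.04, 0.0], [0.0, 0.01]]
weights = tangency_weights(expected_returns, covariance)
print(weights)
```

With these inputs, the second asset receives the larger weight: its expected return is lower, but its variance is low enough to give it the better return per unit of risk.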
A stationarity test was implemented using the augmented Dickey-Fuller test. Moreover, ACF and PACF measures provide suggestive values for the AR and MA arguments as an approach to reduce seasonal patterns. Furthermore, a general mean squared error (MSE) was computed to allow comparison with the recurrent neural network below.
A long short-term memory (LSTM) recurrent neural network was created:
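The MSE comparison can be sketched as follows. The forecast values below are invented purely to show the mechanics and are not results from this project; the variable names are likewise hypothetical.

```python
def mean_squared_error(actual, predicted):
    """Average of squared differences between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical held-out values and two hypothetical sets of forecasts.
actual_values = [10.0, 12.0, 11.0, 13.0]
arima_forecast = [10.5, 11.5, 11.0, 12.0]
lstm_forecast = [9.0, 12.0, 12.0, 13.0]

arima_mse = mean_squared_error(actual_values, arima_forecast)
lstm_mse = mean_squared_error(actual_values, lstm_forecast)
print("ARIMA MSE:", arima_mse)
print("LSTM MSE:", lstm_mse)
```

For the comparison to be meaningful, both models should be scored on the same held-out period and on the same scale (for example, both on unscaled prices).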