SuperstonkQuants / requests

A place to submit requests to the SuperStonkQuant team.

Tracking invisible hands, the search for CSOs #9

Open LeCyador opened 3 years ago

LeCyador commented 3 years ago

Link to your reddit username

LeCyador

Propose your question or request

I would like to test the following hypothesis. Synthetic CDOs (collateralized debt obligations), referred to from here on as CSOs (collateralized synthetic obligations), exist and move the market in ways that have not been observed directly. Specifically, a price move in a stock that is heavily represented across a large number of CSOs may push the market to move otherwise unrelated stocks as that change is priced into the CSOs.

I would like to use Latent Variable Analysis on the data already compiled across the "meme" stocks. The analysis would be run with varying numbers of latent variables, some representing swaps known to be present in the marketplace (plain CDOs) and some representing the synthetic swaps (CSOs), to see whether this gives insight into how many swaps might be moving in the background and affecting the price. The observed variables are price vs. time and volume vs. time. I think these are the only variables we have a solid grasp on, since this data must be pushed to the tape. Options contracts traded could perhaps also be used. The swaps trades, CDOs, and CSOs are all something of a shot in the dark, and they are what this request is looking to model.
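To make the idea concrete, here is a minimal sketch of what "finding a hidden common driver" can look like. It is not the proposed analysis itself: it uses simulated returns rather than the compiled data, it extracts only one latent factor, and it uses the leading principal component of the correlation matrix as a simple stand-in for a proper factor-analysis fit. The loadings, noise level, and function names are all assumptions made for illustration.

```python
import math
import random


def simulate_returns(n_days, loadings, noise_sd, seed=0):
    """Simulate daily returns driven by ONE hidden factor (assumed model).

    Each column j is loadings[j] * factor + independent noise, so a
    loading of 0.0 gives a column unrelated to the factor.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_days):
        factor = rng.gauss(0.0, 1.0)  # the latent driver, never observed directly
        rows.append([b * factor + rng.gauss(0.0, noise_sd) for b in loadings])
    return rows


def correlation_matrix(rows):
    """Sample correlation matrix of the columns of `rows`."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    sds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
        for j in range(k)
    ]
    corr = [[0.0] * k for _ in range(k)]
    for a in range(k):
        for b in range(k):
            cov = sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows) / (n - 1)
            corr[a][b] = cov / (sds[a] * sds[b])
    return corr


def leading_factor(corr, iterations=200):
    """Power iteration: largest eigenvalue and its eigenvector."""
    k = len(corr)
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(
        v[i] * sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)
    )
    return eigenvalue, v


# Three "entangled" columns and one unrelated column (hypothetical setup).
returns = simulate_returns(500, [0.9, 0.8, 0.7, 0.0], noise_sd=0.5, seed=42)
corr = correlation_matrix(returns)
eigenvalue, weights = leading_factor(corr)
share = eigenvalue / len(corr)  # fraction of total variance on the first factor

print("share of variance on first factor:", round(share, 3))
print("factor weights per column:", [round(w, 3) for w in weights])
```

In this toy setup the three entangled columns should pick up large weights on the first factor and the unrelated column a weight near zero. On real data the interesting question would be whether stocks with no obvious relationship load on the same factor, and how many factors are needed before the remainder looks like noise.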

I would suggest using Structural Equation Modelling to make this feasible. I have a couple of ideas on what this equation MIGHT look like, but a large portion of the work would be modelling and estimation. As a result, we are unlikely to have anything that directly verifies which stocks are entangled, but we may be able to build a forecasting model of sorts, in which a move by $GME specifically causes a move across the CSO.
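As a starting point for discussion, one possible form of such a model is written out below. This is only a sketch under stated assumptions, not a settled specification: it assumes a single latent variable, linear relationships, and error terms that are uncorrelated with each other and with the $GME return. All symbols are introduced here for illustration.

```latex
% Structural part: the latent "CSO repricing pressure" \eta_t is driven
% by the observed GME return x_t plus an unobserved disturbance \zeta_t.
\eta_t = \gamma \, x_t + \zeta_t

% Measurement part: the observed return of each other stock i loads on
% the latent variable with loading \lambda_i, plus idiosyncratic noise.
r_{i,t} = \lambda_i \, \eta_t + \varepsilon_{i,t}

% Covariances implied by the two equations above, which is what the
% fit is compared against:
\operatorname{Cov}(r_{i,t},\, x_t) = \lambda_i \, \gamma \, \sigma_x^2
\qquad
\operatorname{Cov}(r_{i,t},\, r_{j,t}) = \lambda_i \lambda_j \left( \gamma^2 \sigma_x^2 + \sigma_\zeta^2 \right), \quad i \neq j
```

Fitting would mean choosing the loadings, the coefficient on the $GME return, and the variances so that these implied covariances match the sample covariances as closely as possible. Additional latent variables (for example, one per suspected basket) would add more structural and measurement equations of the same shape.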

We could then test the model by feeding it test data and comparing its output with the real-world data we observe. I know some people are also searching for the swaps data itself; if and when we get it, we could validate the model against it.

Describe alternatives you've considered

I have tried to find data on the swaps themselves, their existence, or their availability, but I did not have access to it. I went through Citadel Securities LLC's Form X-17A-5 (https://sec.report/Document/0001616344-21-000004/) searching for them, but found nothing to suggest they exist other than the large amount of "Securities sold, not yet purchased, at fair value", which could represent the contents of these swaps. I also examined https://www.cftc.gov/sites/default/files/idc/groups/public/@lrfederalregister/documents/file/2012-18003a.pdf for the definition of these swaps, to get a firm legal handle on exactly what they could be.

I thought about using Kalman filtering to remove spurious data, but since we don't know exactly which "sensor data" is trustworthy, Latent Variable Analysis could be a better approach.
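For reference, this is roughly what the Kalman filtering alternative would involve in its simplest form. The sketch below is a one-dimensional filter that assumes the hidden level follows a random walk and is observed with noise. The variances and the simulated series are made-up illustration values, and picking `measurement_var` is exactly the "which sensor do we trust" question raised above.

```python
import random


def kalman_1d(observations, process_var, measurement_var,
              initial_estimate=0.0, initial_var=1.0):
    """Scalar Kalman filter for a random-walk state observed with noise.

    Assumed model: state_t = state_{t-1} + w_t,  obs_t = state_t + v_t,
    with Var(w_t) = process_var and Var(v_t) = measurement_var.
    """
    estimate, var = initial_estimate, initial_var
    filtered = []
    for z in observations:
        # Predict: the state may have drifted, so uncertainty grows.
        var += process_var
        # Update: blend prediction and observation according to the gain.
        gain = var / (var + measurement_var)
        estimate += gain * (z - estimate)
        var *= (1.0 - gain)
        filtered.append(estimate)
    return filtered


# Hypothetical data: a constant true level observed with unit-variance noise.
rng = random.Random(1)
true_level = 5.0
raw = [true_level + rng.gauss(0.0, 1.0) for _ in range(100)]
filtered = kalman_1d(raw, process_var=1e-4, measurement_var=1.0)


def mean_abs_error(values):
    """Average absolute distance from the known true level."""
    return sum(abs(v - true_level) for v in values) / len(values)


# Compare the second half only, after the filter has had time to settle.
print("raw error:     ", round(mean_abs_error(raw[50:]), 3))
print("filtered error:", round(mean_abs_error(filtered[50:]), 3))
```

The filter only works this well because the noise variance is known in the simulation. With market data we would have to guess it for every series, which is the reason for preferring the latent variable approach.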

Additional context

As the Superstonk quants have already gathered a lot of data on related stocks and searched it for correlations, I think that same data could be used to find this underlying pattern. I realize this will be difficult, but it could help provide some estimates for the "in the darkness" problems we are looking at, similar to 2008, when synthetic CDOs were the biggest issue. I am present in the Discord and can easily be pinged through my username. A quick intro to SEM can be found here: http://faculty.cas.usf.edu/mbrannick/regression/SEM.html#:~:text=The%20main%20difference%20between%20the%20two%20types%20of,%28a.k.a.%20cognitive%20ability%29%2C%20Type%20A%20personality%2C%20and%20depression.

The first main challenge I see is creating the equation, then applying the gathered data to test it, and then probably a lot of massaging of the equation until it fits in a representative way. This will likely be an iterative process, with multiple equations falling by the wayside because they don't fully capture the relationships that exist.

Quick Summary

Using Latent Variable Analysis with Structural Equation Modelling to find evidence of CSOs affecting the prices of baskets of stocks across the market.