Open ethanprihar opened 5 years ago
Here are my results:
I have 10 funky graphs with pairwise comparison results. I pushed them in a folder to the main page because it would be silly to put them all here.
Looks great everyone!!! Tomorrow morning I’ll have some model studies pushed
From: mbarger1 notifications@github.com Sent: Friday, October 18, 2019 8:58:53 PM To: moorea1/DS501_Case2 DS501_Case2@noreply.github.com Cc: Subscribed subscribed@noreply.github.com Subject: [EXT] Re: [moorea1/DS501_Case2] Result Tracking (#2)
I have the framework for my exploration, where I can take in a folder full of data sets and handle each one individually.
The problem is I don't know what's worth exploring: fitting models on some of the columns (Min, Max, Closing) and predicting Volume traded that week?
Maybe predicting next week's Min, Max, and Closing given the historical data? (Then looking at the coefficient certainties to draw conclusions about the relationship of predictor to label.)
Not sure what's interesting. If anyone has inspiration or a guiding idea, I can implement it and post the results.
Honestly, both of those ideas sound pretty good. If it isn't too difficult, feel free to do both. If it's tricky, I'd go for predicting the volume traded, because a naive method for predicting price won't give great results.
This is a heatmap of the correlation in weekly % gain in closing price for the 30 Dow stocks. Now I'll find which have the lowest variance and highest mean.
These are the highest mean percent increase and the lowest standard deviation of percent increase for the last 10 years. In Stonks, you want something with a high mean increase (obviously) and low volatility. I'm sure this is a common measure, but we introduce the Quincy Stat ^TM: mean / sd
We find 'HD', The Home Depot, to maximize this statistic. Comment your concerns or suggestions (or compliments)!
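A minimal sketch of the mean / sd statistic described above, for anyone who wants to reproduce it. The function name `mean_over_sd` and the toy prices are made up for illustration; the real input would be a DataFrame of weekly closing prices with one column per ticker.

```python
import pandas as pd

def mean_over_sd(prices: pd.DataFrame) -> pd.Series:
    """Rank tickers by mean / standard deviation of period-over-period % gain."""
    pct = prices.pct_change().dropna() * 100.0   # % gain per period, per ticker
    stat = pct.mean() / pct.std()                # pandas uses the sample sd (ddof=1)
    return stat.sort_values(ascending=False)

# Toy closing prices (made-up numbers, not the project's data).
toy_prices = pd.DataFrame({
    "AAA": [100.0, 102.0, 104.0, 107.0, 109.0],   # steady gains
    "BBB": [100.0, 110.0, 95.0, 115.0, 105.0],    # volatile
})
ranking = mean_over_sd(toy_prices)
print(ranking)
```

The ticker at the top of `ranking` is the one that maximizes the statistic. With the toy data that is the steadier series, `AAA`.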
I'll have more results (taking the Dow Inc info out of consideration) tomorrow morning! It's taking longer than I thought to put together.
Still working on it guys. Hey Mia, do you have the code you used to pull the pricing?
https://finance.yahoo.com/quote/DIS/history?p=DIS
We pulled the pricing by manually going to each company's page on Yahoo Finance, clicking "historical pricing" and selecting all available data, weekly. This is what's in the 'data' folder you can download from this git. If you need monthly instead of weekly, I'd just take the mean of each 4 rows instead of re-downloading by hand.
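The "mean of each 4 rows" idea could look roughly like this. It is only a sketch: `mean_every_n_rows` is a hypothetical helper, the toy numbers are invented, and 4-week blocks only approximate calendar months.

```python
import numpy as np
import pandas as pd

def mean_every_n_rows(weekly: pd.DataFrame, n: int = 4) -> pd.DataFrame:
    """Average each consecutive block of n rows (numeric columns only)."""
    numeric = weekly.select_dtypes("number")
    block_id = np.arange(len(numeric)) // n      # 0,0,0,0,1,1,1,1,2,...
    return numeric.groupby(block_id).mean()

# Toy weekly closes (made-up numbers); 9 rows give two full blocks and one partial.
toy_weekly = pd.DataFrame(
    {"Close": [10.0, 12.0, 14.0, 16.0, 20.0, 22.0, 24.0, 26.0, 30.0]}
)
approx_monthly = mean_every_n_rows(toy_weekly)
print(approx_monthly)
```

Note that a trailing partial block (here the single last row) is averaged over fewer rows, so it may be worth dropping before analysis.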
Ok, no worries, I was just curious if anyone had used the quantopian pipeline interface. Do we care about the source?
Ethan has code in ethan.ipnb where he draws from quantopian for gold prices. I also saw samples on their website. I'd say we don't care about the source, just whichever is most convenient for the goals of the sub-project.
Ok, I'm going to mess with quantopian a bit more tonight and then hopefully have something pulled together tomorrow.
Does anyone have a simple way to loop through the list of DJIA CSVs and pull just the closing price column from each, named as the ticker? If not, I can just pull it off of a terminal at work, but it would probably be better if we could combine the dataset into one CSV like that.
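One possible way to do this, as a sketch under assumptions: each file is named `<TICKER>.csv` and has `Date` and `Close` columns (the layout of a Yahoo Finance historical download). The helper name `combine_closes` and the demo files are hypothetical.

```python
import tempfile
from pathlib import Path

import pandas as pd

def combine_closes(folder, column="Close"):
    """Build one DataFrame of closing prices, one column per ticker."""
    series = {}
    for path in sorted(Path(folder).glob("*.csv")):
        ticker = path.stem                       # assumes files are named <TICKER>.csv
        frame = pd.read_csv(path, parse_dates=["Date"], index_col="Date")
        series[ticker] = frame[column]
    # Building from a dict of Series aligns on Date; missing dates become NaN.
    return pd.DataFrame(series).sort_index()

# Demo with two tiny made-up files written to a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "HD.csv").write_text("Date,Close\n2019-10-07,230.0\n2019-10-14,235.0\n")
    Path(tmp, "DIS.csv").write_text("Date,Close\n2019-10-07,130.0\n2019-10-14,131.0\n")
    combined = combine_closes(tmp)

print(combined)
```

Calling `combined.to_csv(...)` afterwards would give the single combined CSV.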
I had to temporarily throw in the towel on figuring out the quantopian system and just pulled pricing data out of my Bloomberg Terminal Access for the sake of us actually having something to present. I uploaded the CSVs, which should be easier for you guys to work with: one is monthly and one is weekly. There's a snippet of two lines of code in the comments that pulls each one into a dataframe (assumes same folder) and drops V and DOW, since V price data stops prior to 2008 and DOW is also incomplete.
These datasets only go back to 2000. I can go back further pretty easily later if need be; I was just trying to get us on the Quantopian platform, but it's really confusing to me. Right now my goal is to wrap up some of the things we talked about by class. If you guys want, feel free to play with these datasets. Alexander, is it possible to rerun your correlation matrix on both of these datasets?
import numpy as np
import pandas as pd

data_weekly = pd.read_csv('data_weekly.csv')
data_weekly.drop(['V', 'DOW'], axis=1, inplace=True)

data_monthly = pd.read_csv('data_monthly.csv')
data_monthly.drop(['V', 'DOW'], axis=1, inplace=True)
I'm working on the optimization. I updated the two CSV files, and here's some simple code to work with; the log frames hold the log returns:
import numpy as np
import pandas as pd
import datetime as dt

# Load weekly prices, drop the incomplete V and DOW columns, index by date
data_weekly = pd.read_csv('data_weekly.csv')
data_weekly.drop(['V', 'DOW'], axis=1, inplace=True)
data_weekly.Date = pd.to_datetime(data_weekly.Date)
data_weekly.set_index('Date', inplace=True)

# Same steps for the monthly prices
data_monthly = pd.read_csv('data_monthly.csv')
data_monthly.drop(['V', 'DOW'], axis=1, inplace=True)
data_monthly.Date = pd.to_datetime(data_monthly.Date)
data_monthly.set_index('Date', inplace=True)

# Log returns: log(P_t / P_{t-1}); the first row is NaN, so drop it
log_weekly = np.log(data_weekly / data_weekly.shift(1))
log_weekly = log_weekly.iloc[1:]

log_monthly = np.log(data_monthly / data_monthly.shift(1))
log_monthly = log_monthly.iloc[1:]

# Number of weekly return rows
print(len(log_weekly))
Making progress
That was 5 random stocks and this is 30, so I think the next step may be trying to select different combinations of like 5 using some form of methodology.
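As one possible starting point for picking combinations, here is a brute-force sketch. The scoring rule (mean / sd of an equal-weight portfolio's returns) and the helper name `best_subset` are assumptions for illustration, not a method anyone in the thread settled on, and the returns are random toy data.

```python
import itertools
import math

import numpy as np
import pandas as pd

def best_subset(returns: pd.DataFrame, k: int):
    """Score every k-ticker, equal-weight portfolio by mean / sd of its returns."""
    best_score, best_combo = -np.inf, None
    for combo in itertools.combinations(returns.columns, k):
        portfolio = returns[list(combo)].mean(axis=1)   # equal weights
        score = portfolio.mean() / portfolio.std()
        if score > best_score:
            best_score, best_combo = score, combo
    return best_combo, best_score

# Toy returns: 200 periods, 6 made-up tickers.
rng = np.random.default_rng(0)
toy_returns = pd.DataFrame(
    rng.normal(0.002, 0.02, size=(200, 6)), columns=list("ABCDEF")
)
combo, score = best_subset(toy_returns, k=3)
print(combo, round(score, 3))

# Number of 5-stock subsets of 30 tickers, i.e. how many the loop would visit.
print(math.comb(30, 5))  # 142506
```

Exhaustive search over 142,506 subsets is feasible but slow with pandas in the loop, and picking the best in-sample subset risks overfitting, so a held-out period would be a sensible check.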
My new comparison graphs are up! I excluded the DOW stocks. I used the different individual stock CSVs because wrangling my code to work for the aggregate weekly and monthly data would take me a while--I'm going to work on that more tonight and tomorrow, but I figured it's better to post my results sooner rather than later.
Quincy, is the information in the monthly and weekly data the opening or closing price for that month? Wait, never mind, I see the dates.
I pushed a version of the moving window code. It has samples for how the Chi matrix is made, how the Corr matrix is made, and how we can make a list of how market correlations change through time (according to different window sizes).
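For readers without the pushed code, a rough sketch of the moving-window correlation idea follows. It is not the pushed implementation: `windowed_mean_corr` is a hypothetical helper, the summary (mean off-diagonal correlation per window) is an assumption, and the Chi matrix is left out because its definition is not given in this thread.

```python
import numpy as np
import pandas as pd

def windowed_mean_corr(returns: pd.DataFrame, window: int) -> pd.Series:
    """Mean pairwise correlation inside each sliding window of `window` rows."""
    n = returns.shape[1]
    off_diag = ~np.eye(n, dtype=bool)            # mask out the diagonal of 1s
    out = {}
    for end in range(window, len(returns) + 1):
        chunk = returns.iloc[end - window:end]
        corr = chunk.corr().to_numpy()           # Corr matrix for this window
        out[returns.index[end - 1]] = corr[off_diag].mean()
    return pd.Series(out)

# Toy returns: 60 periods, 4 made-up tickers.
rng = np.random.default_rng(1)
toy = pd.DataFrame(rng.normal(size=(60, 4)), columns=["A", "B", "C", "D"])

# Compare how the series looks for different window sizes.
results = {w: windowed_mean_corr(toy, w) for w in (10, 20)}
for w, series in results.items():
    print(w, len(series), round(series.mean(), 3))
```

Each window size yields `len(returns) - window + 1` values, indexed by the last row of the window.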
New results are up
I put up a rough draft of the PowerPoint (if you guys hate the humor I can remove it). I am still going to incorporate Mia's work and clean up some of my plots, and I haven't incorporated the latest stuff from Alex yet. I am also going to continue making changes to the Jupyter notebook (the one called case2_Main). I've done a lot of work on the stuff from the blog at this point, so we can safely call it our own version (everyone stole it off Markowitz, 1952 anyways). I may ultimately try doing something in Quantopian later, but I feel like we have a fair amount of content as it is and are pretty close to having a final product. If anyone wants to take over part of the presenting, just speak up and I can help you with any parts you aren't sure about.
I added two brief slides that go over my work in ethanslides.pptx
Is there any sort of data that summarizes the performance from the neural network approach?
I may do a quick alteration on the consistent gain approach by splitting it across a training and test set if that's ok.
Are you guys ok with a quick meeting around noon tomorrow to review the final slides and any changes for like 10-15 minutes before we submit them?
Ethan and Mia, would you prefer all of your slides be included or if some later get dropped in the interest of space and time is that ok?
I've pushed 2 slides with my stuff. If they don't end up making it in there, it's no big deal. And yeah, tomorrow at 12 works!
Ok, well if tomorrow around noon works, I'll plan on meeting whoever can make it on the 3rd floor of Fuller, just to lock down any details.
I'll be there tomorrow at noon
This issue is for us to post some images of our results, ask questions, and give feedback.