Open mbarger1 opened 4 years ago
If I project onto 3 dimensions I get way more interesting results. The two manifolds on the right are 3D but the two on the left are 2D. The top left is really made of outliers (you can see some details in the screenshot) but the one on the far left has a more normal distribution. Haven't figured out why it's 2D though.
Those are extremely cool visualizations. What did you make them with? What is the goal of the clustering, and what would a cluster mean? Did you try any regression models on the reduced-dimension data? If not, can you point me to where I can jump into your code to model on the data?
I used Kepler Mapper to make them. It projects the data points down to a specified dimension (in my code, it's dimension 3), then clusters them with a specified algorithm (I think right now it's using DBSCAN or KNN, I don't remember off the top of my head). Then it takes the clusters and constructs a simplicial complex around them. It's basically a scaffold showing you an approximation of the manifold they lie on. The clusters are just showing you which data points are most similar. I haven't gotten to predictions yet, but I should be able to work it in tomorrow--the clusters are made with training data right now, so I'll be able to fiddle with them for predictions in a bit.
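For anyone curious how the Mapper idea behind this works, here is a toy sketch in plain Python. It is not Kepler Mapper's implementation. The lens (the x coordinate), the overlapping interval cover, the single-linkage clustering, and every name (`toy_mapper`, `interval_cover`, `cluster`, `eps`) are assumptions made for illustration. It covers the projected values with overlapping intervals, clusters the points that fall in each interval, and links two clusters whenever they share a point.

```python
import math
from itertools import combinations


def interval_cover(lo, hi, n_intervals, overlap):
    """Split [lo, hi] into equal intervals, each widened by `overlap` (a fraction of its length)."""
    length = (hi - lo) / n_intervals
    pad = length * overlap
    return [(lo + i * length - pad, lo + (i + 1) * length + pad)
            for i in range(n_intervals)]


def cluster(indices, points, eps):
    """Single-linkage clustering: chains of points at most eps apart share a cluster."""
    clusters, unvisited = [], set(indices)
    while unvisited:
        stack = [unvisited.pop()]
        members = set(stack)
        while stack:
            p = stack.pop()
            near = {q for q in unvisited if math.dist(points[p], points[q]) <= eps}
            unvisited -= near
            members |= near
            stack.extend(near)
        clusters.append(frozenset(members))
    return clusters


def toy_mapper(points, lens, n_intervals=4, overlap=0.25, eps=0.5):
    """Return (nodes, edges): nodes are clusters of point indices, edges join overlapping clusters."""
    values = [lens(p) for p in points]
    nodes = []
    for lo, hi in interval_cover(min(values), max(values), n_intervals, overlap):
        in_bin = [i for i, v in enumerate(values) if lo <= v <= hi]
        nodes.extend(cluster(in_bin, points, eps))
    edges = [(a, b) for a, b in combinations(range(len(nodes)), 2)
             if nodes[a] & nodes[b]]
    return nodes, edges


# Synthetic example: 16 points on a circle, with the x coordinate as the lens.
circle = [(math.cos(2 * math.pi * k / 16), math.sin(2 * math.pi * k / 16))
          for k in range(16)]
nodes, edges = toy_mapper(circle, lens=lambda p: p[0])
print(len(nodes), "clusters,", len(edges), "links")
```

On the circle, the clusters should link up into a closed loop, which is the sense in which the graph is a scaffold of the shape the points lie on.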
I haven't done any regression on the projected data yet, so feel free to go for it! In my Jupyter notebook, there's a part where it says "projected_data = mapper.fit_transform(train, projection=[0,1,2])". The "projection=[0,1,2]" part is where I specify to project onto the first 3 dimensions. You should be able to run some regression models on that. Feel free to tweak the dimension it gets projected to, as well. I haven't gotten to that part yet.
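A minimal sketch of what a regression on the projected data could look like, assuming NumPy is available. Everything here is a stand-in: `train` and `yards` are synthetic, the column slice only imitates what `projection=[0,1,2]` selects (Kepler Mapper may also rescale the columns, which is skipped here), and `fit_linear` / `predict_linear` are hypothetical helper names.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for the notebook's training features and the yards target.
train = rng.normal(size=(200, 8))
yards = (3.0 * train[:, 0] - 2.0 * train[:, 1] + 0.5 * train[:, 2]
         + rng.normal(scale=0.1, size=200))

# Stand-in for mapper.fit_transform(train, projection=[0, 1, 2]):
# keep only the first three columns.
projected_data = train[:, [0, 1, 2]]


def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns [intercept, w1, w2, ...]."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef


def predict_linear(coef, X):
    return coef[0] + X @ coef[1:]


# Hold out the last 25% of rows to check the fit.
split = int(0.75 * len(projected_data))
coef = fit_linear(projected_data[:split], yards[:split])
pred = predict_linear(coef, projected_data[split:])
mse = float(np.mean((pred - yards[split:]) ** 2))
print(f"held-out MSE: {mse:.4f}")
```

Changing the list of column indices is the analogue of tweaking the projection dimension.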
Ignore the comments. Initially I took the code from the Kepler Mapper library, but I've been tweaking things and totally forgot to change the comments. I'll fix that in the next version I push.
OK thanks, I'll check it out. I'm doing some feature engineering stuff now. Does anyone have any ideas for how to encode that the first 11 rows are Offense and the second 11 rows are Defense, and which direction the play is moving? The only thing I really see is 'PlayDirection'.
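One possible encoding, as a rough sketch rather than a tested recipe: add an offense flag from the row order described above, and mirror the coordinates so every play runs the same way. The field dimensions, the 'left'/'right' values of 'PlayDirection', the 180-degree shift of 'Dir', and the names `encode_play`, `IsOffense`, `X_std`, `Y_std`, and `Dir_std` are all assumptions to check against the data.

```python
FIELD_LENGTH = 120.0      # assumed X range in yards, end zones included
FIELD_WIDTH = 160.0 / 3   # assumed Y range in yards (about 53.3)


def encode_play(rows):
    """rows: the 22 player rows of one play, offense first (the assumption from this thread)."""
    encoded = []
    for i, row in enumerate(rows):
        moving_left = row["PlayDirection"] == "left"
        encoded.append({
            **row,
            # First 11 rows are flagged as offense, the remaining 11 as defense.
            "IsOffense": 1 if i < 11 else 0,
            # Mirror positions so the play always moves toward increasing X.
            "X_std": FIELD_LENGTH - row["X"] if moving_left else row["X"],
            "Y_std": FIELD_WIDTH - row["Y"] if moving_left else row["Y"],
            # Rotate the movement angle by 180 degrees to match the mirrored field.
            "Dir_std": (row["Dir"] + 180.0) % 360.0 if moving_left else row["Dir"],
        })
    return encoded


# Synthetic play: 22 identical rows, moving left.
example = [{"X": 30.0, "Y": 10.0, "Dir": 90.0, "PlayDirection": "left"}
           for _ in range(22)]
encoded = encode_play(example)
print(encoded[0]["IsOffense"], encoded[11]["IsOffense"], encoded[0]["X_std"])
```

After this, direction no longer needs its own feature, because every play is expressed in the same frame.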
That's the average of yards gained during a play when a person was standing in that position relative to the runner.
The code for that is in version_3, feel free to use it.
^ Love that. Ethan, I'm doing that preprocessing thing now to turn 'X', 'Y', 'S', 'A', 'Dis', 'Orientation', 'Dir', 'PlayerWeight' into a usable single feature, hopefully.
OK OK OK so with a better version of that picture from before, and some finagling, I was able to create that data from 75% of the data and then use the remaining 25% to predict yards and it performed WAY better than the neural network.
Neural network: MSE: 55.0086
My way: MSE: 40.7631
Next Steps: see if adding this into the neural net makes it better and then try doing this for the other kinetic features in a more fancy way
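For readers following along, here is a rough sketch of how a position-average predictor with a 75/25 split could be set up. This is a guess at the general idea, not the code in version_3. The grid cell size, the rule for combining cells into one prediction (a plain mean), the synthetic plays, and the names `fit_position_means`, `predict`, and `mse` are all assumptions.

```python
import random
from collections import defaultdict

CELL = 2.0  # grid cell size in yards (hypothetical choice)


def cell_of(dx, dy):
    """Grid cell containing an offset (dx, dy) measured relative to the runner."""
    return (int(dx // CELL), int(dy // CELL))


def fit_position_means(plays):
    """plays: list of (offsets, yards). Returns per-cell average yards and the overall average."""
    totals = defaultdict(lambda: [0.0, 0])
    for offsets, yards in plays:
        for dx, dy in offsets:
            entry = totals[cell_of(dx, dy)]
            entry[0] += yards
            entry[1] += 1
    cell_means = {c: s / n for c, (s, n) in totals.items()}
    global_mean = sum(y for _, y in plays) / len(plays)
    return cell_means, global_mean


def predict(offsets, cell_means, global_mean):
    """Average the cell means of every player; unseen cells fall back to the overall average."""
    vals = [cell_means.get(cell_of(dx, dy), global_mean) for dx, dy in offsets]
    return sum(vals) / len(vals)


def mse(plays, cell_means, global_mean):
    return sum((predict(o, cell_means, global_mean) - y) ** 2
               for o, y in plays) / len(plays)


# Synthetic plays: three tracked players each, yards loosely tied to their offsets.
rng = random.Random(0)
plays = []
for _ in range(1000):
    offsets = [(rng.uniform(-10, 10), rng.uniform(-10, 10)) for _ in range(3)]
    yards = 4.0 + 0.3 * sum(dx for dx, _ in offsets) + rng.gauss(0, 0.5)
    plays.append((offsets, yards))

train, test = plays[:750], plays[750:]  # 75% to build the averages, 25% to score
cell_means, global_mean = fit_position_means(train)
model_mse = mse(test, cell_means, global_mean)
baseline_mse = mse(test, {}, global_mean)  # always predicts the overall average
print(f"position averages: {model_mse:.3f}  overall average: {baseline_mse:.3f}")
```

The numbers this prints come from synthetic data and say nothing about the MSE values reported above. The cell averages could also be fed to the neural net as an extra input feature.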
Can we meet and talk after 548? I wanted to talk about my preprocessing thing, not sure how to handle some stuff
Made a Google doc for the report. I sent links to the emails everyone's using for the slides, but I'll throw the link here too. Let me know if there are any issues. https://docs.google.com/document/d/1CMgzXzDBJv7rrKRCih6o1P69S3KUbxROV17KlEtW_1E/edit?usp=sharing
Once we have it written up in the doc, I can transfer it over to a LaTeX document if we want to use pretty formatting and include math in it.
Is anybody interested in meeting today around noon or whenever to review any final changes to the PPT deck?
I can meet at 12 in the library
That works for me
Hey Mia, Alex said he can't make it and I don't know if Ethan can either. Let's still meet quickly for like five minutes in the library at noon to lock down any details and get a submission package in on Canvas, if that works for you.
Sounds good. I'm just hanging out by the help desk until 1ish anyway.
501 is submitted
Thanks Mia!
Does anyone feel strongly about including/not including the Voronoi diagram slides in this 501 presentation? (Rather than saving them for 502)
I've been second-guessing myself all day :shrug:
Haha, I'm fine with either one. I don't think leaving it out will jeopardize our grade, so it's not a big deal either way. I do think it makes for a more entertaining presentation with it in, though.
I think presenting it makes for a more fun presentation, but I also pulled an all-nighter, so not having to say anything during the presentation is also very attractive to me, haha. Win-win.
No strong feelings. The one we submitted has your Voronoi diagram, Alex, but not Quincy's. I think we have enough of a presentation for tonight with all that. Then we can go into depth with Quincy's diagrams on Tuesday.
My Voronoi diagram? There wasn't one when we made the presentation last evening?
You know what, I'm wrong on all counts. The Voronoi diagram I was thinking of was the one from here https://www.kaggle.com/statsbymichaellopez/nfl-tracking-wrangling-voronoi-and-sonars and we took that out. I'm just sleep deprived and thinking of an older version.
Guys I got my TDA working on the nn_input data Ethan put together!!! I'm going to need more time to interpret it but I've got some lovely clustering in my simplicial complex going on right now. Check it out.