Summary

Kevin's project investigated a subject very dear to me and countless others on the planet: video games and their sales. He looked at what actually causes a game to sell better than others, whether it was the rating, number of reviewers, critic and user scores, or genre, to name a few. He found an interesting correlation between a video game's sales and the number of reviews written about it, rather than the review scores themselves (this was very interesting to see). He states that the way to operationalize this data is to take it into the field and cross-check against currently selling games to find correlations.

Data Preparation

The tables that were developed are: User Ratings, Critic Ratings, and Date. I unfortunately cannot see them. It does look like some work has been done to make the tables tidy, however. The NAs are all accounted for and unneeded data is trimmed. I think pushing the data to the repository will definitely clean it up and make it easier to see, though!

Modeling

The model built in this project predicts Global Sales from the user reviews, critic reviews, number of users, and number of critics. The model is very informative and shows exactly what I was curious about. The model summary is definitely interpreted, and the color-coordination helps! Kevin also explains how the model is built in detail right before he creates it. I am going to try the same approach on my future models.

Validation

The model has been cross-validated using testing and training sets. I think there could be a little more of an introduction to what was being cross-validated, so that the reader can better understand what is being looked at, and why, so early on in the project. Other than that, though, it looks good! You might want to set a seed, though. I didn't see if you did, but I might have missed it.
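For what it's worth, here is a minimal sketch of what I mean by setting a seed before the train/test split. The data frame and its column names are made up for illustration (I can't see the real tables), and the seed value and the 75/25 split are assumptions on my part, not necessarily what Kevin used.

```r
# Hypothetical stand-in data; these column names are assumptions,
# not the actual tables from the project.
set.seed(127)  # fix the random number generator so every run gives the same split

games <- data.frame(
  Global_Sales = runif(100, 0, 10),
  Critic_Score = runif(100, 0, 100),
  User_Score   = runif(100, 0, 10),
  Critic_Count = sample(1:100, 100, replace = TRUE),
  User_Count   = sample(1:1000, 100, replace = TRUE)
)

# Randomly pick 75% of the rows for training; the rest are held out for testing.
train_idx <- sample(seq_len(nrow(games)), size = floor(0.75 * nrow(games)))
train <- games[train_idx, ]
test  <- games[-train_idx, ]

# Fit on the training rows only, then score the held-out rows.
model <- lm(Global_Sales ~ Critic_Score + User_Score + Critic_Count + User_Count,
            data = train)
pred <- predict(model, newdata = test)

# Root mean squared error on the test set.
rmse <- sqrt(mean((test$Global_Sales - pred)^2))
```

With the seed in place, knitting the document twice should give the same split and the same test error, which makes the validation numbers easier to talk about.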

R Proficiency

The code looks very good! I am an R novice myself, so I don't know all the ins and outs; instead I look at the reasoning behind the code to understand it. I think a little more explanation before each chunk, stating what is going into it, would make it easier for someone like me to follow. That being said, the code looks very good to me! I saw a left join in there, so it is serious! One thing, though: to get the third deliverable to knit, you can rename the chunks so that no two share the same name. (I was having this issue too.)

Communication

The portfolio is well described, and I didn't have an issue understanding it. There are a few areas where spelling mistakes made words look different, but it didn't detract from the understanding. I couldn't see most of the visualizations because the deliverables are not up on the website. Other than that, though, it was a very good read!

Critical Thinking

There is definite critical thought in this project, especially surrounding the model and why the number of reviews of a game would imply higher sales (a more popular game, I hadn't thought of that!). The operationalization was a little short. I saw a future for the project, but is there more? It looks really good! Maybe it can be used to help a company decide whether to put more money into a popular game in development, on the idea that it will sell well?

Data Preparation and Modeling (18 out of 20%)

Overall, I think I did a good job with the data collection and modeling for the portfolio. Each dataset seems easy to read, and I left enough detail about what each variable means and stands for. The modeling section is not bad: I feel I have a fairly understandable model thanks to the clean data, but there could be a little more depth in the model, so losing a few points there is understandable.

Validation and Operationalization (15 out of 20%)

I agree with the evaluation given in the issue. Overall, I could have gone more in depth and laid out the process, explaining each step rather than going straight to the final result. That way a newcomer could get an understanding of how I got my answer, but I think the final result is still fairly good.

R Proficiency (13 out of 20%)

I graded myself harshly on this because of the buggy errors at the end when compiling the final result. Even though the final code overall is good, compiling it into HTML is important, and a bit of cleanup would also have helped. That said, a lot of the code that didn't work won't break the overall program.

Communication (18 out of 20%)

I got a really positive review on communication, so overall I think it is good that everything was understandable and people can keep track of what is happening. I will acknowledge that the grammar and spelling are not the cleanest and that more time could be put into them.

Critical Thinking (** of 20%)

The critical thinking was also positively received, and a lot of thought was put into it. I think the suggestion that this kind of data can be used by a company is not a bad idea, but it also brings with it its own ethical dilemma.

introdsci / DataScience-KevinC127

Final Review #6