Summary

In this project, Andrew chose to analyse several aspects of mobile games. In part one, he presented visualizations showing how in-app purchase prices vary, along with the relationship between a game's rating and its size in bytes. In the second part, he focused on the correlation between a game's review score and its content rating, which became the main focus of the project. He found only a very small correlation.

Data Preparation

Andrew tidied up the data from his datasets. Most of them contain a plethora of data on a game's price, its in-app purchase prices, its description, and its developer, among other things. Without a doubt, the information is tidy and well organized. It has also been thoroughly cleaned.

Modeling

In the models, he tested the correlation between a game's review score and its age rating, with the age rating as the dependent variable. He found a low correlation between the two, and explained well why that may be the case. My only criticism is that it was somewhat hard to understand what was being visualized in the model's graph. A description of what it shows, along with an explanation of why it is important, could help.

Validation

The project uses testing and training sets to create its model, but I cannot see any instances of cross-validation.

R Proficiency

I believe he demonstrated his understanding of R well enough. Back in deliverable 3, he made many revisions that cleaned up how his R code was printed. In addition, the code is commented very well, which aids understanding.

Communication

He has communicated what he does within his R code very well; however, his graphs are somewhat confusing. There doesn't seem to be a strong focus on what he wants to find with his data, and as a result there are numerous graphs illustrating a variety of separate observations. I believe this focus on multiple observations is fine, but each graph needs a fuller description.

Critical Thinking

Andrew has demonstrated an incredibly strong sense of critical thinking. He discusses the potential issues he foresees thoroughly. In addition, he has a strong discussion of his project's operationalization and unintended consequences. He also gave a good description of what he could do further with his project.

(note: will sometimes reference past reviews like https://github.com/anguyen62/anguyen62.github.io/issues/4)

Data Preparation and Modeling (19% out of 20%)

Data and tables were tidied and prepared quite nicely according to peer and personal review. My modeling skills could use work, especially in visualization and explanation, but otherwise I think I have the meat of the modeling done nicely.

Validation and Operationalization (15% out of 20%)

I believe I covered operationalization sufficiently as a topic, though I could provide actual instances of cross-validation, as Aaron said.

R Proficiency (20% out of 20%)

I believe my R coding is effective, standard, and sufficient. Past and present peer reviews, my past revisions, and my code commenting also support this.

Communication (16% out of 20%)

My visualizations could use work; I should work on explaining what exactly readers are looking at and on describing the intended visual message of each finding. Graphing overall is an ongoing refinement project for me. Otherwise, I believe I explained my intentions and overall portfolio sufficiently.

Critical Thinking (20% out of 20%)

I believe my critical thinking and explanation regarding my results and the topic exceed the average. Present and past peer reviews also support this.

Final Review #3