Summary

This project investigated atmospheric emissions and found correlations between the different types of emissions. It also outlines what should be done next and discusses the ethical implications of policy changes that affect large companies. The proposed next steps are to educate people further on the topic and to keep developing newer "green" technology. It also considers the perspective of those who would not be in favor of these policy changes.

Data Preparation

Data tables were created for greenhouse gas emissions by sector and for other emission rates. The data is tidy and clean, and the portfolio is well organized.

Modeling

A predictive model for energy emissions is built using predictor variables such as Waste, Agricultural, and Industrial. The purpose of the model is explained, but there is no interpretation of the model's results in p3. Run summary(train_model), then interpret the results.
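As a rough illustration of that suggestion, the sketch below fits a model and pulls out the parts of its summary that are worth discussing. The `emissions` data frame, its simulated values, and the use of `lm()` are assumptions made for this example only; the portfolio's actual `train_model` may be built differently.

```r
# Hypothetical data: simulated sector emissions, not the portfolio's dataset.
set.seed(42)
n <- 100
emissions <- data.frame(
  Waste        = runif(n, 1, 10),
  Agricultural = runif(n, 5, 50),
  Industrial   = runif(n, 10, 100)
)
# Simulated response so the example is self-contained.
emissions$Energy <- 20 + 1.5 * emissions$Waste + 0.8 * emissions$Industrial +
  rnorm(n, sd = 5)

# Stand-in for the portfolio's train_model (assumed here to be a linear model).
train_model <- lm(Energy ~ Waste + Agricultural + Industrial, data = emissions)

model_summary <- summary(train_model)
print(model_summary)

# Pieces worth interpreting in the write-up:
coef_table <- coef(model_summary)      # estimates, std. errors, t values, p-values
r_squared  <- model_summary$r.squared  # share of variance in Energy explained
```

When interpreting, each estimate is the expected change in the response for a one-unit increase in that predictor with the others held fixed, the `Pr(>|t|)` column suggests whether that effect is distinguishable from zero, and R-squared indicates how much of the variation the model accounts for.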

Validation

The model was built properly using cross-validation techniques. However, the portfolio does not explain why validation is being used. Adding that explanation would help show why we use createDataPartition while building a model.
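One possible way to show the reasoning is a short hold-out example like the sketch below. It assumes the caret package is installed and uses a simulated `emissions` data frame with hypothetical columns; it shows a single train/test split rather than the portfolio's exact procedure.

```r
library(caret)  # assumes the caret package is installed

# Hypothetical data: simulated values, not the portfolio's dataset.
set.seed(42)
n <- 100
emissions <- data.frame(Industrial = runif(n, 10, 100))
emissions$Energy <- 20 + 0.8 * emissions$Industrial + rnorm(n, sd = 5)

# Keep about 80% of rows for training. For a numeric outcome the sampling is
# stratified on quantile groups of Energy, so both subsets cover a similar range.
train_idx <- createDataPartition(emissions$Energy, p = 0.8, list = FALSE)
train_set <- emissions[train_idx, ]
test_set  <- emissions[-train_idx, ]

# Fit on the training rows only.
fit <- lm(Energy ~ Industrial, data = train_set)

# Score on rows the model never saw to estimate out-of-sample error.
pred <- predict(fit, newdata = test_set)
test_rmse <- sqrt(mean((test_set$Energy - pred)^2))
test_rmse
```

The point to make in the write-up is that error measured on the held-out rows is a fairer estimate of how the model would perform on new data than error measured on the rows used to fit it.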

R Proficiency

Appropriate techniques are used throughout the project, and all of the code is reusable and relevant. The only issue is that it would sometimes be more appropriate to describe the code beforehand instead of in a comment inside it. It's not much of a problem, but doing so would make the code a little easier to read.

Communication

The communication is pretty good throughout, and it was easy to understand what was going on. The graphs were also easy to read and cleanly formatted. The only negative thing I could say is that there should be a conclusion in deliverable1 to recap what was done and what you plan on doing in the next phase.

Critical Thinking

Yes, it does. One other consequence not discussed could be a change in who people vote for after seeing new data about the environment. They could decide that the environment is more important to them than other factors and switch away from who they would normally vote for.

Data Preparation and Modeling (15% out of 20%)

The data is clean and tidy, and the visuals are accurate for each dataset. I would say the weakest point of my portfolio is the modeling portion. It was the most difficult part for me to implement because I didn't have a clear goal in the beginning, and in my opinion I lacked a clear goal in general. This is also the portion that I understand the least.

Validation and Operationalization (19% out of 20%)

I understood operationalization and how to carry the project forward well; however, my understanding of how to implement the validation portion of this project was a bit foggy. A better understanding of validation would have greatly improved this project overall.

R Proficiency (20% out of 20%)

R was used with good techniques, and no loops were used.

Communication (20% out of 20%)

Communicating the data was the portion that I understood the most. Even when the mathematical portion wasn't very strong, I was still able to communicate any errors made.

Critical Thinking (20% out of 20%)

Applying my findings to real-world events or scenarios was what came easiest to me. My favorite part of the project was the social implications and operationalization. Bringing the data into the real world is what made the most sense to me, even when the calculations were off.

introdsci / DataScience-asheelamagwili

Final Review #8