STAT545-UBC-hw-2018-19 / hw06-rasiimwe

hw06-rasiimwe created by GitHub Classroom
0 stars 0 forks source link

Peer-Review HW-06 for @rasiimwe #3

Open zeeva85 opened 5 years ago

zeeva85 commented 5 years ago

Peer-Review HW-06 for @rasiimwe

Topic Excellent Satisfactory Needs Work
Coding style :heavy_check_mark:
Coding strategy :heavy_check_mark:
Presentation: graphs :heavy_check_mark:
Presentation: tables :heavy_check_mark:
Achievement, creativity :heavy_check_mark:
Ease of access :heavy_check_mark:

Remarks:

I am very familiar with your work as I have reviewed your fantastic work twice, you have implemented some of the tips I had, not sure if I am going to find anything but here it goes..

Elaborate on above, esp. for "needs work."

The coding style is as usual very presentable, concise and is great job overall. I dont see anything major, maybe a little more consistency with the =. Some places had white spaces some didnt have them. I personally would do with spaces for languages where it doesnt matter like R without totally perfect as long as it follows the same styling I guess. The coding stratergy is very straight forward and well thought out in in terms of work flow of how you are going to tackle a problem. A quick note on the geom_smooth(method = "lm", se = T), see below.

Some specific praise?

I CANNOT BELIEVE!! I did not see of making a ggplot function, as I completely agree with you that the lines for a ggplot is quite a bit of a work out. I specifically loved that fucntion, It is much more useful for quick plots and hats off to you for thinking of that. The work in terms of fucntion is flawless, I am sure if given much more time to explore you would have implemented all my tips on your own accord, Like I said I am very familiar with your impressive work.

WELL DONE !! 👍.

Something I learned? Something I know and that you, my peer, might like to know because it is relevant to something you struggled with.

I never knew about the I funnction in R which is the as it is function, I would think it would be useful in places where a function has a specific behaviour that coerces a data into a another kind of data. Normally we would avoid this by using the as...group-of-functions, but now I am aware that I can also use the the I to retain the type of object. I am aware its predominatly used for fomulas but, I count help myself and see what it does in a simple data frame example. Thank you I did learn alot from peer reviewing your work such a treat !! I guess I am too neerdy :D

df <- data.frame(stringi = I("seeva"))
df1 <- data.frame(stringi = "seeva")

str(df)
str(df1)

<!--tHE OUTPUT FROM CONSOLE-->

> str(df1)
'data.frame':   1 obs. of  1 variable:
 $ stringi: 'AsIs' chr "dog"
> str(df2)
'data.frame':   1 obs. of  1 variable:
 $ stringi: Factor w/ 1 level "dog": 1

Specific constructive criticism?

  1. Looking at the documentation and I realized that the se = T is by default so I dont setting this is neccesary:-
?geom_smooth()

geom_smooth(mapping = NULL, data = NULL, stat = "smooth",
  position = "identity", ..., method = "auto", formula = y ~ x,
  se = TRUE, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE)
  1. The second fucntion is sort of incomplete in terms of reproducibility, I would have to install the quantreg package for the function to work :-
Function arguments
df         the dataset
y          y-axis variable
x          x-axis variable
xlab       x-axis label
ylab       y-axis label
title      plot title

plot_model_fit <- function(df, y, x, xlab, ylab, title){

  liner_model_fit <- lm(y ~ x, data=df) 
  coef(liner_model_fit)
  quantile_regression_fit <- rq(y ~ x, data=df)
  least_squares_fit <- rlm(y ~ x, data=df)

  plot <- df %>% 
    ggplot(aes(x, y)) +                                 
    geom_point()+                                                   
    geom_smooth(method="lm", aes(colour="liner_model_fit"), se=TRUE) +        
    geom_smooth(method="rq", aes(colour="quantile_regression_fit"), se=FALSE) +           
    geom_smooth(method="rlm", aes(colour="least_squares_fit"), se=FALSE) +        
    labs(colour=NULL) +         
    theme_bw() +
    ggtitle(title) +
    theme( plot.title = element_text(hjust = 0.5))+
    labs(x=xlab, y=ylab)
  return(plot)
}

xlab <- "year"
ylab <- "gdpPercap"
title <- "Function Derived Plot - GDP Per Capita of Kuwait over the Years"
plot_model_fit(data, data$gdpPercap, data$year, xlab, ylab, title)

I think for the function its amazing you would have put so much thought into it, I think you could include just a line to install the package required for the function. Also If I remembered right from StackOverflow, there seems to be a dislike in using the library() in a function as its more prone to failure and hence require() is prefered as it ignores warnings (I think). From reading those forums, I also remembered that they would prefer to call the dependencies via its namespace rather than loading such as this.

This 

quantile_regression_fit <- quantreg::rq(y ~ x, data=df)

Instead of this 

quantile_regression_fit <- rq(y ~ x, data=df)
  1. Some thoughts on improving your ggplot function, leave in some default values so the user wouldn't have to key them in. This is because running the function without some that is not crucial such as title would result in an error or not providing TRUE for factors or jitter . This is a very simple to do. I've condensed your function to show how this can be done:-
You did this.

myplot <- function(x, type, x_variable, y_variable, factor1, jitter, order, xlab, ylab, title, plot2)

I believe it can be defaulted like this.

myplot <- function(x, type, x_variable, y_variable, factor1 = TRUE, jitter = FALSE, order, xlab, ylab, title = "", plot2)

If you want FALSE, you just key in FALSE in the order you wrote it.

  1. When you load your library withouy suppressing messages you would typically see the message below:-
library(tidyverse)

── Attaching packages ──────────────────────────────────── tidyverse 1.2.1 ──
✔ ggplot2 3.1.0     ✔ purrr   0.2.5
✔ tibble  1.4.2     ✔ dplyr   0.7.8
✔ tidyr   0.8.2     ✔ stringr 1.3.1
✔ readr   1.1.1     ✔ forcats 0.3.0
── Conflicts ─────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()

<!--You could have gotten away with this-->

suppressPackageStartupMessages(library(gapminder)) #loading the Gapminder excerpt
suppressPackageStartupMessages(library(quantreg))  #quantile regression
suppressPackageStartupMessages(library(MASS)) #Functions and datasets to support Venables and Ripley
suppressPackageStartupMessages(library(tidyverse)) #provides system of packages for data manipulation
suppressPackageStartupMessages(library(knitr))
suppressPackageStartupMessages(library(plotly))#for renderin interactive plots
suppressPackageStartupMessages(library(RColorBrewer))#provide color palette
suppressPackageStartupMessages(library(repurrrsive))
suppressPackageStartupMessages(library(tidytext))

or

not sure if this actually works maybe try it

PKG <- c("gapminder","quantreg", "MASS", "tidyverse", "knitr", "RColorBrewer", "repurrrsive", "tidytext")
lapply (PHG, library, character.only = T, quietly =T)

I am not sure if you know this but just incase, by loading the tidyverse package you attach the aforementioned attaching packages list, that includes "ggplot2", "dplyr", "tibble", "purrr", "stringr". You could have gotten away with 5 extra lines lesser when you use the use library(tidyverse).

"Tidyverse" package is a suite of packages from Hadley that was precompiled on frequently used packages for data manipulation and visualization. It makes it easier to just load the Tidyverse suite rather than the individual packages. When you load them separately like with your rmd I believe, your just reloading them. The only reason you would want to load them seprately from the top of my head is I guess if you have different versions that you would need for some reason maybe the old version had a function do someting useful for you, like a side behaviour, that was fixed or remove in the newer updates.

If you ever want to do that, you have to install.packages() with additional arguments specifying them to install on their non default location, as installing the packages using the default argument updates the "Tidyverse attaching" version too. Note that some packages the tidyverse suite is one version behind for instance ggplot, tidyr and dplyr. Try packageVersion() and check tidyverse attaching message.

library(devtools)
temp <- "~path/temp"
    dir.create(temp)
    with_lib(temp,
             install_github("tidyverse/ggplot2",
                            build_vignettes = TRUE))
    library(ggplot2, lib.loc = temp)
    packageVersion(ggplot2)

Kudos🥇, well done and Thank you @rasiimwe

Seevasant Indran

rasiimwe commented 5 years ago

This is coming in super late but thanks prof @zeeva85 for reviewing my work 😄. Thanks for taking the time for such a comprehensive review 👏 👏 I took note of your helpful comments. You deserve 1000% on your peer review!! 😄 Thanks Have a great weekend!

zeeva85 commented 5 years ago

😄 Thank you, hope you found those useful.

Well, doing a peer review on your hw or anyones hw, I always end up learning more than I could ever do in that one hour if i spent on my own homework, I never realized that before this class but now I do, because its an example on its own so its like looking a different cheatsheet.

Because in R there is so many ways to solve a problem, unlike in python where I believe the philosophy and they way the language is written is to keep it solvable one way or at least the least amount of way and with a much more readable syntax than R (don't quote me on this read it somewhere along the years).

So likewise thank you 🙏