The objective of this summer research is to create an R package that can execute parametric and non-parametric survival analysis techniques similar to those in Minitab.
The deliverable is a GitHub repository that contains the following:
An R package for the survival functions.
A log of hours spent on the summer research by each student, which includes date, hours, and activity summary.
A presentation to the Statistics Department.
A presentation at the CSM annual research conference.
A manuscript detailing the work, for submission to the R Journal, and an abstract for submission to the RStudio Conference.
1. Utilize GitHub to collaborate on project materials and updates.
Karl Broman's GitHub tutorial.
Jenny Bryan's Happy Git with R.
DataCamp's Introduction to Git for Data Science course (good for learning the command line; not necessary if you use the RStudio IDE).
Also check out using version control with RStudio and this video on Git and RStudio.
2. Adhere to good programming practices.
Write all R code according to Hadley Wickham's Style Guide.
Use the tidyverse style guide as an additional reference.
Learn how to write R functions from DataCamp's Writing functions in R course.
Use Hadley Wickham's R for Data Science book as a reference (Chapter 19 also discusses functions).
3. Create an R package that contains survival functions. At a minimum, this should be downloadable through devtools; as time allows, consider putting it on CRAN.
DataCamp's blog post R Packages: A Beginner's Guide
Hillary Parker's blog post Writing an R package from scratch
Hadley Wickham's R Packages book.
RStudio's video on Package writing in RStudio.
4. Provide documentation for the R package.
Use the roxygen2 package to document code.
Write a vignette to accompany the package.
Consider using pkgdown to create a website.
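As a minimal sketch of what roxygen2-style documentation looks like, the example below documents a small function with `#'` comments. The function name `weibull_surv()` and its arguments are hypothetical, chosen only for illustration; they are not part of any existing package or of this project's required interface.

```r
#' Weibull survival probability
#'
#' Hypothetical example function used to illustrate roxygen2 comments.
#'
#' @param t Numeric vector of non-negative times.
#' @param shape Weibull shape parameter (same meaning as in stats::pweibull).
#' @param scale Weibull scale parameter (same meaning as in stats::pweibull).
#'
#' @return Numeric vector of survival probabilities S(t) = P(T > t).
#'
#' @examples
#' weibull_surv(c(1, 5, 10), shape = 2, scale = 10)
#'
#' @export
weibull_surv <- function(t, shape, scale) {
  # Basic input checks so that misuse fails early with a clear error
  stopifnot(is.numeric(t), all(t >= 0), shape > 0, scale > 0)
  # Upper tail of the Weibull CDF is the survival function
  stats::pweibull(t, shape = shape, scale = scale, lower.tail = FALSE)
}
```

When a function like this lives in the package's `R/` directory, running `devtools::document()` turns the comments into a help file under `man/` and updates `NAMESPACE`.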
5. Review existing R packages for survival analysis.
A comprehensive list: CRAN Task View: Survival Analysis.
Pay attention to: survival, fitdistrplus, flexsurv, and survminer.
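When reviewing these packages, it can help to run a small analysis in each one to see what is already covered. The sketch below fits a Kaplan-Meier estimate with the survival package, using its built-in `lung` data set purely as a convenient example; the choice of data set is an assumption for illustration and is not tied to the labs.

```r
library(survival)

# Kaplan-Meier estimate of the overall survival curve.
# In `lung`, `time` is the follow-up time in days and `status` is coded
# 1 = censored, 2 = dead; Surv() handles this 1/2 coding.
km <- survfit(Surv(time, status) ~ 1, data = lung)

# Summary table with number of subjects, number of events,
# and the estimated median survival time with its confidence limits
print(km)

# Step plot of the estimated survival function with confidence bands
plot(km, xlab = "Days", ylab = "Estimated survival probability")
```

Comparing output like this with the corresponding Minitab output is one way to decide which features still need new functions.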
6. Create functions for tasks that are not currently easy to accomplish in R (first make sure that they cannot be accomplished with the existing R packages). Pay attention to how each distribution is parameterized, which often differs between Minitab and R.
Parametric:
Lab 2:
Lab 3:
Non-parametric:
Lab 4:
Lab 5:
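Parameterization differences also show up within R itself, which is a useful warm-up before comparing against Minitab. As a sketch: `survival::survreg()` reports a Weibull fit on the log-time scale (an intercept and a scale), while `rweibull()` and `dweibull()` use shape and scale. The conversion is shape = 1 / (survreg scale) and scale = exp(intercept). The simulated data, sample size, and censoring time below are assumptions made only for this illustration; how Minitab reports its estimates should be checked against Minitab's own documentation.

```r
library(survival)

set.seed(1)

# Simulate Weibull lifetimes with known shape and scale
# (values chosen arbitrarily for illustration)
n <- 2000
true_shape <- 2
true_scale <- 10
t_true <- rweibull(n, shape = true_shape, scale = true_scale)

# Right-censor every lifetime at a fixed time
cens_time <- 15
time <- pmin(t_true, cens_time)
event <- t_true <= cens_time  # TRUE = failure observed, FALSE = censored

# Intercept-only Weibull fit; survreg works on the log-time scale
fit <- survreg(Surv(time, event) ~ 1, dist = "weibull")

# Convert survreg's parameters to the shape/scale form used by dweibull()
shape_hat <- 1 / fit$scale
scale_hat <- exp(unname(coef(fit)[1]))

# Compare the converted estimates with the values used in the simulation
print(c(shape = shape_hat, scale = scale_hat))
```

A conversion helper of this kind is a natural candidate for a package function, since it lets users read R output in the same form as the software they are comparing against.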
7. Consider also converting some of the functions to a Shiny App.
8. Other potentially useful DataCamp courses.