shannonpileggi / SR--Pang--BU

Summer Research, 2017

Student: Corey Pang

Faculty advisor: Shannon Pileggi

Objective

Utilize dimension reduction techniques to analyze data related to Buruli ulcer.

Specific Aims

  1. Utilize GitHub to collaborate on project materials and updates.
  2. Write all R code according to Hadley Wickham's Style Guide (http://adv-r.had.co.nz/Style.html). All R code should be written in a reproducible manner, such that the code will execute when applied to a new data set.
  3. Literature review
    • Read Merritt et al 2005, Merritt et al 2010 and Pileggi et al 2017 for background information on Buruli ulcer.
    • Read Wagner, Campbell, and Wu for other examples of landscape and spatial analysis related to Buruli ulcer.
    • Summarize the techniques and approaches used to analyze Buruli ulcer data, with special attention to any dimension reduction techniques like PCA. Identify other similar papers.
  4. Take the "Unsupervised Learning in R" course through DataCamp (https://www.datacamp.com/courses/unsupervised-learning-in-r) and the "Introduction to Machine Learning" course through DataCamp (https://www.datacamp.com/courses/introduction-to-machine-learning-with-r). Recommended DataCamp course for enhancing R programming: "Writing Functions in R" (https://www.datacamp.com/courses/writing-functions-in-r).
  5. Other sites that might be useful for reading about PCA include:
  6. Investigate the ggbiplot package to enhance visualization of PCA results (https://github.com/vqv/ggbiplot).
  7. Apply PCA or other appropriate dimensionality reduction techniques to two data sets regarding Buruli ulcer.
    • Point-level data: This data set contains information on 98 water sites in Ghana tested for Mycobacterium ulcerans (corresponds to the Pileggi et al 2017 paper). Utilize PCA on the water variables and the land use/land cover variables.
    • Regional data: This data set summarizes cases of Buruli ulcer at the regional level. Utilize PCA on the land use/land cover variables.
  8. Utilize the results of PCA to create models for association with, or prediction of, Buruli ulcer cases or Mycobacterium ulcerans (MU) positivity.
  9. Provide a thorough write-up and discussion of results.
  10. Take initiative at any point to recommend other directions for analysis.
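
As a rough illustration of aims 7 and 8, the sketch below runs PCA and then fits a logistic regression on the leading component scores. It is only one possible approach, not the project's actual analysis. The data are simulated placeholders, and the column names (forest, cropland, wetland, urban, mu_positive) are hypothetical. Only the count of 98 sites comes from the description above.

```r
# Simulated placeholder data, NOT the Buruli ulcer data sets.
# All variable names below are hypothetical.
set.seed(2017)
n_sites <- 98

land_cover <- data.frame(
  forest   = runif(n_sites),
  cropland = runif(n_sites),
  wetland  = runif(n_sites),
  urban    = runif(n_sites)
)
mu_positive <- rbinom(n_sites, size = 1, prob = 0.3)

# PCA on centered and scaled predictors.
pca_fit <- prcomp(land_cover, center = TRUE, scale. = TRUE)

# Proportion of variance explained by each component.
var_explained <- pca_fit$sdev^2 / sum(pca_fit$sdev^2)

# Keep the first two component scores as predictors.
scores <- as.data.frame(pca_fit$x[, 1:2])
scores$mu_positive <- mu_positive

# Logistic regression of the binary outcome on the component scores.
pc_model <- glm(mu_positive ~ PC1 + PC2, data = scores, family = binomial)
summary(pc_model)
```

Scaling matters here because land cover and water variables are likely to be measured on different scales. The choice of two components is arbitrary in this sketch and would normally be guided by var_explained or a scree plot.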