DS4PS / cpp-528-spr-2020

Course shell for CPP 528 Foundations of Data Science III for Spring 2020.
http://ds4ps.org/cpp-528-spr-2020/

How do I source functions in R? #19

Open cenuno opened 4 years ago

cenuno commented 4 years ago

What does it mean to separate R logic in another file?

cenuno commented 4 years ago

I'm going to refer to the code used in this reference:

fahrenheit_to_celsius <- function(temp_F) { 
    temp_C <- (temp_F - 32) * 5 / 9 
    return(temp_C)
}

Let's assume you've stored that function in a file called analysis/functions/utilities.R.

Separating that logic from your analyses is helpful because a dedicated R file for your custom functions gives you three benefits:

  1. others can see how the function was created if they are interested, without that detail cluttering your analysis
  2. you can use the functions stored in analysis/functions/utilities.R at any time by using the source() function. The source() function evaluates all R logic in the given file path and, by default, creates the resulting objects in your Global Environment.
  3. you can easily share logic with teammates! You no longer need to share an entire .Rmd (which is usually filled with narrative and data visualization) in order to share custom functions. You can now pass a .R file to someone that immediately gives them enough context to use the functions in that file.

Now, back in your .Rmd file, create a new chunk that sources your analysis/functions/utilities.R file so that you can use the fahrenheit_to_celsius() function.

# load here (builds paths from the project root) and the necessary functions
library(here)
source(here("analysis/functions/utilities.R"))

Now the function fahrenheit_to_celsius() is available in your Global Environment (check out the top right hand panel in RStudio for further evidence).

You (the user) can now supply the necessary input (temp_F) that the fahrenheit_to_celsius() function requires.
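To make the steps above concrete, here is a minimal sketch that can be run anywhere. It assumes a temporary file stands in for analysis/functions/utilities.R; the name utilities_path and the two example temperatures are made up for illustration.

```r
# Minimal sketch: write the function to a temporary .R file, then source() it.
# utilities_path is a hypothetical stand-in for analysis/functions/utilities.R.
utilities_path <- tempfile(fileext = ".R")

writeLines(
  c(
    "fahrenheit_to_celsius <- function(temp_F) {",
    "    temp_C <- (temp_F - 32) * 5 / 9",
    "    return(temp_C)",
    "}"
  ),
  con = utilities_path
)

# source() evaluates the file; by default the objects it creates
# are placed in the Global Environment
source(utilities_path)

# supply the required input (temp_F)
freezing_C <- fahrenheit_to_celsius(temp_F = 32)   # 0 degrees Celsius
boiling_C  <- fahrenheit_to_celsius(temp_F = 212)  # 100 degrees Celsius
print(c(freezing_C, boiling_C))
```

In a real project you would skip the writeLines() step and point source() at the existing utilities.R file instead.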

lecy commented 4 years ago

You can also use these steps to customize your R environment at start-up.

There is a user profile file that runs on start-up. You can add specific scripts to automate things that you find yourself doing in every script, such as loading dplyr or reading in a template for graphics like pairs().

https://stat.ethz.ch/R-manual/R-devel/library/base/html/Startup.html
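As a rough sketch, a profile file might contain lines like the ones below. The file name ~/.Rprofile and both settings are assumptions chosen for illustration; the Startup page linked above describes where R actually looks for profile files.

```r
# Hypothetical contents of a user profile file such as ~/.Rprofile.
# Both settings are illustrative assumptions, not recommendations.

# set a session-wide option
options( digits = 4 )

# attach dplyr at start-up, but only if it is installed
if( requireNamespace( "dplyr", quietly = TRUE ) ){
    suppressPackageStartupMessages( library( dplyr ) ) }
```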

The only problem with this approach is that it is specific to a machine since the file is stored in the folder where R is installed. Members of a team would not be able to use this, for example.

Alternatively, you can create an R startup file on GitHub and add all of your desired default settings:

https://github.com/DS4PS/sourcer-r/blob/master/sourcer.R

Then source the raw version of this file at the beginning of your scripts:

# source means read and execute R script
source( "https://raw.githubusercontent.com/DS4PS/sourcer-r/master/sourcer.R" )

One custom function I have found especially useful is something to customize the basic plot function by using your favorite default aesthetics:

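jplot <- function( x1, x2, lab1="", lab2="", draw.line=TRUE, ... )
{

    # scatterplot with preferred defaults; ... forwards any extra arguments to plot()
    plot( x1, x2,
          pch=19,
          col=gray( 0.6, alpha = 0.2 ),
          cex=3.5,
          bty = "n",
          xlab=lab1,
          ylab=lab2,
          cex.lab=1.5,
          ... )

    # add a lowess smoother, using only points where both values are finite
    if( draw.line ){
        ok <- is.finite(x1) & is.finite(x2)
        lines( lowess(x2[ok]~x1[ok]), col="red", lwd=3 ) }

}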

The three dots (...) capture any additional arguments the user supplies and pass them on to plot(), which lets the user set plotting parameters that are not explicitly named in the function.
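Here is a small sketch of how the dots get forwarded. The wrapper name my_plot and the use of the built-in mtcars data are assumptions for illustration only; pdf( NULL ) is used so the example runs without opening a plot window.

```r
# Hypothetical wrapper: fixes two defaults and forwards everything else to plot()
my_plot <- function( x, y, ... )
{
    plot( x, y, pch=19, bty="n", ... )
}

# draw to a null device so nothing is written to disk or shown on screen
pdf( NULL )

# main, xlab, and ylab are not named in my_plot(); they travel through ... to plot()
my_plot( mtcars$wt, mtcars$mpg, main="Weight vs. MPG", xlab="wt", ylab="mpg" )

invisible( dev.off() )
```

Because main, xlab, and ylab are handled by plot() itself, the wrapper never needs to list them.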

I would keep your start-up script and any project-based scripts separate. The utilities.R file should store functions that are specific to a project, such as the functions that help you process the census meta-data in Lab 03.

sunaynagoel commented 4 years ago

Thank you @cenuno and @lecy, this answers one of my questions.

castower commented 4 years ago

@cenuno would this also be the strategy to run the correlation plots from Lab-02? I've been reviewing feedback from that lab on ways to improve my .Rmd file, and I noticed that it was suggested that I store the function code in a separate R file.

cenuno commented 4 years ago

Yes that is correct 👍🏽

— Cristian E. Nuno