cssearcy / AYS-R-Coding-SPR-2020

Coding in R for Policy Analytics
https://cssearcy.github.io/AYS-R-Coding-SPR-2020/

Different result on R Console vs. when Knit lab 01 #35

Open eneleven11 opened 3 years ago

eneleven11 commented 3 years ago

After doing Lab 01, I was wondering whether anyone else has had the same issue as me: a result showed up as one number in the console, but the answer was different when I knit the document.

I was about to post a screenshot of the occurrence, but when I knit it now, there are large chunks of blank text.

Was wondering if anyone has advice on this phenomenon.

jamisoncrawford commented 3 years ago

Hi @eneleven11 - this makes sense. The objects that contain data in your Global Environment may have the same name as objects in your R Markdown script, but they can be totally different.

For example, say I've run the following expression and assigned the value 5 to object x in the Global Environment (just chilling in RStudio nbd):

x <- 5

Alright, now if we were to print it in RStudio:

print(x)

That just comes out to 5.

Now, if we put this in our R Markdown document:

x <- 5
x <- x * 2
print(x)

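... the knitted version will say x equals 10, even though, locally in RStudio, it's still equal to 5. That's because the Knit button runs the document in a fresh R session, so it neither sees nor changes the objects in your Global Environment.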
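If it helps to see the two copies of x side by side, here is a rough sketch you can run in the console. It only imitates a knit by evaluating the document's lines in a separate environment; doc_env and knitted_x are names made up for this illustration, and a real knit uses a separate R session rather than an environment like this.

```r
# Console side: x lives in the Global Environment.
x <- 5

# Stand-in for knitting: evaluate the document's code in its own environment.
# (doc_env is a hypothetical name used only for this illustration.)
doc_env <- new.env()
evalq({
  x <- 5
  x <- x * 2
}, envir = doc_env)

# Pull out the value the "document" computed.
knitted_x <- get("x", envir = doc_env)

print(knitted_x)  # the document's x: 10
print(x)          # the console's x: still 5
```

Same name, two different objects, and neither one touches the other.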

Does this help?

eneleven11 commented 3 years ago

Oh! That makes sense! Thank you!

So I'm guessing it would be best practice to clear the global environment every time a new project/lab is done?

lecy commented 3 years ago

You can think of the global environment as the rough draft of your data recipe - it's where you experiment.

Your R script or RMD doc is the final version, and it should contain every step in the recipe. You don't want instructions that start with mixing the ingredients but leave out the step that specifies the amount of each one, so that you end up using whatever amount of flour was left over from the last time you were baking. R tries to be clever about guessing inputs when you are not explicit, which is often helpful, but it can be a disaster when it guesses the wrong values and your analysis comes out completely wrong as a result.
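One way the leftover flour can sneak in, as a minimal sketch: a function that uses a name it was never given will look that name up in the surrounding environment. The object and function names here (flour_cups, batch_implicit, batch_explicit) are invented for the example, and this is only one of several ways R can fill in a value you did not specify.

```r
# Left over from an earlier experiment (hypothetical value).
flour_cups <- 10

# Implicit: flour_cups is not an argument, so R looks it up outside the
# function and finds whatever happens to be there.
batch_implicit <- function(batches) {
  batches * flour_cups
}

# Explicit: the amount is an input the caller has to supply.
batch_explicit <- function(batches, flour_cups) {
  batches * flour_cups
}

implicit_total <- batch_implicit(3)                  # uses the leftover 10
explicit_total <- batch_explicit(3, flour_cups = 2)  # uses the stated 2

print(implicit_total)
print(explicit_total)
```

The implicit version runs without complaint, which is exactly the problem: nothing tells you it used the old value.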

In general it's best to never save active R environments, because new sessions of R will then open with residual objects from previous sessions. Those leftovers are an easy way to introduce errors into your computations.

Once you get into more complicated scripts, you might need to assign a handful of placeholder values to arguments so that you can test the script. You don't want these to linger, because they are likely not the values you want to use in your final version.
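On the question of clearing things out, a small sketch of doing it by hand. The object names are hypothetical. Note that rm() only removes objects; it does not unload packages or reset options, so restarting R is the more thorough reset.

```r
# A residual object from earlier work (hypothetical).
leftover_result <- 42
was_there <- exists("leftover_result")

# Remove every non-hidden object from the current environment.
rm(list = ls())

# The leftover is gone, so code that depended on it would now fail loudly
# instead of quietly using a stale value.
still_there <- exists("leftover_result")
print(still_there)
```

If you start R from a terminal, the `--no-save` and `--no-restore` flags keep the workspace from being written on exit or reloaded at startup.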

These are called "ghosts in the machine" - instances where the program is running but the results are not what is expected because the program is doing something unexpected - like using old objects from previous sessions.

jamisoncrawford commented 3 years ago

Thanks @lecy!