Closed rudeboybert closed 8 years ago
@mine-cetinkaya-rundel and I were discussing the same thing and couldn't come up with anything. I agree that having some sort of "let's load in the data" command is helpful, since otherwise we undermine the "you can only refer to objects that are in your environment" line. That promise thing bugs me though. I guess doing
library(openintro)
data(county)
View(county)
is one solution, though that capital V really bugs me too! Looks like it's in the base utils package. We could write a view() function, but then we run into that problem of breaking R for students the first time they try to load data without the openintro package.
Are there any other good alternatives to View()? str()?
I would prefer to leave View() out of the labs. I don't teach View() as a function to be typed ever, because it doesn't work in an R Markdown file.
We need to leave it out of the lab so that I can later push for adding instructions to the labs for the students to use R Markdown for their reports :) (Labs I use in my class are always modified versions of OpenIntro labs that also contain some instructions on writing up the report in R Markdown.)
Yes, I've been wrestling with this as well. I made the mistake of typing the name of my dataset to get it to print to the console in the first week of class, and now students see that as a sort of voodoo for getting the <Promise> to go away.
This is generally okay (R terminates after 1000 rows in the console), but it has led to them putting the same thing into their labs, which quickly becomes a Bad Thing as RMarkdown to HTML will balk at "big" data and the error message is hard for them to interpret.
Some of this is pedagogical, of course (I should have shown head(arbuthnot) as my first step on day one), but it would be great if students could somehow run data(arbuthnot) and have it load automatically. Maybe this is actually an RStudio feature request? Some sort of preference that would specify that data shouldn't be lazy-loaded in that way, and could be unchecked when not doing education stuff.
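For what it's worth, the "run data() and have it load right away" idea can be sketched in plain R without touching data() itself. The helper name load_now() below is made up for illustration (it is not part of openintro, oilabs, or RStudio), and it uses the built-in iris data so it runs without extra packages. It calls data() and then reads the object once, which evaluates the lazy-load promise.

```r
# Hypothetical helper (not part of any package): load a data set with
# data() and immediately evaluate it so it is no longer an unevaluated promise.
load_now <- function(name, package = NULL, envir = globalenv()) {
  # data() accepts data set names as character strings through `list =`
  utils::data(list = name, package = package, envir = envir)
  # get() returns the value of the object, which forces the promise
  invisible(get(name, envir = envir))
}

# Built-in data set used here so the sketch runs without extra packages
load_now("iris")
```

The catch is that students would have to quote the name, i.e. load_now("iris") rather than data(iris), so this is a workaround rather than a fix.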
I don't think RStudio will (or should) be in the business of changing the behavior of core R functions like data(). And I don't expect that core R is eager to change the behavior of data(). Here are some choices:
dim()
head()
glimpse()
I generally use one of those three each time I encounter a data set in class.
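As a rough sketch of what that first look might be in a lab, here are the three calls on the built-in iris data (used only so the example runs without openintro installed). glimpse() comes from dplyr, so it is guarded in case dplyr is not installed. Any of these calls also evaluates the object, so it stops showing as a promise afterwards.

```r
data(iris)

# Number of rows and columns
dim(iris)

# First few rows only, which stays small even when knitting to HTML
head(iris, n = 3)

# glimpse() lives in dplyr; skip it quietly if dplyr is not available
if (requireNamespace("dplyr", quietly = TRUE)) {
  dplyr::glimpse(iris)
}
```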
Another option would be to write a function something like fetchData(), which used to be in the mosaic package (and now lives in the fetch package). fetchData() supports an assignment syntax:
foo <- fetchData(county)
and is designed to pull data from packages, but also from files and URLs.
All good thoughts. I'm inclined to go with
data(county)
head(county)
for the general OpenIntro version of the labs. Something like str() is more informative, so maybe I'll use that instead of head() in my own labs, but for a broader audience, it might be difficult to sift through all that information.
We opted to change the package so that the data isn't lazy loaded. Should be ok since none of the data is too large. See oilabs commit.
The most intuitive way to me is
data.frame(data(iris))
You will see the object "iris" in the Environment.
Hello,
Thank you very much for sharing this knowledge! It helped me a lot. I used the data(diamonds) command to load the "diamonds" dataset. However, I was not able to open the dataset and can only see
("diamonds
Thank you very much!
@beanumber and I have discussed this previously.
If you load a data set via data() in RStudio, all you get in the Environment panel is a <Promise> of the data set, i.e. if you click the variable name, the Data Viewer spreadsheet does not pop up. It is only after you run some function on the object that you get this ability. Students are extremely puzzled by this fact.
I'm starting to lean more on the Data Viewer than commands like names(), head(), or tail(), as it gets students close to the data with all layers of abstraction removed. Also, students love the sort and filter functionality.
One could argue that:
data() is unnecessary. However, I still like explicitly having students run data() to make things seem less magical.
students could use the View() command, but I find clicking on the variable name in the Environment panel quicker. And why the uppercase V?
Do any of you know of a solution to this problem? i.e. students run data() and should immediately be able to load the Data Viewer by clicking the variable name in the Environment panel? A cursory Google search seems to suggest the lazy eval nature of R might preclude a solution.