lecy / foundations-of-data-science-for-the-public-sector

Lecture notes and labs for an introductory data science course for public policy and nonprofit management students

installing packages issue #2

Open lindsaymikel opened 6 years ago

lindsaymikel commented 6 years ago

I follow the instructions to download the Lahman package with the following code:

install.packages( "Lahman" )

I keep receiving this:

The downloaded binary packages are in
    /var/folders/97/r8jjhgb1541d9pghl0hx_b840000gn/T//Rtmp7oqKdF/downloaded_packages

I expected from the lecture 00 notes to find this:

package ‘Lahman’ successfully unpacked and MD5 sums checked

I tried the following command to see if it downloaded:

ls( "Lahman" )

I received:

Error in as.environment(pos) : no item called "Lahman" on the search list

Can someone help me? I'm not sure what I'm doing wrong. Thanks!

lecy commented 6 years ago

I just installed the Lahman package and got this message:

> install.packages( "Lahman" )
Installing package into ‘C:/Users/jdlecy/Documents/R/win-library/3.4’
(as ‘lib’ is unspecified)
--- Please select a CRAN mirror for use in this session ---
trying URL 'https://mirrors.sorengard.com/cran/bin/windows/contrib/3.4/Lahman_6.0-0.zip'
Content type 'application/zip' length 7626978 bytes (7.3 MB)
downloaded 7.3 MB

package ‘Lahman’ successfully unpacked and MD5 sums checked

The downloaded binary packages are in
        C:\Users\jdlecy\AppData\Local\Temp\Rtmpims5Kk\downloaded_packages

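The last line is the same message you are receiving, so your download completed. The "successfully unpacked" line shows up in my Windows output; its absence on your Mac does not mean the install failed.

The ls() function lists the objects in your environment, such as data sets that have been loaded. It will not load a package, though.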

Try:

library( "Lahman" )

The Lahman package is an unusual package because it consists almost entirely of data sets. After it is loaded, you can check on available datasets like this:

ls()  # list loaded datasets - there are none
# character(0)
data( package="Lahman" )    # shows all available datasets in Lahman
data( Teams )    # loads Teams dataset
ls()   # now you can see that the dataset is loaded in your environment
# [1] "Teams"
head( Teams )   # print first six rows of the dataset
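
If you want to confirm the install worked without relying on console messages, something like the following should do it. This is a sketch rather than part of the lecture notes; pkg and is_installed are just names chosen for illustration.

```r
# Check whether a package is installed, independent of install messages.
pkg <- "Lahman"   # illustrative variable name, not from the lecture

# requireNamespace() returns TRUE if the package can be loaded, FALSE if not.
is_installed <- requireNamespace( pkg, quietly = TRUE )

if ( is_installed ) {
  print( packageVersion( pkg ) )   # show which version is installed
} else {
  message( pkg, " is not installed; try install.packages( \"", pkg, "\" )" )
}
```

requireNamespace() does not attach the package to your session, so you still need library( "Lahman" ) before using the datasets.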

lindsaymikel commented 6 years ago

I still seem to be missing something:

> install.packages( "Lahman" )
--- Please select a CRAN mirror for use in this session ---
trying URL 'https://cran.cnr.berkeley.edu/bin/macosx/mavericks/contrib/3.3/Lahman_6.0-0.tgz'
Content type 'unknown' length 7626828 bytes (7.3 MB)
==================================================
downloaded 7.3 MB

The downloaded binary packages are in
    /var/folders/97/r8jjhgb1541d9pghl0hx_b840000gn/T//RtmpWWCHsm/downloaded_packages

Nothing happens when I enter the following:

> library( "Lahman" )
> data( "Lahman" )

I do not receive the following:

package ‘Lahman’ successfully unpacked and MD5 sums checked

lecy commented 6 years ago

Try:

library( "Lahman" )
data( package="Lahman" )
data( "Teams" )
ls()

Does it load the Teams dataset?

lindsaymikel commented 6 years ago

This seems to have worked?

> install.packages( "Lahman" )
--- Please select a CRAN mirror for use in this session ---
trying URL 'https://cran.cnr.berkeley.edu/bin/macosx/mavericks/contrib/3.3/Lahman_6.0-0.tgz'
Content type 'unknown' length 7626828 bytes (7.3 MB)
==================================================
downloaded 7.3 MB

The downloaded binary packages are in
    /var/folders/97/r8jjhgb1541d9pghl0hx_b840000gn/T//RtmpuOpU71/downloaded_packages
> library( "Lahman" )
> data( package="Lahman" )
> data( "Teams" )
> ls()
[1] "Teams"

I'm not sure where to go from here?

lecy commented 6 years ago

It worked!

The first lesson is that you can load datasets using the data() command.

From there you need to figure out how to manipulate these datasets.

Read Lecture 01 and pay special attention to these functions:

Functions for the Day

Navigation

dir()                            # list all files in the dir.
getwd()                          # get the current working dir.
setwd( "path" )                  # set the working directory
ls()                             # list all active objects

Packages

install.packages( "package.name" )     # install a package
library( "package.name" )              # load a package
data( "dataset.name" )                 # load a dataset
names( dataset.name )                  # lists variables within a dataset
head( dataset.name )                   # prints first six lines
edit( dataset.name )                   # open the dataset in an editor (note no quotes)

Dataset Characteristics

names( data.frame )             # which variables are in the set?
nrow( data.frame )              # how many observations are there?
dim( data.frame )               # prints number of rows and columns
summary( data.frame )           # print descriptive stats
table( categoric.variable )     # counts for each category

Accessing a Variable within a Dataset

$ - data.frame$variable.name,
e.g. USArrests$Murder
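
To see these functions in action without installing anything, here is a quick sketch using USArrests, a dataset that ships with R (note that its column is named Murder). The choice of dataset is only for illustration and is not taken from the lecture.

```r
data( "USArrests" )            # load a built-in dataset
names( USArrests )             # "Murder" "Assault" "UrbanPop" "Rape"
nrow( USArrests )              # 50 observations, one per state
dim( USArrests )               # 50 rows and 4 columns
head( USArrests )              # first six rows
summary( USArrests$Murder )    # descriptive stats for one variable
getwd()                        # where R is currently reading and writing files
```

Once these make sense on USArrests, the same calls work on Teams after you load it with data( "Teams" ).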

lindsaymikel commented 6 years ago

Great. I just wasn't sure if it loaded since I didn't receive this magical message:

package ‘Lahman’ successfully unpacked and MD5 sums checked

Maybe it's because mine isn't the most up-to-date version.

lecy commented 6 years ago

I was getting ahead by a week - for this week you are summarizing variables.

Note that each data set consists of a set of variables:

names( Teams )

To reference a specific variable you need the $ operator:

Teams$teamID

To get summary statistics you can use the summary() function for numeric variables, and the table() function for categorical variables.

summary( Teams$ERA )
table( Teams$yearID )
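
A rough way to remember which function goes with which kind of variable is sketched below. summarize_variable is a hypothetical helper, not something from the lecture or the Lahman package, and it uses the built-in iris dataset so it runs even if Lahman is not installed.

```r
# Hypothetical helper: summary() for numeric variables, table() otherwise.
summarize_variable <- function( x ) {
  if ( is.numeric( x ) ) summary( x ) else table( x )
}

summarize_variable( iris$Sepal.Length )   # numeric: min, quartiles, mean, max
summarize_variable( iris$Species )        # categorical: count per species

# The same idea applied to Teams, only if Lahman is installed.
if ( requireNamespace( "Lahman", quietly = TRUE ) ) {
  data( "Teams", package = "Lahman" )
  print( summarize_variable( Teams$ERA ) )
  print( head( summarize_variable( Teams$teamID ) ) )
}
```

Note that yearID is stored as a number, so this helper would call summary() on it. table( Teams$yearID ) is still the better choice when you want a count of teams per year.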