Watts-College / cpp-528-spr-2022

https://watts-college.github.io/cpp-528-spr-2022/

Week 3 Data Cleaning #9

Closed jmacost5 closed 2 years ago

jmacost5 commented 2 years ago

Hello everyone, I am stuck on the first part of the lab. I keep getting errors when I run the code to clean the data.

#
# Author:     Jesse Lecy
# Maintainer: Cristian Nuno
# Date:       March 21, 2021
# Purpose:    Create custom functions to pre-process the LTDB raw data files
#

# load necessary functions ----
# note: all of these are R objects that will be used throughout this .rmd file
import::here("build_year",
             "RELEVANT_FILES",
             "obtain_crosswalk",
             "create_final_metadata_file",
             # notice the use of here::here() that points to the .R file
             # where all these R objects are created
             .from = here::here("utilitieslabs/wk03/utilities.R"),
             .character_only = TRUE)
Error in loadNamespace(name) : 
  there is no package called ‘/Users/jestrii98/utilitieslabs/wk03/utilities.R’

# for each relevant file, run the build_year() function ----
# note: this populates the data/rodeo/ directory with clean files
for (relevant_file in RELEVANT_FILES) {
  print(paste0("Starting on ", relevant_file[["year"]]))
  build_year(fn1 = relevant_file[["fullcount"]],
             fn2 = relevant_file[["sample"]],
             year = relevant_file[["year"]])
  if (relevant_file[["year"]] < 2010) {
    print("Finished! Moving onto the next decade.")
  } else {
    print("Finished! No more data to parse.")
  }
}
[1] "Starting on 1970"
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :

# load the crosswalk ----
# note: this stores a copy in the data/raw/ directory
cw <- obtain_crosswalk()
Error in gzfile(file, mode) : cannot open the connection
In addition: Warning message:
In gzfile(file, mode) :

# create the final meta data file ----
# note: this stores a copy within the data/rodeo/ directory
create_final_metadata_file(file_names = RELEVANT_FILES,
                           crosswalk = cw)
Error in dplyr::select(crosswalk, -countyname, -state) : 
  object 'cw' not found

# end of script #

(This comment has been edited for improved readability. ~@yukicruz )

u12345 commented 2 years ago

@jmacost5, please read the following page; it specifically addresses your first error.
https://watts-college.github.io/cpp-528-spr-2022/labs/lab-03-tutorial.html

u12345 commented 2 years ago

@jmacost5, try fs::dir_tree() to view the directory structure for the project.

Johaning commented 2 years ago

@jmacost5, try fs::dir_tree() to view the directory structure for the project.

Can you elaborate on how/where to use fs::dir_tree()? I'm having the same problem (I just wrote my question up as issue #11). Thank you!

u12345 commented 2 years ago

When you knit an .Rmd file, the file has to be at the top level of the directory structure. If the .Rmd file sits in a lower-level directory, some people get an error. By running that command, you can make sure that your file is in the right directory. Ujitha
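To make the directory check concrete, here is a minimal sketch using only base R. It builds a throwaway project in a temporary folder (the layout and the name demo_root are assumptions for illustration, not the lab's actual repo) and uses list.files() as a rough stand-in for fs::dir_tree(). The idea is to compare the path you pass to .from against what is actually on disk.

```r
# Hypothetical project layout, created in a temporary folder for illustration.
# In practice the root would be your own RStudio project folder.
demo_root <- tempfile("demo-project-")
dir.create(file.path(demo_root, "labs", "wk03"), recursive = TRUE)
dir.create(file.path(demo_root, "data", "raw"), recursive = TRUE)
file.create(file.path(demo_root, "labs", "wk03", "utilities.R"))

# Base-R stand-in for fs::dir_tree(): list every file below the root,
# as paths relative to the root
project_files <- list.files(demo_root, recursive = TRUE)
print(project_files)

# The relative path given to .from should match one of the listed files
utilities_found <- "labs/wk03/utilities.R" %in% project_files

# A mistyped path, like the one in the first error message, will not match
typo_found <- "utilitieslabs/wk03/utilities.R" %in% project_files
```

If the path you typed does not appear in the listing, either move the file or change the path so that the two agree.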



Johaning commented 2 years ago

I'm still stuck on this. Running fs::dir_tree() shows my directory structure, but I'm not sure what to do with that.

The first section of code in the project_data_steps.R file has the line .from = here::here("labs/utilities.R"). I think this should instead be "labs/wk03/utilities.R", since I saved the utilities.R file into the wk03 folder, not just the labs folder.

When I run it as written, I get:

Error in loadNamespace(name) : there is no package called ‘C:/Users/johan/RStudio-all/CPP-528/labs/utilities.R’
Error in obtain_crosswalk() : could not find function "obtain_crosswalk"
etc.

When I change to "labs/wk03/utilities.R", the global environment now contains cw, relevant_file, and RELEVANT_FILES, which makes me think it is working at least part of the way, but I get this error message:

Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") : cannot open file 'C:/Users/johan/RStudio-all/CPP-528/data/raw': Permission denied

I've been googling the permission denied warning, but it quickly gets over my head, with advice about changing Windows file permissions.

yukicruz commented 2 years ago

@Johaning,

Are the two raw harmonized census data set files unzipped and uploaded into your raw directory?

Link to zip files: https://watts-college.github.io/cpp-528-spr-2022/labs/lab-02/

Additionally, those files should be uploaded into the ...data/raw/ directory of your team's repo.
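A quick way to check this from the R console is sketched below. The warning quoted earlier in this thread names the data/raw folder itself rather than a file inside it, which would be consistent with the raw files not being where the script looks for them. The helper raw_files_present() is hypothetical (it is not part of utilities.R), and the .csv extension is an assumption about how the unzipped files are named.

```r
# Hypothetical helper, not part of utilities.R: TRUE only if the folder
# exists and contains at least one file matching the pattern.
# The ".csv" extension is an assumption about the unzipped raw files.
raw_files_present <- function(raw_dir, pattern = "\\.csv$") {
  if (!dir.exists(raw_dir)) {
    return(FALSE)
  }
  length(list.files(raw_dir, pattern = pattern, ignore.case = TRUE)) > 0
}

# Demonstration in a temporary folder (stand-in for your data/raw/ directory)
demo_raw <- file.path(tempfile("demo-data-"), "raw")
dir.create(demo_raw, recursive = TRUE)

empty_result <- raw_files_present(demo_raw)    # folder exists but is empty

file.create(file.path(demo_raw, "example_fullcount.csv"))  # made-up file name
filled_result <- raw_files_present(demo_raw)   # folder now holds one CSV
```

In your own project you would call something like raw_files_present(here::here("data/raw")). A FALSE result suggests the files still need to be unzipped and copied in.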

yukicruz commented 2 years ago

After you download utilities.R and project_data_steps.R, you should save both within labs/wk03/ per the instructions here.

Use fs::dir_tree() to help confirm the location of the above two files. Your .Rmd and .html files will eventually join the above two files in the same folder (labs/wk03/).
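If you would rather get a yes/no answer than read a tree, the sketch below checks for both scripts by name using base R. The paths are relative to the current working directory, so this assumes your working directory is the project root. The variable names are made up for illustration.

```r
# Assumption: the working directory is the project root.
# getwd() shows where R is currently looking.
print(getwd())

# The two scripts that should sit in labs/wk03/
lab_dir <- file.path("labs", "wk03")
expected_files <- file.path(lab_dir, c("utilities.R", "project_data_steps.R"))

# TRUE/FALSE for each expected file, labelled by its path
file_status <- file.exists(expected_files)
names(file_status) <- expected_files
print(file_status)

# Any paths listed here still need to be saved into labs/wk03/
missing_files <- expected_files[!file_status]
```

An empty missing_files means both scripts are where the lab expects them.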