Open buzwells opened 8 years ago
violation.data <- read.csv("C:/Users/dwanc_000/Documents/aRockhurst/Project - Property Violations/Property_Violations.csv")
violation.data.closed <- subset(violation.data, Status == "Closed") # closed violations only
violation.data.closed_over60 <- subset(violation.data.closed, Days.closed > 60) # closed violations with Days.closed over 60
violation.data.closed_over60_under500 <- subset(violation.data.closed_over60, Days.closed < 500) # closed violations with Days.closed over 60 and under 500
violation.data.closed_over60_under500 <- subset(violation.data.closed_over60_under500, select = -c(Property.Violation.ID, Case.ID, Status, Case.Closed.Date, Ordinance.Number, Ordinance.Chapter)) # drop columns not needed for analysis
# Delete rows with NA in any key variable. The row mask is computed on the same data frame that is being filtered, so the mask and the rows stay aligned.
key.vars <- c("Days.closed", "Violation.Code", "Violation.Description", "Violation.Entry.Date", "Address", "County", "State", "Zip.Code", "KIVA.PIN", "Council.District", "Police.Patrol.Area", "Inspection.Area", "Code.Violation.Location", "Neighborhood")
# TODO: Case.closeded.Date is not a valid column name and is left out of key.vars; add the intended date column if it should be checked.
violation.data.closed_over60_under500 <- violation.data.closed_over60_under500[complete.cases(violation.data.closed_over60_under500[, key.vars]), ]
write.csv(violation.data.closed_over60_under500, file = "C:/Users/dwanc_000/Documents/aRockhurst/Project - Property Violations/closed_violations_over60_under500.csv") # forward slashes: "\U" in an R string is read as an escape sequence
Reviving this issue. Based on a discussion at our April 11, 2016 meeting, we will start loading the raw data by reading it from the open data site using the RSocrata package that Eric demonstrated. We will also scrub the data for obvious problems (for instance, missing data). Subsequent steps, such as identifying and eliminating outliers and joining the GEOID, are covered in separate issues. The group also agreed to store the data in rdf format for the sake of efficiency and compactness.
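For reference, the scrubbing and storage steps described above might look roughly like the sketch below. This is only an illustration, not the group's agreed script: `scrub_violations` is a hypothetical helper name, the toy data frame stands in for the real violations table, and reading "rdf" as R's RDS serialization (`saveRDS`/`readRDS`) is an assumption on my part.

```r
# Hypothetical helper: keep only rows with no NA in the listed key columns.
scrub_violations <- function(df, key_vars) {
  # Fail loudly if a requested column is absent (e.g. a misspelled name).
  missing_cols <- setdiff(key_vars, names(df))
  if (length(missing_cols) > 0) {
    stop("Columns not found: ", paste(missing_cols, collapse = ", "))
  }
  df[complete.cases(df[, key_vars, drop = FALSE]), , drop = FALSE]
}

# Toy data standing in for the real violations table.
toy <- data.frame(
  Violation.Code = c("A1", NA, "C3"),
  Zip.Code = c(64110, 64111, NA),
  stringsAsFactors = FALSE
)
clean <- scrub_violations(toy, c("Violation.Code", "Zip.Code"))

# Assumption: "rdf" is read here as R's RDS serialization format.
out_file <- tempfile(fileext = ".rds")
saveRDS(clean, out_file)
restored <- readRDS(out_file)
```

Checking for missing column names up front means a typo in a key variable stops the script instead of silently returning an empty data frame.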
I can take a stab at this, as I've got scripts in my own workspace that already do much of this.
Created this pull request re this issue: https://github.com/codeforkansascity/Property-Violations-Settlement/pull/53
Recommend using the RSocrata package by Chicago. It's available on CRAN. https://github.com/Chicago/RSocrata
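A minimal sketch of what loading with RSocrata could look like, in case it helps. `load_violations` is a hypothetical wrapper name, and the URL in the usage comment is a placeholder rather than the real dataset address; `read.socrata()` itself is the package's reader function.

```r
# Hypothetical wrapper around RSocrata::read.socrata().
load_violations <- function(dataset_url, app_token = NULL) {
  # Check for the package at call time so this file can be sourced without it.
  if (!requireNamespace("RSocrata", quietly = TRUE)) {
    stop("Install RSocrata first: install.packages(\"RSocrata\")")
  }
  # read.socrata() downloads the dataset and returns a data frame.
  RSocrata::read.socrata(dataset_url, app_token = app_token)
}

# Usage (placeholder URL, substitute the dataset's actual address):
# violation.data <- load_violations("https://<open-data-host>/resource/<dataset-id>.json")
```

Column names returned through the API may differ from those in an exported CSV, so it is worth checking `names(violation.data)` before reusing column-based filters.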