Open AhmedRashwanASU opened 3 years ago
Hi @AhmedRashwanASU - so the first sub-question here is really asking for a proportion of a subset of the data. You've got this hot mess of tax parcels in your dataset, but you're really looking for a subset of commercial-only parcels.
Let's look at what you're doing now:
proportion <- mean(downtown$amtdelinqt > 0 & downtown$landuse == "Commercial" ,na.rm = TRUE )
proportion*100
[1] 6.426735
While this is pretty fire for early in your R coding career, you're not quite getting the right proportion, and that's because you're looking at a proportion out of all properties instead of commercial properties. That's because `downtown` here is checked against your conditions `== "Commercial"` and `> 0`, and `downtown` contains everything!
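To see the denominator issue on data anyone can run, here's a rough sketch using the built-in `mtcars` rather than your `downtown` data. The conditions `cyl == 4` and `mpg > 28` are just stand-ins for `landuse == "Commercial"` and `amtdelinqt > 0`, and the object names are made up for illustration.

```r
# Stand-in example with built-in mtcars: cyl == 4 plays the role of
# "Commercial" and mpg > 28 plays the role of "delinquent".

# Both conditions inside mean(): the denominator is all 32 cars
share_of_all <- mean(mtcars$mpg > 28 & mtcars$cyl == 4)

# Subset first, then test: the denominator is only the 11 four-cylinder cars
four_cyl <- mtcars[mtcars$cyl == 4, ]
share_of_subset <- mean(four_cyl$mpg > 28)

share_of_all     # 4 / 32 = 0.125
share_of_subset  # 4 / 11, about 0.364
```

Same numerator (4 cars) in both cases; only the denominator changes.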
Let's get weird. Let's say we want to create an entirely new dataset, and this will be a subset of the `downtown` dataset. How? Well, the same way we use assignment and set up conditional statements.
com_props <- downtown[downtown$landuse == "Commercial", ]
Recall that these brackets `[ ]` are powerful notation for subsetting data. Left of the comma (`[ here , ]`) are the rows you want to keep - in this case, all rows where `landuse` equals "Commercial". To the right of the comma (`[ , here]`) are the columns you want to keep. And we want to keep them all, so we leave it blank.
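One thing to watch for, sketched here on a made-up four-row data frame (the values are invented; only the column names mirror yours): if the column you test has missing values, the test returns `NA` for those rows, and `[ , ]` hands them back as all-`NA` rows. Wrapping the test in `which()` drops them. Whether `landuse` actually has missing values in `downtown` isn't shown in this thread, so treat this as a precaution.

```r
# Toy data: values are invented for illustration
toy <- data.frame(
  landuse    = c("Commercial", NA, "Parking", "Commercial"),
  amtdelinqt = c(100, 50, 0, 0)
)

# The NA in landuse makes the test NA, which comes back as a row of NAs
with_na <- toy[toy$landuse == "Commercial", ]

# which() returns only the positions where the test is TRUE
clean <- toy[which(toy$landuse == "Commercial"), ]

nrow(with_na)  # 3: two commercial rows plus one all-NA row
nrow(clean)    # 2
```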
We've stored this in object `com_props`. Now, you can find which rows/observations in `com_props` are tax delinquent and use `mean()` or some other such method to get to the right answer - you do the same for the second part of this question, as well!
As for printing a table as output - you've got the right method though you could get a bit more precise.
Say we wanted to pull crosstabs for `mtcars` based on `cyl` (number of cylinders) and `mpg` (miles per gallon) greater than 28.
table(mtcars$cyl, mtcars$mpg > 28)
     FALSE TRUE
  4      7    4
  6      7    0
  8     14    0
More or less the same thing - well, you could cast this `table()` output as a `data.frame` like so:
data.frame(table(mtcars$cyl, mtcars$mpg > 28))
Var1 Var2 Freq
1 4 FALSE 7
2 6 FALSE 7
3 8 FALSE 14
4 4 TRUE 4
5 6 TRUE 0
6 8 TRUE 0
Far out.
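Worth knowing, as far as I can tell: casting a table to a data frame turns both grouping variables into factors, so `Var2` holds the labels "FALSE" and "TRUE" rather than real logical values. Comparing it to `TRUE` still works through coercion, but comparing to the string is a bit more explicit. A quick check (the name `tab_df` is just for this sketch):

```r
# tab_df is a throwaway name used only in this example
tab_df <- data.frame(table(mtcars$cyl, mtcars$mpg > 28))

class(tab_df$Var1)   # "factor"
class(tab_df$Var2)   # "factor"
levels(tab_df$Var2)  # "FALSE" "TRUE"

# Both comparisons select the same rows
identical(tab_df$Var2 == TRUE, tab_df$Var2 == "TRUE")
```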
Let's name it for convenience.
Now we can use the same trick we did with the above subsetting (using brackets) to say what rows and columns we want to keep. Remember that in `[ , ]`, rows you want to keep are indicated to the left of the comma and columns are to the right.
car_tabs <- data.frame(table(mtcars$cyl, mtcars$mpg > 28))
car_tabs[car_tabs$Var2 == TRUE, ]
Var1 Var2 Freq
4 4 TRUE 4
5 6 TRUE 0
6 8 TRUE 0
Here, we're only keeping rows where the second variable is equal to `TRUE`. Let's get rid of that column now that we don't need it.
car_tabs2 <- car_tabs[car_tabs$Var2 == TRUE, ]
car_tabs2[ , c(1, 3)]
Var1 Freq
4 4 4
5 6 0
6 8 0
There we have it - a bit of a neater table. We could just go nuts for the hell of it.
car_tabs3 <- car_tabs2[ , c(1, 3)]
colnames(car_tabs3) <- c("Cylinders", "Count")
rownames(car_tabs3) <- NULL
car_tabs3
Cylinders Count
1 4 4
2 6 0
3 8 0
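If all you want is the TRUE counts, a shorter route (one option among several, not the only right way) is to index the table itself by column name. The `dnn` argument to `table()` names the dimensions up front; the labels `Cylinders` and `Over28` and the object names are ones I made up for this sketch.

```r
# Name the two dimensions when building the table
tab <- table(mtcars$cyl, mtcars$mpg > 28, dnn = c("Cylinders", "Over28"))

# Pull the TRUE column directly; names are the cylinder values
over_28 <- tab[ , "TRUE"]
over_28

# Same information as a two-column data frame
counts <- data.frame(Cylinders = names(over_28), Count = as.vector(over_28))
counts
```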
Hope this helps!
The explanation and expansion for creating a table were incredibly useful! I was able to follow along and use it in my assignment until the very end. I'm not going to include it in my assignment but I enjoyed the practice!
@jamisoncrawford Thank you for the detailed explanation of this topic. I hope I didn't mess up the correct way to find the final answers.
The first answer is tax-delinquent commercial properties over all commercial properties
Tax_Delinquent_Commercial <- downtown[downtown$landuse == "Commercial", ]
proportion <- mean(Tax_Delinquent_Commercial$amtdelinqt > 0 ,na.rm = TRUE )
proportion*100
[1] 11.96172
The second answer is the tax dollars owed by commercial properties (a subset) over all tax dollars owed
Tax_dollars_Commercial <- downtown[downtown$landuse == "Commercial", ]
proportion_Tax <- sum(Tax_dollars_Commercial$amtdelinqt ,na.rm = TRUE )
sum_downtown <-sum(downtown$amtdelinqt, na.rm = TRUE)
proportion_Tax/sum_downtown*100
[1] 86.95747
@sjone128 super glad you read this and found it helpful!
@AhmedRashwanASU you got it! But I was trying to be somewhat vague as to not give away the answer 🤣. No worries - if folks read the thread, they will learn nearly just as well, I think.
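A quick sanity check, using only the counts from the `table()` output in the original post (Commercial: 184 not delinquent, 25 delinquent; 389 parcels counted across all land uses, which is my own sum of that table): both percentages in this thread fall out of simple division. The variable names are just for illustration.

```r
# Counts copied from the land-use table in the original post
commercial_delinquent     <- 25
commercial_not_delinquent <- 184
all_parcels_counted       <- 389  # sum of every cell in that table

# Delinquent commercial parcels over commercial parcels only
pct_of_commercial <- commercial_delinquent /
  (commercial_delinquent + commercial_not_delinquent) * 100

# Delinquent commercial parcels over every parcel in the table
pct_of_all <- commercial_delinquent / all_parcels_counted * 100

round(pct_of_commercial, 5)  # 11.96172
round(pct_of_all, 6)         # 6.426735
```

The first figure matches the subset-then-`mean()` result, and the second matches the original all-in-one `mean()` call, which is a decent sign the only difference was the denominator.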
Not sure if I'm solving this in a proper way. Are there any tips that can help me get more accurate results?
Question I: What proportion of commercial properties are delinquent on taxes?
Question II: What proportion of delinquent tax bills are owed by commercial parcels?
Use function: 'mean()' Use variable: 'amtdelinqt' Use variable: 'landuse'
The first answer is tax-delinquent commercial properties over all commercial properties
(Answer )
proportion <- mean(downtown$amtdelinqt > 0 & downtown$landuse == "Commercial" ,na.rm = TRUE )
proportion*100
[1] 6.426735
(Answer )
The second answer is the tax dollars owed by commercial properties (a subset) over all tax dollars owed
sum(downtown$amtdelinqt > 0 & downtown$landuse == "Commercial" ,na.rm = TRUE )
[1] 25
**Question 8: Tax Delinquent Parcels by Land Use
Question: How many of each land use type are delinquent on taxes? Print a table of your results.**
Use function: 'table()' Use variable: 'amtdelinqt' Use variable: 'landuse'
(Answer )
table(downtown$landuse , downtown$amtdelinqt > 0 )
                     FALSE TRUE
  Apartment              6    0
  Commercial           184   25
  Community Services    15    2
  Industrial             2    2
  Parking               62   16
  Parks                  8    0
  Recreation             5    0
  Religious              6    0
  Schools                4    0
  Single Family          1    0
  Utilities              6    0
  Vacant Land           33   12