Watts-College / paf-513-template

https://watts-college.github.io/paf-513-template/

Homework code help #5

Open tnt2501 opened 10 months ago

tnt2501 commented 10 months ago

@JasonSills Good afternoon,

I am currently struggling with the lab, starting at question 6. This is my current code:

proportion_delinquent <- mean(downtown$amtdelingt > 0, na.rm = TRUE)

cat("Proportion of parcels with delinquent tax payments:", proportion_delinquent, "\n")

I am getting an answer of NaN. Same thing for question 7.

Do you have any guidance or insight on what I can do differently to correct it? Thank you in advance!

JasonSills commented 10 months ago

@tnt2501

For #6 you only need to use the mean function:


mean(downtown$amtdelinqt != 0,
     na.rm = TRUE)                         
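
A likely cause of the NaN, sketched on made-up data: the code in the question spells the column amtdelingt (with a "g"), while the code above uses amtdelinqt (with a "q"). Asking a data frame for a column that does not exist returns NULL, and the mean of an empty logical vector is NaN. The toy `downtown` below is hypothetical and only borrows the column names from this thread.

```r
# Hypothetical stand-in for the lab's 'downtown' data frame
downtown <- data.frame(
  landuse    = c("Commercial", "Residential", "Commercial", "Vacant"),
  amtdelinqt = c(1200, 0, 0, 350)
)

misspelled <- downtown$amtdelingt                  # no such column, so this is NULL
nan_result <- mean(misspelled > 0, na.rm = TRUE)   # NULL > 0 is logical(0); its mean is NaN

ok_result <- mean(downtown$amtdelinqt > 0, na.rm = TRUE)  # correct spelling

is.null(misspelled)
nan_result
ok_result
```

Running `names(downtown)` on the real data is a quick way to check the exact spelling of each column.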

For question 7 you need to first create a subset of commercial properties then find the proportion that are delinquent on their taxes:


index <- downtown$landuse == "Commercial"   # Create logical vector of 'TRUE' if "Commercial"

com_prop <- downtown[index, ]               # Keep the rows of 'downtown' where index is TRUE - note the comma after "index" inside []: it selects rows and keeps all columns

mean(com_prop$amtdelinqt != 0,
     na.rm = TRUE)                          #now you run the same function as #6 on the subset you've created
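
The two steps above can be tried on a small made-up data frame before running them on the lab data. This is only a sketch: the toy `downtown` is hypothetical and shares nothing with the real data except the column names, and the one-expression version is an equivalent alternative, not the lab's required form.

```r
# Hypothetical toy data; only the column names match the lab
downtown <- data.frame(
  landuse    = c("Commercial", "Commercial", "Commercial", "Residential", "Vacant"),
  amtdelinqt = c(500, 0, 0, 250, 0)
)

index    <- downtown$landuse == "Commercial"   # TRUE for commercial rows
com_prop <- downtown[index, ]                  # keep only those rows

# Proportion of commercial parcels that owe back taxes
prop_two_step <- mean(com_prop$amtdelinqt != 0, na.rm = TRUE)

# Same calculation in one expression, subsetting the column directly
prop_one_step <- mean(downtown$amtdelinqt[index] != 0, na.rm = TRUE)

prop_two_step   # 1 of the 3 commercial parcels is delinquent
```
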
tnt2501 commented 10 months ago

Thank you!

swest235 commented 10 months ago

@JasonSills could you help me resolve the second part of Q7? I can't seem to get this right. I've tried some different things but this is where I am at so far...it is giving me 86%, that seems WAY too high. I think I am overcomplicating this.

Question 7, part two:

sum(downtown$amtdelinqt)
[1] 5045969
commercial_properties <- downtown[downtown$landuse == "Commercial", ]
total_delinquent_amount <- sum(commercial_properties$amtdelinqt)
sum(commercial_properties$amtdelinqt)
[1] 4387847
total_delinquent_amount/sum(downtown$amtdelinqt)
[1] 0.8695747

JasonSills commented 10 months ago

@swest235 Part II is exactly like Part I, but in reverse. In Part I we want to know: of those that are commercial, what proportion are delinquent? In this one we want to know: of those that are delinquent, what proportion are commercial? So we can use the same code as Part I, but we need to swap the variables:


index <- downtown$amtdelinqt != 0           # Create logical vector of 'TRUE' if taxes owed != 0 - this finds all those that are delinquent. 

dlq_prop <- downtown[index, ]               # Index by row position and subset from 'downtown' - this creates the subset with all rows that are delinquent. 

mean(dlq_prop$landuse == "Commercial",
     na.rm = TRUE)                          # Use the new subset to find the proportion of delinquent parcels that are commercial.
swest235 commented 10 months ago

@JasonSills doesn't this just give the proportion of tax bills owed by commercial properties over tax bills owed in general?

Ultimately it gives us 25/57; but the answer asks for dollars which would be a different answer, wouldn't it?

JasonSills commented 10 months ago

I see the confusion. Question II is very clear, but the language for the answer is not. Question II: What proportion of delinquent tax bills are owed by commercial parcels? However, the text for the answer reads proportion of tax dollars, rather than bills. Focus on the question for this one and don't answer with the dollar amount. Sorry for the confusion.

Question 7: Tax Delinquent Commercial Properties

Question I: What proportion of commercial properties are delinquent on taxes?

Question II: What proportion of delinquent tax bills are owed by commercial parcels?


Answer I: [X]% of commercial properties are delinquent on taxes.

Answer II: [X]% of delinquent tax dollars are owed by commercial parcels.
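
To see how the two readings can differ, here is a minimal sketch on made-up numbers. The toy `downtown` is hypothetical (only the column names come from this thread), and it is built so that one large commercial bill dominates the dollars while most delinquent bills are residential.

```r
# Hypothetical toy data; only the column names match the lab
downtown <- data.frame(
  landuse    = c("Commercial", "Commercial", "Residential", "Residential", "Vacant"),
  amtdelinqt = c(9000, 0, 500, 500, 0)
)

delinquent <- downtown$amtdelinqt != 0          # parcels that owe anything
commercial <- downtown$landuse == "Commercial"  # commercial parcels

# Bills: of the delinquent parcels, what share is commercial? (a count)
share_of_bills <- mean(commercial[delinquent], na.rm = TRUE)

# Dollars: of all delinquent dollars, what share is owed by commercial parcels?
share_of_dollars <- sum(downtown$amtdelinqt[commercial], na.rm = TRUE) /
  sum(downtown$amtdelinqt, na.rm = TRUE)

share_of_bills     # 1 of 3 delinquent bills
share_of_dollars   # 9000 of 10000 delinquent dollars
```

The first value counts parcels and the second weights each parcel by the amount owed, so the two need not agree.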

swest235 commented 10 months ago

This makes much more sense, I see. Can you let me know if I was at all on the right track, at least for answering the question in terms of dollars as opposed to bills? It is the code in my original question. More out of curiosity now.

JasonSills commented 10 months ago

@swest235,

So this 86% number is correct for dollars. Why is it so high? I assume it is because commercial properties carry much higher dollar values than non-commercial ones. This week we have focused on logical vectors, which is why we are looking at counts. But try the crosstab code below as a way to check your math and summarize continuous data.

There are many ways to create crosstabs in R, but here is a basic one. This will return the dollar amount of amtdelinqt broken down by land use.

xtabs(downtown$amtdelinqt ~ downtown$landuse)

What about percentage? Try this:

prop.table(xtabs(downtown$amtdelinqt ~ downtown$landuse))
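
As a sketch of what these calls return, the same crosstabs can be run on a small made-up data frame. The toy `downtown` is hypothetical, and the `data =` form of the formula and the rounding to percentages are optional conveniences, not part of the lab.

```r
# Hypothetical toy data; only the column names match the lab
downtown <- data.frame(
  landuse    = c("Commercial", "Commercial", "Residential", "Vacant"),
  amtdelinqt = c(6000, 2000, 1500, 500)
)

# Delinquent dollars summed within each land use
dollars_by_use <- xtabs(amtdelinqt ~ landuse, data = downtown)

# The same table as percentages of all delinquent dollars
pct_by_use <- round(100 * prop.table(dollars_by_use), 1)

# For comparison: counts of delinquent bills by land use
bills_by_use <- table(downtown$landuse[downtown$amtdelinqt != 0])

dollars_by_use
pct_by_use
bills_by_use
```

Passing `data = downtown` lets the formula use bare column names, which also gives the table a cleaner label than `downtown$landuse`.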