DS4PS / course_website

https://ds4ps.github.io/course_website/

Lab 11. Adding map tile not working correctly #22

Open BissKuttner opened 5 years ago

BissKuttner commented 5 years ago

When I plot only my variable of interest (for example, pedestrian.cyclist), I get a map that looks reasonable. When I add the map tiles, I don't think it is still plotting my variable of interest, because the map looks the same for different variables. For accidents involving pedestrians or cyclists I have:

d2 <-
  dat %>%
  filter(pedestrian.cyclist == TRUE)
# {r, fig.width=4, fig.height=8}
par( mar=c(0,0,4,0) )

plot( d2$Longitude, d2$Latitude, pch=19, 
      cex=0.25*d2$pedestrian.cyclist, 
      col=alpha( "firebrick", alpha=0.5),
      main="Traffic Accidents involving pedestrians and cyclists Tempe, AZ",
      xlab="", ylab="",
      bty="n", axes=F )
# I don't think this is plotting the same data points as the map of pedestrian or cyclist accidents above.
qmplot( Longitude, Latitude, data=d2, maptype="toner-lite", zoom=14, 
        size=I(point.size), color=I("firebrick"), alpha = I(0.1) )
lecy commented 5 years ago

How are you defining the variable pedestrian.cyclist?

The main difference is that you are scaling the two maps using different variables. In the first case you are using:

cex=0.25*d2$pedestrian.cyclist

In the second case you are using:

size=I(point.size)

The point.size variable comes from your original dataset (the number of injuries or fatalities), so it behaves differently: any accident involving a pedestrian or cyclist that did not result in an injury or fatality gets a size of zero, which makes it invisible.
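
A minimal sketch of that difference, using a made-up three-row data frame. The values are hypothetical, and point.size is only assumed here to hold the injury or fatality count per accident:

```r
# Toy data: hypothetical values, not taken from the Tempe crash dataset.
toy <- data.frame(
  pedestrian.cyclist = c(TRUE, TRUE, TRUE),
  point.size         = c(0, 2, 0)   # assumed: injuries or fatalities per accident
)

# Scaling by the logical flag: TRUE is coerced to 1, so every row gets cex = 0.25.
cex.flag <- 0.25 * toy$pedestrian.cyclist

# Count how many rows would be drawn with a non-zero size under each scaling.
visible.flag <- sum( cex.flag > 0 )
visible.size <- sum( toy$point.size > 0 )

print( c( flag = visible.flag, size = visible.size ) )
```

All three rows keep a non-zero size when scaled by the flag, but only one does when scaled by point.size, which is why the two maps can look different even though they use the same rows.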

BissKuttner commented 5 years ago

Your answer is confusing to me. Maybe I am not creating my variables correctly. I have not accounted for whether there are injuries or fatalities, as the directions seemed to suggest we were looking at accidents whether or not there are injuries or fatalities.

Are the variables Unittype_One and Unittype_Two only indicating the unit type when there is an injury or fatality? That was not clear when I looked at the variable description.

Here is how I was defining pedestrian.cyclist, and also a variable for accidents involving cars (driver and driverless):

dat <-
  dat%>%
  mutate(pedestrian.cyclist = Unittype_One == "Pedestrian"|
           Unittype_One == "Pedalcyclist" |
           Unittype_Two == "Pedestrian" |
           Unittype_Two == "Pedalcyclist",
         car = Unittype_One == "Driver" |
           Unittype_One == "Driverless" |
           Unittype_Two == "Driver" |
           Unittype_Two == "Driverless")

count( dat, pedestrian.cyclist, car)
lecy commented 5 years ago

Your first statement looks fine:

pedestrian.cyclist = Unittype_One == "Pedestrian"|
           Unittype_One == "Pedalcyclist" |
           Unittype_Two == "Pedestrian" |
           Unittype_Two == "Pedalcyclist"

It should be equivalent to:

drv.ped = ( Unittype_One %in% c("Pedalcyclist", "Pedestrian") | 
            Unittype_Two %in% c("Pedalcyclist", "Pedestrian") )
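
One caveat, sketched below with made-up values: the two forms can differ when a unit type is missing. Whether the real data stores a missing second unit as NA is an assumption here; the vectors one and two are hypothetical stand-ins for Unittype_One and Unittype_Two.

```r
# Hypothetical unit types; NA stands in for a crash with no second unit listed.
one <- c("Pedestrian", "Driver", "Driver")
two <- c("Driver", "Pedalcyclist", NA)

# Chain of == comparisons joined with |
by.equals <- one == "Pedestrian" | one == "Pedalcyclist" |
             two == "Pedestrian" | two == "Pedalcyclist"

# The same test written with %in%
by.in <- one %in% c("Pedalcyclist", "Pedestrian") |
         two %in% c("Pedalcyclist", "Pedestrian")

print( by.equals )   # third element is NA, because NA == "..." is NA
print( by.in )       # third element is FALSE, because %in% never returns NA
```

Since filter() drops rows where the condition is NA, both versions should keep the same rows when used for filtering; the difference only shows up if the variable is used directly, for example in count().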

The second statement, though, is different than the statement above:

drv.drv = ( Unittype_One == "Driver" & Unittype_Two == "Driver" )

This is for two reasons. First, there is a category you are not considering: cases with nothing listed for driver 2, which are single-car crashes. Second, the statement above uses the & operator, and you are using the | operator.

The & operator creates a narrow criterion: the statement captures only cases where both units in the accident were drivers, so accidents between two cars (no single-driver accidents, and no accidents with cyclists or pedestrians) with humans in control (no driverless vehicles).

In your case, since you defined car as a series of OR statements, it will include cases with driver1 as a driver and driver2 as a cyclist or pedestrian (since OR means that either driver1 OR driver2 has to be a driver), so it will capture essentially all of the cases.

car = Unittype_One == "Driver" |
           Unittype_One == "Driverless" |
           Unittype_Two == "Driver" |
           Unittype_Two == "Driverless"

You can use the narrow statement above (driver1 & driver2 are "drivers"), or say things like (pseudocode only):

driver1 == "driver" & ! ( driver2 %in% c("pedestrian","cyclist") )
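
A small sketch comparing the three definitions on a made-up data frame. The rows are hypothetical, and the empty string standing in for a missing second unit is an assumption about how single-car crashes are coded:

```r
# Hypothetical unit types; "" stands in for no second unit (single-car crash).
toy <- data.frame(
  Unittype_One = c("Driver", "Driver", "Driver", "Pedestrian"),
  Unittype_Two = c("Driver", "Pedestrian", "", "Driver"),
  stringsAsFactors = FALSE
)

vulnerable <- c("Pedestrian", "Pedalcyclist")

# Broad: either unit is a driver (OR)
car.or  <- toy$Unittype_One == "Driver" | toy$Unittype_Two == "Driver"

# Narrow: both units are drivers (AND)
car.and <- toy$Unittype_One == "Driver" & toy$Unittype_Two == "Driver"

# In between: first unit is a driver and the second is not a pedestrian or cyclist
car.not <- toy$Unittype_One == "Driver" & ! toy$Unittype_Two %in% vulnerable

# Number of rows each definition captures
print( colSums( cbind( car.or, car.and, car.not ) ) )
```

The OR version captures every row, the AND version only the two-car crash, and the negated version also keeps the single-car crash.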
lecy commented 5 years ago

To answer your original question regarding plot() vs qmplot() differences, simply replace:

size=I(point.size)

With:

size=I(pedestrian.cyclist)

E.g.:

qmplot( Longitude, Latitude, data=d2, maptype="toner-lite", zoom=14, 
        size=I(pedestrian.cyclist), color=I("firebrick"), alpha = I(0.1) )
BissKuttner commented 5 years ago

I have installed the code you sent into R, installed Rtools 3.5, had to reinstall all my packages, and now the map tile is no longer working at all. (It was before).

After running code that successfully plots pedestrian.cyclist without map tiles, I am now getting the error message below when I run this code to drop the map tiles behind the plot:

qmplot( Longitude, Latitude, data=d2, maptype="toner-lite", zoom=14, 
        size=I(pedestrian.cyclist), color=I("firebrick"), alpha = I(0.1) )

ERROR MESSAGE:
Error in structure(x, class = unique(c("AsIs", oldClass(x)))) : object 'pedestrian.cyclist' not found

Suggestions???

lecy commented 5 years ago

I am guessing it's because the example names the new dataset "d2", and you kept the original name "dat". Try:

dat <-
  dat %>%
  mutate( pedestrian.cyclist = Unittype_One == "Pedestrian"|
           Unittype_One == "Pedalcyclist" |
           Unittype_Two == "Pedestrian" |
           Unittype_Two == "Pedalcyclist" )

qmplot( Longitude, Latitude, data=dat, maptype="toner-lite", zoom=14, 
        size=I(pedestrian.cyclist), color=I("firebrick"), alpha = I(0.1) )

This part of the error is the important part:

object 'pedestrian.cyclist' not found
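
A quick way to check which data frame actually holds the column, sketched with toy stand-ins for dat and d2 (the contents are hypothetical; only the names match the thread):

```r
# Toy stand-ins: hypothetical coordinates, same object names as in the thread.
dat <- data.frame( Longitude = c(-111.93, -111.94), Latitude = c(33.42, 33.41) )
d2  <- dat   # copy made before the new column exists

# Add the column to dat only
dat$pedestrian.cyclist <- c(TRUE, FALSE)

# Check each data frame for the column by name
in.dat <- "pedestrian.cyclist" %in% names(dat)
in.d2  <- "pedestrian.cyclist" %in% names(d2)

print( c( dat = in.dat, d2 = in.d2 ) )
```

If the column is missing from the data frame passed to data=, recreating it there (or passing the data frame that has it) is the first thing to try.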

BissKuttner commented 5 years ago

I restarted the entire lab. So far so good.

BissKuttner commented 5 years ago

Is there a way to put the facet_wrap map for my variable cyl.ped next to the facet map of drv.drv? I cannot figure that out.

lecy commented 5 years ago

I don't understand the question.

The facet_wrap() component of the graph accepts a factor as the primary argument, and constructs the same graph for data from each group defined by the factor. cyl.ped and drv.drv are just two levels from the same factor (if defined as mutually exclusive), so they should be represented by TRUE and FALSE.

You can define the group variable (the factor) however you want, and facet_wrap() will plot one graph for each level. You can also plot two separate factors together:

facet_wrap( ~ age.groups + time.groups )

But it won't make sense if the two variables already define mutually exclusive groups, for example:

short <- height < 60     # height in inches
tall  <- height >= 60

The combinations TRUE, TRUE and FALSE, FALSE would have no data.
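
A minimal sketch of why those combinations are empty, using made-up heights (in inches) and a base R cross-tabulation rather than a plot:

```r
# Hypothetical heights in inches.
height <- c(55, 58, 62, 66, 70)

short <- height < 60
tall  <- height >= 60

# Cross-tabulate the two logical groupings, keeping both levels of each
# so that empty combinations show up as zero counts.
counts <- table( short = factor(short, levels = c(FALSE, TRUE)),
                 tall  = factor(tall,  levels = c(FALSE, TRUE)) )

print( counts )
```

Only two of the four combinations contain any observations, so faceting on both variables together adds nothing over faceting on one of them.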

BissKuttner commented 5 years ago

OK. I think I understand that.