Lab 1, Step 3 - Githubissues

I seem to be running into a problem with step 3 in our first lab. Here is the code that I have come up with to try to pick a door to open:

open.goat.door <- function( game, a.pick ) {
  doors <- c( 1:3 )
  goat.to.door <- doors[ game == "goat" ]
  opened.door.possible <- goat.to.door[ goat.to.door != a.pick ]
  opened.door <- sample( opened.door.possible, size = 1 )
  return( opened.door )
}

The problem that I have run into is that the function sometimes opens the same door that the contestant chose.

I put quotes around the numbers 1, 2, and 3 for doors, making them character strings instead of integers. It seems to have done the trick. I would still appreciate hearing about any flaw in the current section of code that should be obvious to me.

The code looks fine. You could combine two steps into a compound logical statement:

game != "car" & doors != a.pick

Just need to remember the rules for combining cases. What would this evaluate to?

c( TRUE, FALSE, TRUE ) & c( TRUE, TRUE, FALSE )

# game is a character vector e.g. c("goat","car","goat")
# a.pick is a number between 1 and 3 

open.goat.door <- function( game, a.pick ) {
  doors <- c( 1:3 )
  # can't open a door with a car or the current selection 
  available.doors <- doors[ game != "car" & doors != a.pick ]
  opened.door <- sample( available.doors, size = 1 )
  return( opened.door )
}

The important thing here will be your unit tests. Try these out to see if everything is working.

game <- c("goat","car","goat")
a.pick <- 3
open.goat.door( game, a.pick )   # should be 1

game <- c("goat","car","goat")
a.pick <- 2
open.goat.door( game, a.pick )   # should be 1 or 3

game <- c("goat","car","goat")
a.pick <- 1
open.goat.door( game, a.pick )   # should be 3
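Because the function draws at random, a single call can pass by luck. A minimal sketch of a repeated check follows, assuming the function as posted above; count_violations is a hypothetical helper name, not part of the lab.

```r
# Function under test, copied from the version posted above.
open.goat.door <- function( game, a.pick ) {
  doors <- c( 1:3 )
  available.doors <- doors[ game != "car" & doors != a.pick ]
  opened.door <- sample( available.doors, size = 1 )
  return( opened.door )
}

# Hypothetical helper: call f() n times and count rule violations.
count_violations <- function( f, game, a.pick, n = 1000 ) {
  opened <- replicate( n, f( game, a.pick ) )
  c( opened.pick = sum( opened == a.pick ),          # host opened the contestant's door
     opened.car  = sum( game[ opened ] == "car" ) )  # host revealed the car
}

game <- c("goat","car","goat")
count_violations( open.goat.door, game, a.pick = 2 )   # both counts are 0
```

Repeating the call with a.pick = 1 and a.pick = 3 covers the other two test cases.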

Hi professor,

I have not quite understood what you proposed about the compound logical statement.

 available.doors <- doors[ game != "car" & doors != a.pick ]

Although this compound logical statement makes sense by itself, I fail to see the association between the door selection (for both the host and the contestant) and the game result. What puzzles me is how the first part of the condition, game != "car", controls the door selection for the host. My solution is below and it seems to work:

open_goat_door <- function( game, a.pick ) {
  doors <- c(1,2,3)

  # get the 2 non-car doors (with goats)
  goat.door <- doors[ game != "car" ]

  # get the noncontestant-selected door for the host
  goat.noncontestant.door <- goat.door[ goat.door != a.pick ]

  # get the door randomly sampled
  opened.door <- sample( goat.noncontestant.door, size = 1 )

  return( opened.door )
}

Could you please elaborate more during the discussion session on Monday? Many thanks in advance!

@lghb2005

A couple of things here. First, think about game design as a behind-the-scenes producer of the show. You are correct that each step of the game is related to an action by a different player. The producer creates a new game in step 1. The contestant picks a door in step 2. The host opens a door in step 3.

But at each step the main question is what information would the person NEED to make the decision, and what information do they NOT have? That will determine which arguments the function should have. A contestant step, for example, would never need the game vector because the contestant would not use that information to make a decision.

For example, the host needs to know where the car is for step 3 so that he does not accidentally open the door with the car. He can't simply randomly select one of the two closed doors. In reality, he would probably have the car door number written on his card at the beginning of the game so he knows which door he can open. But he is applying the logic: "I won't open the door the contestant selected, and I can't reveal the car at this step."
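The "who knows what" reasoning above can be sketched as function signatures. The names create_game, select_door, and host_options are hypothetical and only illustrate which arguments each step would take; the host step here stops at listing the eligible doors.

```r
# Step 1: the producer sets up the game and needs no input.
create_game <- function() {
  sample( c("goat","goat","car"), size = 3, replace = FALSE )
}

# Step 2: the contestant picks blindly, so there is no game argument.
select_door <- function() {
  sample( 1:3, size = 1 )
}

# Step 3: the host must know the car location AND the contestant's pick.
# Only the eligible doors are computed; choosing among them is left out.
host_options <- function( game, a.pick ) {
  doors <- 1:3
  doors[ game != "car" & doors != a.pick ]
}

game <- create_game()
a.pick <- select_door()
host_options( game, a.pick )
```

The argument lists mirror the information each player has: none for the producer and contestant, both game and a.pick for the host.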

You need to be very careful here because you are changing your vector sizes, which creates additional complexity in your code moving forward. Your solution works here, but it is very close to creating a serious problem related to recycling rules in R. The game vector, doors vector, and all logical statements are of length 3, so working with these together will ensure there are no recycling errors. Once you shorten a vector, though, you create the possibility of recycling errors that occur when vectors are of different lengths.

> goat.doors <- doors[ game != "car" ]
> goat.doors 
[1] 1 3

Note here how logical statements work. Each statement returns a vector of the same length, with either a TRUE or FALSE for each original item in the vector:

> doors <- c(1,2,3)
> game <- c("goat","car","goat")
> a.pick <- 3
> 
> doors != a.pick
[1]  TRUE  TRUE FALSE
> game != "car"
[1]  TRUE FALSE  TRUE

Note that compound statements will combine multiple criteria and still return a vector of the same length:

> doors != a.pick   &   game != "car"
[1]  TRUE FALSE FALSE

To see what's happening, you just need to recognize that a compound logical statement requires every criterion to be true in order to return TRUE for a specific value (door 1 in this case has not been selected yet and doesn't contain a car).

> criteria.1 <- doors != a.pick
> criteria.2 <- game != "car"
> join.criteria <- criteria.1 & criteria.2
> cbind( doors, game, a.pick, criteria.1, criteria.2, join.criteria )
     doors game   a.pick criteria.1 criteria.2 join.criteria
[1,] "1"   "goat" "3"    "TRUE"     "TRUE"     "TRUE"       
[2,] "2"   "car"  "3"    "TRUE"     "FALSE"    "FALSE"      
[3,] "3"   "goat" "3"    "FALSE"    "TRUE"     "FALSE"

So the host can determine which doors CAN be opened simply by asking which doors meet BOTH criteria:

> doors <- c(1,2,3)
> game <- c("goat","car","goat")
> a.pick <- 3
> 
> doors[ doors != a.pick & game != "car" ]
[1] 1
> # explicit orders of operation
> doors[ ( doors != a.pick )  & ( game != "car" )  ]
[1] 1
> 
> doors <- c(1,2,3)
> game <- c("goat","car","goat")
> a.pick <- 2
> 
> doors[ doors != a.pick & game != "car" ]
[1] 1 3

The reason I would caution against changing vector lengths DURING THE GAME LOGIC STEPS is the possibility of introducing recycling errors. Again, your solution above works, but if the second criterion required using something other than door numbers you would be stuck. The compound logic version is much more robust, and in the long run getting comfortable with logical statements will benefit you, since they are the means of translating a human-language question into computer code.

> x <- c(1,2,3)
> 
> x[ c(TRUE,FALSE,TRUE) ]  # want door 1 and 3
[1] 1 3
> 
> x[ c(TRUE,FALSE) ]  # recycling with 3 doors and incomplete selector
[1] 1 3
> x[ c(FALSE,TRUE) ]  # recycling with 3 doors and incomplete selector
[1] 2
> 
> 
> # using selection  
> # after shortening your vector 
> 
> x <- c(1,3)
> 
> x[ c(TRUE,TRUE,FALSE) ]  # want doors 1 and 2
[1] 1 3
> x[ c(TRUE,FALSE,TRUE) ]  # want doors 1 and 3
[1]  1 NA
> x[ c(FALSE,TRUE,TRUE) ]  # want doors 2 and 3
[1]  3 NA
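To make the hazard concrete with the game variables, here is a small sketch. It assumes the shortened goat.doors vector from earlier in this reply and a second criterion built from the full-length doors vector.

```r
doors  <- c(1,2,3)
game   <- c("goat","car","goat")
a.pick <- 3

goat.doors <- doors[ game != "car" ]   # shortened to length 2: 1 3

# Length-3 selector applied to a length-2 vector: no error, wrong answer.
# Door 3 is the contestant's pick but it is still returned.
mismatched <- goat.doors[ doors != a.pick ]

# Selector built from the shortened vector itself: lengths agree.
matched <- goat.doors[ goat.doors != a.pick ]

mismatched   # 1 3
matched      # 1
```

R raises no warning in the mismatched case, which is why keeping every vector at length 3 is the safer habit.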

I seem to be having the same problem as the original poster. It is sometimes opening the door the contestant chose or the car door. Not always, but often. I've tried hardcoding both the DoorChoice and the AssignDoors, but it still did it. My code is as follows:

OpenGoatDoor <- function(DoorAssignment, ChosenDoor)
{
  Doors <- c(1:3)
  AvailableDoors <- Doors[Doors != ChosenDoor & DoorAssignment == "Goat"]
  ChooseDoorToOpen <- sample(AvailableDoors, size = 1)
  return (ChooseDoorToOpen)
}

AssignDoors <- CreateGame()
AssignDoors
DoorChoice <- ChooseDoor()
DoorChoice
OpenGoatDoor(AssignDoors, DoorChoice)

@AprilPeck Your code looks good! Game logic is intact, functions look good.

There is one small, very subtle problem that I will let you sit with.

Go ahead with the rest of the game and come back to this step. Your lab will be fine if you can't figure it out - I'll explain it in the solutions.

Classmates are welcome to jump in if they have figured it out!
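One candidate for the subtle problem, offered as a guess since the thread leaves the answer for the solutions: sample() treats a single number n >= 1 as shorthand for 1:n. When only one door is eligible, the call may therefore draw from all doors up to that number. The sketch below assumes that is the cause; safe_sample is a hypothetical helper name and the seed is arbitrary.

```r
set.seed( 123 )   # arbitrary seed, only for repeatability

# Only door 3 is eligible, yet sample() draws from 1:3.
draws <- replicate( 1000, sample( 3, size = 1 ) )
sort( unique( draws ) )   # 1 2 3

# A length-1 character vector is not expanded, which would explain
# why quoting the door numbers appeared to fix the original post.
sample( "3", size = 1 )   # "3"

# Workaround: sample a position, then look up the door at that position.
safe_sample <- function( x ) {
  x[ sample.int( length( x ), size = 1 ) ]
}
safe.draws <- replicate( 1000, safe_sample( 3 ) )
unique( safe.draws )      # 3
```

Sampling an index instead of the values keeps the numeric door labels and behaves the same for one or two eligible doors.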

Two style suggestions.

Use lower case variable.names separated by periods for OBJECT NAMES (vectors or datasets here). Reserve camelCaps or under_score for FUNCTION NAMES.
ChooseDoorToOpen is a VERB. Use NOUNS for object names.

Very minor issues, but I promise that developing consistent style will make your code twice as easy to maintain as scripts get longer and more nuanced.

https://jef.works/R-style-guide/

Hi Professor, many thanks for explaining my questions in such detail, as well as for pointing out the recycling issue that I had indeed ignored! I am now in a better position to understand how the compound structure works and directs the game result. The compound logical evaluation actually links the various conditions. To me, "join.criteria", or the compound logical evaluation, acts as a second bridge by supplying the information the host needs to make the decision.

DS4PS / cpp-527-spr-2021

Lab 1, Step 3 #1