kaijagahm / ygdpDashboard

Interactive dashboard for YGDP survey data
3 stars 0 forks source link

Reactlog: possible infinite loop w/r/t dat() #24

Closed kaijagahm closed 3 years ago

kaijagahm commented 3 years ago

When I was looking at the app's reactlog a while back (before implementing any of the INT mode features), I noticed in the reactivity graph that there might be some kind of infinite loop going on. Well, maybe not infinite, because it seems to resolve itself enough to have the app be functional. But there were all sorts of back and forth steps happening between the dat() object and the various demographic filter input objects.

I can't give a lot more detail without going back into the reactive graph. But to see if this is still there, you'd do the following:

  1. Open a brand new session of RStudio

  2. Add this code at the top of the app:

    library(reactlog)
    # tell shiny to log all reactivity
    reactlog_enable()
  3. Run the app. Let it fully open. Either don't do any clicking around, or just click "Apply" to send the default sentence through points mode. Then close the app.

  4. Back in the app script or in the R console, type shiny::reactlogShow(). This will open the reactlog in a new browser tab on your computer.

  5. Click through the reactlog step by step. Eventually, you'll see that there are a huge number of steps going back and forth from dat() to the various demographic filters (if I'm remembering correctly). I don't understand reactlog well enough to be able to figure out why that's happening or how to fix it. But it's very possible that fixing it will either correct a hidden bug or will just make the app run more efficiently.

To read more about how to use reactlog: https://rstudio.github.io/reactlog/

kaijagahm commented 3 years ago

From @ianneidel's email:

I also spent some time looking into the loop and am left without many conclusions…but think I do have a conceptually unsatisfying solution. If you instead create a new variable in rDat/dat’s reactive definition that takes on the value of isolate({nSentences()}) and pass that new variable to the n() == x filter, to R dat/rDat no longer depend on nSentences() although it is functionally the same. This solves the loop and saves .7-.8 seconds of time (in some runs it appears to save more like 2s for some unknown reason). Attached is a picture of the reactlog with the workaround, which was launched in just 1.3s and looks much more sensible. For me everything works as it had before.

As for why this works? I have no idea. If I had to guess, filter() might in some way override the normal function of isolate() in n() == x. Perhaps filter calls isolate({nSentences()}) for every datapoint it is applied to and that registers in the reactlog and takes more time than simply assigning isolate({nSentences()}) to a new variable and using that in the filter condition?? Anyway, I hope that works for you. Attached are the edits to the code I made for the workaround with the two changes bolded. I hope it's sufficient and works for you. If you get any insight into the cause of the loop though please let me know.

Code:

dat <- reactive({
    ns2 = isolate({nSentences()})
    surveyData()[leftRV$chosenSentences] %>% # select the chosen sentences from the survey data
      lapply(., as.data.frame) %>% # convert each sentence to a df
      map2(., # first input: the list of df's fed in from the pipe above
           1:length(.), # second input: a vector of numbers from 1 to length(.)
           ~mutate(..1, # add a column called "whichSentence" to each df, with the corresponding number. This works because `surveyData()[leftRV$chosenSentences]` (above) subsets those sentences from the list *in order*
                   whichSentence = paste0("sentence", ..2))
           ) %>%
      # Filter by ratings for each sentence
      map2(., # first input: the list of df's fed in from the pipe above
           leftRV$chosenRatings, # second input: the chosen ratings for each sentence
           ~filter(..1, rating %in% as.numeric(..2))) %>% # filter each data frame by the ratings the user has selected *for that sentence*
      bind_rows(.id = NULL) %>% # bind list into a single df, filtered by ratings.
      mutate(rating = as.numeric(as.character(rating))) %>% # convert ratings to numeric

      # remove participants who don't meet criteria for all sentences (this is the `AND` stack)
      group_by(responseID) %>% 
      filter(n() == ns2) % only keep participants who have as many rows as there are sentences, i.e. not fewer.
      ungroup() %>%

      #Filter by demography
      ## each {} is a conditional pipe statement.
      ## In the filtering statements, we explicitly keep the NA's. NA's are removed in the later rightRV$*NAs statements.
      {if(is.null(rightRV$ageButtons)) # if the user filtered age using the slider…
        filter(., is.na(age) | age >= rightRV$ageSlider[1] & 
                 age <= rightRV$ageSlider[2]) else .} %>%
      {if(is.null(rightRV$ageSlider)) # if the user filtered age using buttons…
        filter(., is.na(ageBin) | ageBin %in% rightRV$ageButtons) else .} %>%
      filter(is.na(gender) | gender %in% rightRV$gender) %>%
      filter(is.na(raceCats) | raceCats %in% rightRV$race) %>%
      filter(is.na(education) | education %in% rightRV$education) %>%

      # Remove NA's if the checkboxes are unchecked
      {if(rightRV$ageNAs == F) filter(., !is.na(age)) else .} %>% # since ageBin is derived from age, don't need a separate test here for which age filter tab is selected--all rows that are NA for age will also be NA for ageBin.
      {if(rightRV$genderNAs == F) filter(., !is.na(gender)) else .} %>%
      {if(rightRV$raceNAs == F) filter(., !is.na(raceCats)) else .} %>%
      {if(rightRV$educationNAs == F) filter(., !is.na(education)) else .}
  },
  label = “rDat")

I made this change and committed it: https://github.com/kaijagahm/ygdpDashboard/commit/f753eeb8535cbc895424f148871632dc07108e86