k5cents / whatr

Read Jeopardy game data in R
https://kiernann.github.io/whatr/
GNU General Public License v3.0

Changing regex in whatr_answers for final_answer #2

Closed: tbradley1013 closed this 4 years ago

tbradley1013 commented 4 years ago

The final_answer parsing in whatr_answers was returning NA. It looks like a slight modification to the regex (matching the escaped em tag instead of the i tag) is all that was needed. This PR implements that change. Below is a reprex showing the result:

library(whatr)

game <- whatr_html(x = 1000, out = "showgame")

# Current pattern: looks for text between HTML-escaped <i> ... </i> tags
final_answer_current <- game %>%
  rvest::html_node(".final_round") %>%
  base::as.character() %>%
  stringr::str_split("class") %>%
  base::unlist() %>%
  stringr::str_subset("correct_response") %>%
  stringr::str_extract("(?<=i&gt;)(.*)(?=&lt;/i&gt;)") %>%
  stringr::str_to_title()

# Proposed pattern: looks for text between an escaped ">" and the escaped </em>
final_answer_new <- game %>%
  rvest::html_node(".final_round") %>%
  base::as.character() %>%
  stringr::str_split("class") %>%
  base::unlist() %>%
  stringr::str_subset("correct_response") %>%
  stringr::str_extract("(?<=&gt;)(.*)(?=&lt;/em&gt;)") %>%
  stringr::str_to_title()

final_answer_current
#> [1] NA
final_answer_new
#> [1] "The Castrati"

Created on 2020-02-26 by the reprex package (v0.3.0)
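
For anyone who wants to check the two patterns without fetching a game, here is a minimal base R sketch. The `fragment` string is an assumption: a hand-written stand-in for the kind of escaped markup that might remain after the split and subset steps, not actual output from game 1000. `extract_first()` is a hypothetical helper, not part of whatr, and it uses `regexpr(perl = TRUE)` in place of `stringr::str_extract()`.

```r
# Assumed fragment: a hand-written stand-in for escaped markup around the
# final answer. It is NOT copied from real whatr or J! Archive output.
fragment <- "=\"correct_response\"&gt;the castrati&lt;/em&gt;"

# Hypothetical helper (not part of whatr): return the first PCRE match,
# or NA when the pattern does not match at all.
extract_first <- function(x, pattern) {
  m <- regmatches(x, regexpr(pattern, x, perl = TRUE))
  if (length(m) == 0) NA_character_ else m
}

# The current pattern needs "i&gt;" before the text and "&lt;/i&gt;" after it.
old <- extract_first(fragment, "(?<=i&gt;)(.*)(?=&lt;/i&gt;)")

# The proposed pattern needs "&gt;" before the text and "&lt;/em&gt;" after it.
new <- extract_first(fragment, "(?<=&gt;)(.*)(?=&lt;/em&gt;)")

is.na(old)  # TRUE: the assumed fragment contains no escaped <i> tags
new         # "the castrati", before any title-casing step
```

On this assumed input the current pattern finds nothing to anchor on, which is consistent with the NA shown in the reprex above.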

k5cents commented 4 years ago

LGTM, thanks.