AndreaCirilloAC / updateR

update your R version in a breeze (on OSX) √

R 3.4.0 #11

Closed RobertMyles closed 7 years ago

RobertMyles commented 7 years ago

I just tried to update to the new version of R and I got:

Error in .[[2]] : subscript out of bounds, which is coming from:

file <- xml2::read_html(page_source) %>%
    rvest::html_nodes("h1+ p a+ a , table:nth-child(8) tr:nth-child(1) td > a") %>%
    rvest::html_text() %>% strsplit("\n", fixed = TRUE) %>%
    .[[2]]
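
For readers wondering where the error comes from: `strsplit()` returns a list with one element per input string, so `.[[2]]` only works if the selector matched at least two nodes. The snippet below is a minimal base R sketch of that failure mode. `node_text` is a made-up stand-in for the `html_text()` output, assuming the selector matched a single node after the page changed.

```r
# Made-up stand-in for html_text() output when the selector matches one node.
node_text <- c("R-3.4.0.pkg\nsome other text")

# strsplit() returns a list with one element per input string.
pieces <- strsplit(node_text, "\n", fixed = TRUE)
print(length(pieces))

# Asking for a second element that does not exist raises the reported error;
# tryCatch() captures its message instead of aborting the script.
result <- tryCatch(pieces[[2]], error = function(e) conditionMessage(e))
print(result)
```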

This works, and it could also be adapted to search for older versions:

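library(magrittr)  # provides the %>% pipe

# Select every table directly under <body> rather than a fixed position
css <- "body > table"

file <- xml2::read_html(page_source) %>%
    rvest::html_nodes(css) %>%
    rvest::html_text() %>%
    # keep the leading run of printable characters ending in ".pkg"
    stringr::str_extract_all(pattern = "^[:print:]*\\.pkg") %>%
    .[[1]]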

It does include a stringr dependency, though.
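
If the stringr dependency ever becomes a concern, the same extraction can probably be done in base R with `regexpr()` and `regmatches()`. This is only a sketch: `extract_pkg_name` and `table_text` are hypothetical names, and the sample string is invented rather than taken from the real download page.

```r
# Invented stand-in for the text of the first table on the download page.
table_text <- "R-3.4.0.pkg\nsome descriptive text"

# Hypothetical helper: base R counterpart of the stringr call above.
# Returns the leading run of printable characters that ends in ".pkg",
# or character(0) when the text does not start with such a run.
extract_pkg_name <- function(txt) {
  hit <- regexpr("^[[:print:]]*\\.pkg", txt)
  regmatches(txt, hit)
}

pkg_file <- extract_pkg_name(table_text)
print(pkg_file)
```

Note that base R needs the POSIX class wrapped in a bracket expression, `[[:print:]]`, whereas stringr accepts `[:print:]` on its own.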

AndreaCirilloAC commented 7 years ago

Thank you as usual, @RobertMyles. What made the previous solution stop working? What did they change in the page layout? It would be useful to figure that out, so we can judge whether it could happen again.

AndreaCirilloAC commented 7 years ago

@RobertMyles I have tested the code and checked the difference. I don't think the stringr dependency is a problem here, not least because I couldn't find an easy workaround. If you open a pull request I will be happy to accept it. Thank you as usual.

RobertMyles commented 7 years ago

Yeah, it seems to be a reformatting of the web page from which we're scraping the info (I'm guessing it's this). For the moment, we can use this workaround, but I wonder if there is a more robust way to do it...
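
One possible direction, sketched under assumptions rather than tested against the real page: pick the installer links by their `.pkg` suffix instead of by their position in the layout, and stop with a readable message when none is found. `find_pkg_links` and `sample_html` are hypothetical, and the regular expression is used only to keep the example dependency-free. With xml2/rvest already in use, selecting `a` nodes and filtering on their `href` attribute would likely be the safer choice.

```r
# Hypothetical helper: collect every href value that ends in ".pkg".
# Expects the page source as a single string.
find_pkg_links <- function(html) {
  hits <- regmatches(html, gregexpr('href="[^"]*\\.pkg"', html))[[1]]
  # Strip the surrounding href="..." wrapper from each match.
  links <- sub('^href="', "", sub('"$', "", hits))
  if (length(links) == 0) {
    stop("No .pkg link found; the download page layout may have changed.")
  }
  links
}

# Made-up page fragment, not the real download page.
sample_html <- '<table><tr><td><a href="R-3.4.0.pkg">R-3.4.0.pkg</a></td></tr></table>'
print(find_pkg_links(sample_html))
```

A layout change would then produce an explicit error rather than `subscript out of bounds`, which at least makes the cause easier to spot.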

AndreaCirilloAC commented 7 years ago

Yeah, the R community will probably come to our aid! But in the meantime I would go that way...