ewenme / geniusr

work with data & lyrics from Genius
https://ewenme.github.io/geniusr/

geniusr: "Error in bind_rows_(x, .id)" #1

Closed MartinPons closed 5 years ago

MartinPons commented 6 years ago

Hi :)

I've begun to use the geniusr package, and I think it's great!

However, I get an error when I try to scrape lyrics for some artists' songs.

Here is an example:


library(geniusr)
library(tidyverse)

fortune <- search_song(search_term = "A fortune in lies", 
                   access_token = genius_token()) %>% 
  filter(artist_name == "Dream Theater")

scrape_lyrics_id(song_id = fortune$song_id[1], access_token = genius_token())

This is the error returned

Error in bind_rows_(x, .id) : 
  Argument 5 is a list, must contain atomic vectors

This is the output of the R.version command on my computer:

platform       x86_64-w64-mingw32          
arch           x86_64                      
os             mingw32                     
system         x86_64, mingw32             
status                                     
major          3                           
minor          4.3                         
year           2017                        
month          11                          
day            30                          
svn rev        73796                       
language       R                           
version.string R version 3.4.3 (2017-11-30)
nickname       Kite-Eating Tree  

vspat commented 6 years ago

Hi there! I know it's quite a late reply, but I had this same issue and found a workaround. It seems that the problem occurs for certain songs because lyrics are retrieved using the song url from the meta function. If your song has the same title as others, the function encounters a list of tracks and fails.

If we switch this to use https://genius.com/song_id instead, then we can adapt some of the lyric scraping code and get what we need.

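library(rvest)  # html_nodes(), html_text(); re-exports read_html() and %>%
library(xml2)   # xml_find_all(), xml_add_sibling(), xml_remove()

lyric_scrape2 <- function(songid) {
  # changed url: build it from the song id
  session <- suppressWarnings(read_html(paste0("https://genius.com/", songid)))
  lyrics <- html_nodes(session, ".lyrics p")  # was `session1`, which is undefined
  # add a newline node after each <br>, then drop the <br> tags
  xml_find_all(lyrics, ".//br") %>% xml_add_sibling("p", "\n")
  xml_find_all(lyrics, ".//br") %>% xml_remove()
  lyrics <- unlist(html_text(lyrics))
  lyrics <- as.character(lyrics)
  # changed gsub parameters: newlines become spaces, [bracketed] tags are dropped
  lyrics <- gsub("[\r\n]", " ", lyrics)
  lyrics <- gsub(" *\\[.*?\\] *", " ", lyrics)
  return(lyrics)
}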
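
To see what the two gsub() calls at the end of the function do without hitting the network, here is a minimal offline sketch. The sample string is made up for illustration (it is not real scraped output), and the trimws() call is an extra step that is not part of the function above.

# Hypothetical sample text standing in for scraped lyrics
raw <- "[Verse 1]\nfirst line\nsecond line\n[Chorus]\nthird line"

# Same two substitutions as in lyric_scrape2()
flat  <- gsub("[\r\n]", " ", raw)          # line breaks become spaces
clean <- gsub(" *\\[.*?\\] *", " ", flat)  # drop [Verse 1], [Chorus], ...

# Extra step (an assumption, not in the original function): trim the ends
clean <- trimws(clean)
print(clean)

The section markers are replaced by a single space, so the lyrics end up as one line of text per song.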

Hope this helps anyone viewing this in the future!

MartinPons commented 5 years ago

Hi vspat

Thanks for your reply (also a late response).

I'm not very well versed in HTML or scraping, but it seems that the same URL can contain more than one id for the songs.

Thanks for your alternative function; it doesn't return an error. However, if I pass one of the ids from the "fortune" data.frame in my example, it doesn't return the lyrics I expected. I don't know if I'm doing something wrong, or if "songid" refers to a different id.

Thanks anyway :)

ewenme commented 5 years ago

This should have been fixed in #6. You can install the latest dev version via devtools::install_github('ewenme/geniusr'). Thanks for filing this; I am planning to take better care of my packages this year (Y)