JosiahParry / genius

Easily access song lyrics from Genius in a tibble.

genius_album() and add_genius() having an issue with generating urls #27

Closed manandamoth closed 5 years ago

manandamoth commented 5 years ago

I've been running into an issue with both the genius_album() and add_genius() functions when trying to scrape lyrics from albums. Below is the code I ran:

library(tidyverse)
library(tidytext)
library(genius)

InOurLifetime <- genius_tracklist(artist = "8Ball and MJG", album = "In Our Lifetime", 
                              info = "simple")

south_late90albums <- tribble(
  ~artist, ~album,
  "Goodie Mob", "Soul Food",
  "Master P", "Ice Cream Man",
  "Outkast", "ATLiens",
  "Missy Elliott", "Supa Dupa Fly",
  "Outkast", "Aquemini",
  "8Ball & MJG", "In Our Lifetime",
  "Hot Boys", "Guerilla Warfare",
  "Missy Elliott", "Da Real World"
) %>%
  add_genius(artist, album) %>%
  unnest_tokens(word, lyric)

 # Both return the same error: Warning message:
 # In request_GET(session, url) : Not Found (HTTP 404).
 # However, all the other albums in south_late90albums ran fine.

The one album that failed was 8Ball & MJG - In Our Lifetime. I think it has something to do with the ampersand interfering with the url generation, but I'm not certain.
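
To illustrate the kind of failure suspected here, the following is a minimal sketch of how an ampersand can change a generated URL slug. Both helper functions are hypothetical and are not taken from the genius package, and the idea that Genius spells "&" as "and" in its URLs is an assumption made for the example, not something confirmed in this thread.

```r
# Hypothetical helper, not part of the genius package: builds a URL slug
# by dropping every character that is not a letter, digit, or space.
naive_slug <- function(x) {
  x <- gsub("[^[:alnum:] ]", "", x)  # "&" is silently removed here
  x <- gsub(" +", "-", trimws(x))    # collapse runs of spaces into one hyphen
  x
}

# Hypothetical variant that spells out the ampersand before stripping
# punctuation. Assumption: the target site writes "&" as "and" in URLs.
and_slug <- function(x) {
  naive_slug(gsub("&", "and", x, fixed = TRUE))
}

naive_slug("8Ball & MJG")  # "8Ball-MJG"
and_slug("8Ball & MJG")    # "8Ball-and-MJG"
```

If the site expects the second form, a slug built the first way would point at a page that does not exist, which would show up as a 404.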

JosiahParry commented 5 years ago

Thank you! This has been solved in commit https://github.com/JosiahParry/genius/commit/df6fa9c9697eea15516690b501fa671879038cf9.

I'm not sure this is the best solution, but genius_album() now continues to work even when it encounters a broken URL in the output of genius_tracklist(). You will still see a 404 warning, but I have not been able to figure out how to tell you which track caused it.

When looking at the output, you will notice that songs with missing lyrics will have a row of NAs.
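
One possible way to keep going after a failed request and still report which track failed is sketched below in base R. It is not the code from the commit above: fetch_lyrics() is a hypothetical stand-in that fails for one made-up URL so the example runs without network access, and the track_title and track_url columns are assumed names for the illustration.

```r
# Hypothetical stand-in for a lyrics request; it fails for one URL so the
# example runs offline.
fetch_lyrics <- function(url) {
  if (grepl("broken", url, fixed = TRUE)) stop("Not Found (HTTP 404).")
  paste("lyrics from", url)
}

# Made-up tracklist: one title and one URL per track.
tracks <- data.frame(
  track_title = c("Track A", "Track B", "Track C"),
  track_url   = c("https://example.com/a",
                  "https://example.com/broken",
                  "https://example.com/c"),
  stringsAsFactors = FALSE
)

# Fetch each track; on error, warn with the track title and return NA
# instead of stopping the whole loop.
tracks$lyric <- vapply(seq_len(nrow(tracks)), function(i) {
  tryCatch(
    fetch_lyrics(tracks$track_url[i]),
    error = function(e) {
      warning("Could not fetch '", tracks$track_title[i], "': ",
              conditionMessage(e), call. = FALSE)
      NA_character_
    }
  )
}, character(1))

# Rows with NA lyrics are the tracks that failed.
failed <- tracks$track_title[is.na(tracks$lyric)]
```

Because the warning is raised inside the handler, where the loop index is still in scope, it can name the track. The NA rows can then be listed or dropped before tokenizing.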